1. XPath example for an HTML response: suppose we have the following HTML response:
<html>
  <body>
    <h1>Hello World</h1>
    <div class="content">
      <p>This is a paragraph.</p>
      <p>This is another paragraph.</p>
    </div>
  </body>
</html>

We can use XPath to extract its content. Example code:

from lxml import etree

response = """
<html>
  <body>
    <h1>Hello World</h1>
    <div class="content">
      <p>This is a paragraph.</p>
      <p>This is another paragraph.</p>
    </div>
  </body>
</html>
"""

tree = etree.HTML(response)
title = tree.xpath('//h1/text()')[0]
paragraphs = tree.xpath('//div[@class="content"]/p/text()')

print(title)  # Output: Hello World
print(paragraphs)  # Output: ['This is a paragraph.', 'This is another paragraph.']
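Indexing an XPath result with `[0]`, as above, raises `IndexError` when nothing matches. The following is a minimal sketch of a more defensive variant, assuming the same sample HTML; the helper name `first_text` is hypothetical and not part of lxml. It also shows selecting an attribute and using the XPath `count()` function.

```python
from lxml import etree

html = """
<html>
  <body>
    <h1>Hello World</h1>
    <div class="content">
      <p>This is a paragraph.</p>
      <p>This is another paragraph.</p>
    </div>
  </body>
</html>
"""


def first_text(tree, expr, default=None):
    """Hypothetical helper: first XPath result (stripped), or `default` if no match."""
    results = tree.xpath(expr)
    return results[0].strip() if results else default


tree = etree.HTML(html)

title = first_text(tree, '//h1/text()')
# The sample has no <h2>, so the default is returned instead of raising IndexError.
subtitle = first_text(tree, '//h2/text()', default='')
# count() returns a float in XPath 1.0, so convert it to int.
paragraph_count = int(tree.xpath('count(//div[@class="content"]/p)'))
# Selecting an attribute with @ returns a list of attribute values.
div_classes = tree.xpath('//div/@class')

print(title)
print(repr(subtitle))
print(paragraph_count)
print(div_classes)
```

Returning a default keeps a scraper running when a page lacks an expected element; whether a silent default or an explicit error is preferable depends on the use case.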
2. XPath example for an XML response: suppose we have the following XML response:
<bookstore>
  <book category="fiction">
    <title lang="en">Harry Potter</title>
    <author>J.K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="cooking">
    <title lang="en">Cooking 101</title>
    <author>Chef John</author>
    <year>2010</year>
    <price>19.99</price>
  </book>
</bookstore>

We can use XPath to extract its content. Example code:

from lxml import etree

response = """
<bookstore>
  <book category="fiction">
    <title lang="en">Harry Potter</title>
    <author>J.K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="cooking">
    <title lang="en">Cooking 101</title>
    <author>Chef John</author>
    <year>2010</year>
    <price>19.99</price>
  </book>
</bookstore>
"""

tree = etree.XML(response)
titles = tree.xpath('//title/text()')
authors = tree.xpath('//author/text()')

print(titles)  # Output: ['Harry Potter', 'Cooking 101']
print(authors)  # Output: ['J.K. Rowling', 'Chef John']
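The example above selects every title and author. To filter by attribute, a predicate can be added to the path. The sketch below swaps in the standard library's `xml.etree.ElementTree` rather than lxml. ElementTree supports only a limited subset of XPath, so the numeric price comparison is done in Python here. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

xml_text = """
<bookstore>
  <book category="fiction">
    <title lang="en">Harry Potter</title>
    <author>J.K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="cooking">
    <title lang="en">Cooking 101</title>
    <author>Chef John</author>
    <year>2010</year>
    <price>19.99</price>
  </book>
</bookstore>
""".strip()

root = ET.fromstring(xml_text)  # root is the <bookstore> element

# Attribute predicate: titles of books whose category is "cooking".
cooking_titles = [t.text for t in root.findall("./book[@category='cooking']/title")]

# Numeric filter done in Python, since it falls outside ElementTree's XPath subset.
cheap_titles = [
    book.findtext("title")
    for book in root.findall("book")
    if float(book.findtext("price")) < 25
]

# Read an attribute value from each <title> element.
languages = [t.get("lang") for t in root.iter("title")]

print(cooking_titles)
print(cheap_titles)
print(languages)
```

With lxml, the same attribute predicate should work unchanged in `tree.xpath(...)`, and full XPath 1.0 comparisons such as `//book[price < 25]/title/text()` are available as well.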
3. JSONPath example for a JSON response: suppose we have the following JSON response:
{
  "books": [
    {
      "title": "Harry Potter",
      "author": "J.K. Rowling",
      "year": 2005,
      "price": 29.99
    },
    {
      "title": "Cooking 101",
      "author": "Chef John",
      "year": 2010,
      "price": 19.99
    }
  ]
}

We can use JSONPath to extract its content. Example code:

import json
from jsonpath_ng import parse

response = """
{
  "books": [
    {
      "title": "Harry Potter",
      "author": "J.K. Rowling",
      "year": 2005,
      "price": 29.99
    },
    {
      "title": "Cooking 101",
      "author": "Chef John",
      "year": 2010,
      "price": 19.99
    }
  ]
}
"""

data = json.loads(response)
titles = [match.value for match in parse('$.books[*].title').find(data)]
authors = [match.value for match in parse('$.books[*].author').find(data)]

print(titles)  # Output: ['Harry Potter', 'Cooking 101']
print(authors)  # Output: ['J.K. Rowling', 'Chef John']
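For comparison, here is a sketch of the same extraction written with plain Python over the parsed data, without a JSONPath library. It relies only on the standard `json` module. The price filter and the title-to-author mapping are illustrative additions, not part of the original example.

```python
import json

response = """
{
  "books": [
    {"title": "Harry Potter", "author": "J.K. Rowling", "year": 2005, "price": 29.99},
    {"title": "Cooking 101", "author": "Chef John", "year": 2010, "price": 19.99}
  ]
}
"""

data = json.loads(response)

# Equivalent of $.books[*]; fall back to an empty list if the key is absent.
books = data.get("books", [])

# Equivalent of $.books[*].title and $.books[*].author.
titles = [book["title"] for book in books]
authors = [book["author"] for book in books]

# A filter written as an ordinary comprehension.
affordable = [book["title"] for book in books if book["price"] < 25]

# Build a lookup table from title to author.
author_by_title = {book["title"]: book["author"] for book in books}

print(titles)
print(authors)
print(affordable)
print(author_by_title)
```

For a fixed, shallow structure like this one, comprehensions are often enough. A JSONPath expression tends to pay off when the path is supplied as configuration or the nesting is deep and variable.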
4. Regex example for a string response: suppose we have the following string response:
response = "Hello, my name is John. I am 25 years old."

We can use a regular expression to extract its content. Example code:

import re

response = "Hello, my name is John. I am 25 years old."

name = re.search(r"my name is (\w+)", response).group(1)
age = re.search(r"I am (\d+) years old", response).group(1)

print(name)  # Output: John
print(age)  # Output: 25
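`re.search` returns `None` when the pattern does not match, so calling `.group(1)` directly, as above, raises `AttributeError` on unexpected input. The following is a minimal sketch that guards against this and uses named groups. The function name `parse_intro` and the combined pattern are assumptions made for illustration.

```python
import re

# One precompiled pattern with named groups for both fields.
INTRO_PATTERN = re.compile(r"my name is (?P<name>\w+)\. I am (?P<age>\d+) years old")


def parse_intro(text):
    """Return {'name': str, 'age': int} if the text matches, otherwise None."""
    match = INTRO_PATTERN.search(text)
    if match is None:
        return None
    return {"name": match.group("name"), "age": int(match.group("age"))}


response = "Hello, my name is John. I am 25 years old."

parsed = parse_intro(response)
missing = parse_intro("This string has no introduction.")

print(parsed)
print(missing)
```

Named groups keep the extraction readable as the pattern grows, and converting `age` to `int` at parse time spares callers from handling a numeric string.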