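Use XPath to scrape Langfang's highest and lowest temperatures from the 7-day forecast on a web page
# Import the requests library
# Import etree (from lxml)
# Set url to the 7-day weather page for your city
# Set headers to your own machine's User-Agent
# Send the request with requests.get and store the result in response
# Decode the content of response as UTF-8
# Use etree.HTML to convert the decoded response into a DOM tree
# Match your city's highest temperature with XPath, using the full…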
import requests
from lxml import etree

url = "http://www.weather.com.cn/weather/101090601.shtml"
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3"
}
response = requests.get(url, headers=headers)
html = response.content.decode('utf-8')
selector = etree.HTML(html)
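Method 1: use the full (absolute) path
# The high temperature is in <span> and the low temperature is in <i>, as in Method 2
high_temp = selector.xpath('/html/body/div[6]/div[1]/div[1]/div/ul/li[1]/p[2]/span/text()')
low_temp = selector.xpath('/html/body/div[6]/div[1]/div[1]/div/ul/li[1]/p[2]/i/text()')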
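An absolute path depends on every ancestor keeping its exact position, so it tends to break when the page layout shifts. The sketch below illustrates this on a small, hypothetical HTML fragment (not the real weather.com.cn markup): `PAGE_V1`, `PAGE_V2`, `ABSOLUTE`, `RELATIVE` and `first_high` are names made up for the illustration.

```python
from lxml import etree

# Hypothetical, heavily simplified markup; the real page is much larger.
PAGE_V1 = """
<html><body>
  <div><ul class="t clearfix">
    <li><p class="wea">Sunny</p><p class="tem"><span>30</span>/<i>18</i></p></li>
  </ul></div>
</body></html>
"""
# The same content after one extra <div> is placed before the forecast.
PAGE_V2 = PAGE_V1.replace("<body>", "<body><div>banner</div>")

# Absolute path: every step is positional.
ABSOLUTE = "/html/body/div[1]/ul/li[1]/p[2]/span/text()"
# Relative path: anchored on class attributes instead of positions.
RELATIVE = '//ul[@class="t clearfix"]/li[1]/p[@class="tem"]/span/text()'


def first_high(page, xpath):
    """Parse the page and return the list of text nodes matched by xpath."""
    return etree.HTML(page).xpath(xpath)


abs_v1 = first_high(PAGE_V1, ABSOLUTE)
abs_v2 = first_high(PAGE_V2, ABSOLUTE)
rel_v1 = first_high(PAGE_V1, RELATIVE)
rel_v2 = first_high(PAGE_V2, RELATIVE)

print("absolute:", abs_v1, abs_v2)
print("relative:", rel_v1, rel_v2)
```

In this toy case the absolute path stops matching once the extra `<div>` appears, while the class-based path still finds the value. The class-based path has its own assumption: it breaks if the site renames those classes.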
Method 2: use a relative path
high_temp = selector.xpath('//ul[@class="t clearfix"]/li/p[@class="tem"]/span/text()')
low_temp = selector.xpath('//ul[@class="t clearfix"]/li/p[@class="tem"]/i/text()')
# Index 0 is the first match in each list
print("Highest temperature:", high_temp[0])
print("Lowest temperature:", low_temp[0])
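The code above prints only the first match, while the task asks for all seven days. One possible way to get a high/low pair per day is to loop over each `<li>` and query inside it, so the two values stay aligned even if one day has no high-temperature `<span>`. This is a minimal sketch under stated assumptions: `SAMPLE` is hypothetical markup that reuses only the class names from Method 2, the `<h1>` day label and the missing `<span>` on the first day are assumptions about the page, and `extract_forecast` is a helper name invented here.

```python
from lxml import etree

# Hypothetical sample; only the class names come from the XPath in Method 2.
SAMPLE = """
<ul class="t clearfix">
  <li><h1>Day 1</h1><p class="tem"><i>18℃</i></p></li>
  <li><h1>Day 2</h1><p class="tem"><span>30℃</span>/<i>17℃</i></p></li>
</ul>
"""


def extract_forecast(selector):
    """Return a list of (day, high, low) tuples, one per <li>.

    A missing value is returned as None instead of shifting the lists.
    """
    rows = []
    for li in selector.xpath('//ul[@class="t clearfix"]/li'):
        day = li.xpath('string(./h1)').strip()          # assumed day label
        high = li.xpath('./p[@class="tem"]/span/text()')
        low = li.xpath('./p[@class="tem"]/i/text()')
        rows.append((day, high[0] if high else None, low[0] if low else None))
    return rows


forecast = extract_forecast(etree.HTML(SAMPLE))
for day, high, low in forecast:
    print(day, "high:", high or "n/a", "low:", low or "n/a")
```

With the live page, the same function could be called as `extract_forecast(selector)` using the `selector` built earlier. Whether the real markup matches these assumptions should be checked in the browser's developer tools.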
Original source: https://www.cveoy.top/t/topic/eZbk. Copyright belongs to the author. Do not repost or scrape!