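Scraping Langfang's 7-Day Weather Forecast with Python: BeautifulSoup in Practice

Using the BeautifulSoup Library to Scrape Langfang's 7-Day Weather

This tutorial uses Python's BeautifulSoup library to scrape the weather forecast for Langfang for the next 7 days and extract the date, high temperature, and low temperature for each day.

Steps:

Import the libraries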

from bs4 import BeautifulSoup
import requests
import re

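Fetch the page content

url = 'http://www.weather.com.cn/weather/101090601.shtml'
# Send a browser-like User-Agent; some sites reject the default one used by requests.
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36'}
# The timeout (10 seconds here, an arbitrary choice) keeps the script from hanging on an unresponsive server.
response = requests.get(url, headers=headers, timeout=10)
# Raise an exception on a 4xx/5xx response instead of parsing an error page.
response.raise_for_status()
content = response.content.decode('utf-8')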

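Parse the page structure

# The 'lxml' parser requires the lxml package to be installed.
soup = BeautifulSoup(content, 'lxml')
# Collect every <div class="tem"> element; each one is expected to hold one day's temperatures.
result = soup.findAll('div', {'class': 'tem'})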

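Extract and print the data

print('Langfang weather for the next 7 days:')
# Look up the date headings once rather than on every loop iteration.
dates = soup.findAll('h1', {'class': 'date'})
# Stop early if the page lists fewer than 7 entries, which avoids an IndexError.
for i in range(min(7, len(dates), len(result))):
    date = dates[i].string
    high_temp = result[i].find('span').string
    low_temp = result[i].find('i').string
    print(f'{date} High: {high_temp}℃ Low: {low_temp}℃')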
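
To see what the extraction step expects from the page, the following minimal sketch runs the same kind of lookups against a small hand-written HTML fragment. The fragment is an assumption made for illustration: it mirrors the tag and class names used in the code above, not the verified markup of weather.com.cn. The function name parse_forecast is hypothetical, and the sketch uses Python's built-in 'html.parser' so that it runs without lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modeled on the selectors used in the tutorial code.
# The real page may be structured differently.
SAMPLE_HTML = """
<ul>
  <li><h1 class="date">Day 1</h1><div class="tem"><span>30</span>/<i>20</i></div></li>
  <li><h1 class="date">Day 2</h1><div class="tem"><i>18</i></div></li>
</ul>
"""

def parse_forecast(html):
    """Return a list of (date, high, low) tuples found in html."""
    soup = BeautifulSoup(html, 'html.parser')
    # find_all is the current name for findAll.
    dates = soup.find_all('h1', {'class': 'date'})
    temps = soup.find_all('div', {'class': 'tem'})
    rows = []
    # zip stops at the shorter list, so mismatched lengths do not raise.
    for date_tag, tem in zip(dates, temps):
        span = tem.find('span')
        i_tag = tem.find('i')
        # find returns None when the tag is absent; keep None instead of crashing.
        high = span.string if span else None
        low = i_tag.string if i_tag else None
        rows.append((date_tag.string, high, low))
    return rows

for date, high, low in parse_forecast(SAMPLE_HTML):
    print(date, high, low)
```

The second entry deliberately omits the span tag to show that a missing element yields None rather than an exception.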

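Code explanation:

The requests library fetches the page content.
The BeautifulSoup library parses the page structure so the required data is easy to extract.
The re library can be used for more complex text parsing (not used in this example).
The findAll function (the older name for find_all) returns every tag that matches the given conditions.
The string attribute returns the text inside a tag; it is None when the tag has more than one child.
The loop runs over the 7 days and prints the date, high temperature, and low temperature for each.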
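
The re module is not needed by the code above, but it becomes useful if the extracted text carries a unit or other characters, for example '20℃' instead of '20'. The helper below is a hypothetical addition, not part of the original script; it is a small sketch of pulling the integer out of such a string.

```python
import re

# Matches an optional minus sign followed by digits, e.g. '20' in '20℃'.
TEMP_PATTERN = re.compile(r'-?\d+')

def parse_temperature(text):
    """Return the first integer found in text, or None if there is none."""
    if text is None:
        return None
    match = TEMP_PATTERN.search(text)
    return int(match.group()) if match else None

print(parse_temperature('20℃'))
print(parse_temperature('-3℃'))
print(parse_temperature('N/A'))
```

Converting to int makes it possible to compare or average the temperatures, which plain strings do not allow.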

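Notes:

The page structure may change, which would break the code. Adjust the selectors to match the actual page.
When scraping, follow the site's robots.txt rules and avoid putting undue load on the site.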
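
One way to act on the robots.txt advice is Python's standard urllib.robotparser module. The sketch below parses a made-up robots.txt supplied as text, so it runs offline; the rules and the example.com URLs are illustrative assumptions, not the actual policy of any site, and build_parser is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Made-up rules for illustration only; not taken from any real site.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

def build_parser(robots_text):
    """Return a RobotFileParser loaded with the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser

parser = build_parser(SAMPLE_ROBOTS)
# A path not covered by any Disallow rule is allowed.
print(parser.can_fetch('*', 'http://example.com/weather/page.shtml'))
# A path under /private/ is disallowed by the sample rules.
print(parser.can_fetch('*', 'http://example.com/private/data.html'))
```

Against a real site, the parser would instead be pointed at the site's robots.txt with set_url() and loaded with read() before calling can_fetch().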