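Tutorial: scraping posts from the artificial intelligence board of the itheima (黑马程序员) forum, including title, author, publication time, and link
The following code scrapes the posts on the artificial intelligence board: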
import requests
from bs4 import BeautifulSoup

url = 'https://bbs.itheima.com/forum-ai-1.html'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
}

# Fetch the board page and stop early on an HTTP error status
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()
soup = BeautifulSoup(response.text, 'html.parser')

# The post rows are taken from the second <tbody> under .bm_c
posts = soup.select('.bm_c tbody')[1].select('tr')
for post in posts:
    title_tag = post.select_one('.xst')
    author_tag = post.select_one('.by cite')
    date_tag = post.select_one('.by em')
    # Skip rows that lack any of the expected elements
    if title_tag is None or author_tag is None or date_tag is None:
        continue
    title = title_tag.text.strip()
    author = author_tag.text.strip()
    date = date_tag.text.strip()
    link = 'https://bbs.itheima.com/' + title_tag['href']
    print('Title:', title)
    print('Author:', author)
    print('Published:', date)
    print('Link:', link)
    print('------------------------')
Running the code prints the title, author, publication time, and link of each post on the artificial intelligence board.
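The link above is built by concatenating the site root with the `href` attribute, which only works if every `href` is a relative path without a leading slash. As a minimal sketch, assuming the forum may also emit root-relative or absolute links, `urllib.parse.urljoin` from the standard library handles all three cases. The helper name `build_link` and the sample `href` values are hypothetical and serve only as illustration.

```python
from urllib.parse import urljoin

# Site root taken from the scraping code above
BASE_URL = 'https://bbs.itheima.com/'


def build_link(href, base_url=BASE_URL):
    """Resolve a post's href against the forum root (hypothetical helper)."""
    return urljoin(base_url, href)


# Hypothetical href values, for illustration only
sample_hrefs = [
    'thread-1-1-1.html',                          # relative path
    '/thread-1-1-1.html',                         # root-relative path
    'https://bbs.itheima.com/thread-1-1-1.html',  # already absolute
]
for href in sample_hrefs:
    print(build_link(href))
```

In the loop above, `build_link(post.select('.xst')[0]['href'])` could stand in for the string concatenation.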