Writing a Python crawler to scrape text content

Date: 2026-01-27

Tags: Technology

Below is a simple Python crawler example that scrapes the text content of a given website and prints it to the console:

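import requests
from bs4 import BeautifulSoup

url = "https://www.example.com"  # replace with the address of the site you want to crawl

# Fetch the page; the timeout avoids hanging forever, and HTTP errors raise early
response = requests.get(url, timeout=10)
response.raise_for_status()
soup = BeautifulSoup(response.content, "html.parser")

# Find all text nodes, skipping those whose parent tag is not visible page content
skipped_parents = {"style", "script", "head", "title", "meta", "[document]"}
text_list = []
for text in soup.find_all(string=True):  # string=True replaces the deprecated text=True
    if text.parent.name not in skipped_parents:
        text_list.append(text)

# Print each text fragment, ignoring whitespace-only ones
for text in text_list:
    stripped = text.strip()
    if stripped:
        print(stripped)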

Explanation:

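Import the required libraries: requests fetches the page, and BeautifulSoup parses it.
Define the address of the site to crawl.
Fetch the page with requests and parse the response with BeautifulSoup.
Find every text node, skip those inside non-visible tags such as style and script, and store the rest in a list.
Loop over the list and print each text fragment with surrounding whitespace stripped.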
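
To make the filtering step easier to see, here is a minimal sketch of the same idea using only the standard library's html.parser instead of BeautifulSoup. It is a swapped-in alternative, not the method used above. The names VisibleTextParser, extract_visible_text, SKIPPED_TAGS and sample_html are hypothetical, chosen for illustration. The sketch assumes reasonably well-formed HTML in which skipped tags have explicit closing tags, and it skips everything nested inside those tags rather than checking only the direct parent.

```python
from html.parser import HTMLParser

# Tags whose text is not visible on the rendered page.
# "meta" is left out on purpose: it is a void tag with no closing tag.
SKIPPED_TAGS = {"style", "script", "head", "title"}


class VisibleTextParser(HTMLParser):
    """Collects text that is not nested inside a skipped tag."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # greater than 0 while inside any skipped tag
        self.fragments = []   # visible text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        stripped = data.strip()
        # Keep only non-empty text found outside skipped tags
        if self.skip_depth == 0 and stripped:
            self.fragments.append(stripped)


def extract_visible_text(html):
    """Return the visible text fragments of an HTML string as a list."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return parser.fragments


# Hypothetical sample page, used here so the sketch runs without network access
sample_html = """
<html>
  <head><title>Demo</title><style>p { color: red; }</style></head>
  <body>
    <h1>Hello</h1>
    <script>var x = 1;</script>
    <p>First <b>paragraph</b>.</p>
  </body>
</html>
"""

for fragment in extract_visible_text(sample_html):
    print(fragment)
```

Because the function takes an HTML string, it can be tried on a saved page or on response.text from requests. Real-world pages often omit closing tags, which is one reason a tolerant parser such as BeautifulSoup is usually the more practical choice.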

Original post: https://www.cveoy.top/t/topic/baBl Copyright belongs to the author. Please do not repost or scrape.