Python Web Scraper: Packaging Novel Website Content into a TXT File

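A short Python scraper can download the content of a novel website and package it into a single TXT file, which is convenient for reading and archiving.

Example code: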

import requests
from bs4 import BeautifulSoup

def scrape_novel(url):
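    # Send a GET request; the timeout keeps the script from hanging on a slow server
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # stop early on HTTP errors such as 404 or 500
    soup = BeautifulSoup(response.content, 'html.parser')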
    
    # Extract the novel title
    title = soup.find('h1').text.strip()
    
    # Extract the novel content, one chapter at a time
    content = ""
    chapters = soup.find_all('div', class_='chapter')
    for chapter in chapters:
        chapter_title = chapter.find('h2').text.strip()
        chapter_content = chapter.find('div', class_='content').text.strip()
        content += f'{chapter_title}\n{chapter_content}\n\n'
    
    return title, content

def save_to_txt(title, content):
    # Save the content as a UTF-8 encoded TXT file named after the title
    with open(f'{title}.txt', 'w', encoding='utf-8') as file:
        file.write(content)
        
if __name__ == "__main__":
    url = "http://www.example.com/novel"  # URL of the novel page
    title, content = scrape_novel(url)
    save_to_txt(title, content)
    print("The novel was saved as a TXT file.")

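Usage:

Make sure the requests and beautifulsoup4 libraries are installed (for example, with pip install requests beautifulsoup4).
Replace the url variable in the code with the actual URL of the novel page you want to scrape.
Run the program. It writes the chapter text to a TXT file named after the novel title (<title>.txt) in the current working directory.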
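
Because the output file is named after the scraped title, a title containing characters such as / or : can make the open() call fail or write to an unexpected path. The following is a minimal sketch of one way to clean the title first. The helper name safe_filename and the fallback name "novel" are assumptions made for illustration; they are not part of the example above.

```python
import re

def safe_filename(title, fallback="novel"):
    """Return a version of `title` that is safe to use as a file name."""
    # Replace characters that Windows or POSIX file systems reject
    cleaned = re.sub(r'[\\/:*?"<>|\x00-\x1f]', "_", title)
    # Windows also rejects names that end in a space or a dot
    cleaned = cleaned.strip().rstrip(". ")
    # Fall back to a fixed name if nothing usable is left
    return cleaned or fallback

print(safe_filename("Part 1: The/Beginning?"))  # Part 1_ The_Beginning_
```

With this helper, save_to_txt could open f'{safe_filename(title)}.txt' instead of f'{title}.txt'.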

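Notes:

This example only works for novel sites with a simple structure: a single page with the title in an h1 element and each chapter in a div with class chapter that contains an h2 heading and a div with class content. If the target site has a more complex HTML layout or uses anti-scraping measures, additional handling is required.
When scraping, follow the site's robots.txt rules and avoid putting excessive load on the site.
When collecting data with a scraper, respect personal data protection and do not collect sensitive data.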
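
The robots.txt check mentioned above can be done with the standard library's urllib.robotparser module. The sketch below parses robots.txt rules supplied as text, so it runs without network access. The helper name is_allowed and the sample rules are hypothetical and serve only as an illustration.

```python
from urllib.robotparser import RobotFileParser

def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if the robots.txt text permits `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Hypothetical robots.txt content, for illustration only
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(SAMPLE_ROBOTS, "http://www.example.com/novel"))          # True
print(is_allowed(SAMPLE_ROBOTS, "http://www.example.com/private/draft"))  # False
```

Against a live site, the rules can be loaded by calling set_url() with the address of the site's robots.txt and then read() on the RobotFileParser object, before calling can_fetch().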

Hopefully this tutorial helps you write a Python scraper that fetches novel content and saves it in TXT format.