Python Web Scraping in Practice: A Guide to Batch-Downloading Images

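Want to learn how to batch-download images from a website with Python? This tutorial walks through a short code example to get you started with image scraping.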

Code Example

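The code below uses the requests library to fetch the page, uses BeautifulSoup to parse the HTML and extract the image links, and then downloads each image to a local directory.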

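import os
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

def download_image(url, save_path):
    # Stream the response so the image is written in chunks rather than held in memory
    response = requests.get(url, stream=True, timeout=10)
    if response.status_code == 200:
        with open(save_path, 'wb') as file:
            for chunk in response.iter_content(1024):
                file.write(chunk)

def crawl_images(url, save_directory):
    response = requests.get(url, timeout=10)
    if response.status_code == 200:
        soup = BeautifulSoup(response.text, 'html.parser')
        image_tags = soup.find_all('img')
        for image_tag in image_tags:
            src = image_tag.get('src')
            if not src:
                continue  # skip <img> tags that have no src attribute
            # Resolve relative links against the page URL
            image_url = urljoin(url, src)
            if not image_url.startswith(('http://', 'https://')):
                continue  # skip data: URIs and other non-HTTP sources
            # Use the last path segment, without any query string, as the file name
            image_name = image_url.split('?')[0].split('/')[-1]
            if not image_name:
                continue  # the URL ends with '/', so there is no usable file name
            save_path = os.path.join(save_directory, image_name)
            download_image(image_url, save_path)

# Set the URL of the site to crawl and the directory to save images in
url = 'https://example.com'
save_directory = 'images'

# Create the save directory if it does not already exist
os.makedirs(save_directory, exist_ok=True)

# Start crawling
crawl_images(url, save_directory)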

How the Code Works

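Import the required libraries, including requests, BeautifulSoup, and os.
download_image function: takes an image URL and a save path, downloads the image, and writes it to that path.
crawl_images function: takes a page URL and a save directory, fetches the page, parses the HTML, extracts the image links, and calls download_image for each one.
Set the target site URL and the directory where images are saved.
Create the save directory if it does not exist.
Call crawl_images to start the crawl.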
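
Real pages often use relative src values, and image URLs often carry query strings, so turning a link into a download URL and a local file name deserves some care. The following is a minimal sketch of one way to do this with the standard library only. The helper names resolve_image_url and filename_from_url, and the "image" fallback name, are assumptions made for illustration and are not part of the script above.

```python
import posixpath
from urllib.parse import unquote, urljoin, urlparse


def resolve_image_url(page_url, src):
    """Turn a possibly relative src value into an absolute URL."""
    return urljoin(page_url, src)


def filename_from_url(image_url, fallback="image"):
    """Derive a local file name from the URL path, ignoring query and fragment."""
    path = urlparse(image_url).path
    name = posixpath.basename(unquote(path))
    # Fall back when the path has no usable final segment
    if name in ("", ".", ".."):
        return fallback
    return name


if __name__ == "__main__":
    page = "https://example.com/gallery/index.html"
    absolute = resolve_image_url(page, "img/cat.png?size=large")
    print(absolute)
    print(filename_from_url(absolute))
```

Inside crawl_images, these helpers could replace the direct use of the src value and the split('/') call. Note that two different URLs can still map to the same file name, in which case the later download overwrites the earlier one.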

Things to Keep in Mind

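This is only a simple example. Site structure and scraping rules vary from site to site, so you will likely need to adapt the code to the target site.
When scraping images, make sure you comply with applicable laws and the site's terms of use, and respect the rules in its robots.txt file.
To avoid putting too much load on the target site, set a reasonable request rate and use request timeouts.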
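
The robots.txt and request-rate points above can be sketched with the standard library's urllib.robotparser and time modules. This is a minimal illustration under stated assumptions, not a complete politeness policy: the helper names build_robot_parser, filter_allowed, and polite_iter, the user agent string "my-image-crawler", and the one-second default delay are all hypothetical. The robots.txt text is passed in as a string here; in a real crawl it would first be fetched from the site's /robots.txt.

```python
import time
from urllib.robotparser import RobotFileParser


def build_robot_parser(robots_txt):
    """Parse robots.txt content that has already been downloaded as text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def filter_allowed(parser, urls, user_agent="my-image-crawler"):
    """Keep only the URLs that robots.txt allows for this user agent."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


def polite_iter(urls, delay_seconds=1.0, sleep=time.sleep):
    """Yield URLs one at a time, pausing between consecutive items."""
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)
        yield url


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /private/\n"
    parser = build_robot_parser(rules)
    candidates = [
        "https://example.com/images/a.jpg",
        "https://example.com/private/b.jpg",
    ]
    for url in polite_iter(filter_allowed(parser, candidates), delay_seconds=0.1):
        print("would download:", url)
```

The sleep parameter of polite_iter exists so the delay can be swapped out in tests. A fixed delay is the simplest choice; a site that declares a Crawl-delay in robots.txt may call for a longer pause.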

Hopefully this tutorial helps you get started with image scraping in Python!