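Below is a simple crawler example that scrapes anime (二次元) desktop wallpapers from an ACG website and saves them to a local folder. It discards wallpapers below 1920x1080 and keeps only those at or above that resolution.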

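import os
from io import BytesIO
from urllib.parse import urljoin, urlparse
import re

import requests
from PIL import Image  # Pillow (pip install Pillow), used to read the pixel dimensions

url = 'https://www.acgarea.com/'
MIN_WIDTH, MIN_HEIGHT = 1920, 1080

def save_image(content, path):
    # Write the already-downloaded image bytes to disk.
    with open(path, 'wb') as f:
        f.write(content)

def get_image_urls():
    # Collect the value of every double-quoted data-src attribute in the page HTML.
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return re.findall(r'data-src="([^"]+)"', response.text)

def is_large_enough(content):
    # Compare the real pixel dimensions, not the file size in bytes.
    try:
        with Image.open(BytesIO(content)) as img:
            width, height = img.size
    except OSError:
        return False  # the response body is not a readable image
    return width >= MIN_WIDTH and height >= MIN_HEIGHT

def main():
    os.makedirs('images', exist_ok=True)  # make sure the output folder exists
    for image_url in get_image_urls():
        # Resolve relative and protocol-relative URLs against the page URL.
        image_url = urljoin(url, image_url)
        response = requests.get(image_url, timeout=10)
        if response.status_code != 200:
            continue
        if is_large_enough(response.content):
            # Take the file name from the URL path, ignoring any query string.
            filename = os.path.basename(urlparse(image_url).path)
            path = os.path.join('images', filename)
            save_image(response.content, path)
            print(f'Saved image {filename}')

if __name__ == '__main__':
    main()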

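The crawler first sends a GET request to fetch the site's HTML, then searches the HTML for every image element that carries a data-src attribute and collects its URL. Next, it iterates over the image URLs, downloads the wallpapers whose resolution is 1920x1080 or higher, and saves them to a local folder.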
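Searching the HTML for data-src can also be done with an HTML parser rather than raw string matching, which tends to cope better with attribute order, quoting style, and several tags on one line. The following is a minimal sketch using only the standard library. It assumes the page marks its images with a data-src attribute on img tags, as described above. The names DataSrcCollector and extract_image_urls are hypothetical, and the sample HTML is made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class DataSrcCollector(HTMLParser):
    """Collects the data-src value of every <img> tag (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        if tag != 'img':
            return
        for name, value in attrs:
            if name == 'data-src' and value:
                # Resolve relative paths against the page URL.
                self.urls.append(urljoin(self.base_url, value))


def extract_image_urls(html, base_url):
    """Return absolute image URLs found in data-src attributes, in page order."""
    collector = DataSrcCollector(base_url)
    collector.feed(html)
    return collector.urls


if __name__ == '__main__':
    # Made-up sample markup, not taken from the real site.
    sample = (
        '<div><img data-src="/a/1.jpg">'
        '<img data-src="https://cdn.example.com/2.png" />'
        '<img src="x.gif"></div>'
    )
    for image_url in extract_image_urls(sample, 'https://www.acgarea.com/'):
        print(image_url)
```

In the crawler, the HTML fetched by the GET request would be passed to extract_image_urls together with the page URL, and the returned list would take the place of the hand-split data-src values.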

Python Crawler in Practice: Automatically Downloading Anime Wallpapers (1920x1080 and Above)

Original URL: https://www.cveoy.top/t/topic/nhuk. Copyright belongs to the author. Do not repost or scrape.
