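Python crawler: the first 100 Baidu images for "car"
The following Python crawler fetches the first 100 images from the Baidu image search results for "汽车" ("car"):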
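import os

import requests
from bs4 import BeautifulSoup

# Assumption: a browser-like User-Agent, since some sites reject the requests default.
HEADERS = {"User-Agent": "Mozilla/5.0"}


def get_image_urls(search_term):
    """Return up to 100 image URLs found in the search results page."""
    url = "https://image.baidu.com/search/index"
    # Passing params lets requests percent-encode the non-ASCII search term.
    params = {"tn": "baiduimage", "word": search_term}
    response = requests.get(url, params=params, headers=HEADERS, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    # Only <img class="main_img"> tags present in the static HTML are found here;
    # images injected later by JavaScript will not appear in response.text.
    img_tags = soup.find_all("img", class_="main_img")
    img_urls = [img["src"] for img in img_tags if img.get("src")]
    return img_urls[:100]


def download_images(search_term):
    """Download each image into a folder named after the search term."""
    img_urls = get_image_urls(search_term)
    os.makedirs(search_term, exist_ok=True)
    for i, url in enumerate(img_urls):
        response = requests.get(url, headers=HEADERS, timeout=10)
        if not response.ok:
            print(f"Skipped {url}: HTTP {response.status_code}")
            continue
        file_name = os.path.join(search_term, f"{search_term}_{i}.jpg")
        with open(file_name, "wb") as f:
            f.write(response.content)
        print(f"Downloaded {file_name}")


if __name__ == "__main__":
    search_term = "汽车"
    download_images(search_term)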
The program first defines a get_image_urls function, which takes a search term as its argument and returns the URLs of up to the first 100 matching images.
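The extraction step inside get_image_urls can be tried offline. The following is a minimal sketch that swaps BeautifulSoup for the standard library's html.parser so it runs without third-party packages. The names ImgSrcCollector, extract_image_urls, and sample_html are hypothetical, and the sample markup is invented for illustration; it is not what Baidu actually returns.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collect src attributes from <img> tags that carry a given CSS class."""

    def __init__(self, css_class, limit):
        super().__init__()
        self.css_class = css_class
        self.limit = limit
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # Ignore non-image tags and stop collecting once the limit is reached.
        if tag != "img" or len(self.urls) >= self.limit:
            return
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        src = attr_map.get("src")
        # Keep only tags with the wanted class and a non-empty src.
        if self.css_class in classes and src:
            self.urls.append(src)


def extract_image_urls(html, css_class="main_img", limit=100):
    """Return at most `limit` image URLs found in the given HTML string."""
    parser = ImgSrcCollector(css_class, limit)
    parser.feed(html)
    return parser.urls


# Invented sample markup (an assumption, for illustration only).
sample_html = """
<ul>
  <li><img class="main_img" src="https://example.com/a.jpg"></li>
  <li><img class="thumb" src="https://example.com/b.jpg"></li>
  <li><img class="main_img big"></li>
  <li><img class="main_img big" src="https://example.com/c.jpg"></li>
</ul>
"""

print(extract_image_urls(sample_html))
```

Tags without a src attribute are skipped rather than raising a KeyError, which is the main difference from indexing img["src"] directly.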
It then defines a download_images function, which calls get_image_urls to obtain the image URLs, downloads each image, and saves it in a folder named after the search term.
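The save step can be separated from the network request, which makes the folder and file-naming logic easy to check on its own. This is a rough sketch under that assumption: save_images is a hypothetical helper, the byte strings stand in for real image data, and the ASCII term "car" is used only to keep the example portable.

```python
import os
import tempfile


def save_images(search_term, image_blobs, base_dir="."):
    """Write each blob to <base_dir>/<search_term>/<search_term>_<i>.jpg."""
    folder = os.path.join(base_dir, search_term)
    # exist_ok=True avoids a separate os.path.exists check.
    os.makedirs(folder, exist_ok=True)
    saved = []
    for i, blob in enumerate(image_blobs):
        file_name = os.path.join(folder, f"{search_term}_{i}.jpg")
        with open(file_name, "wb") as f:
            f.write(blob)
        saved.append(file_name)
    return saved


# Placeholder bytes stand in for downloaded image content.
with tempfile.TemporaryDirectory() as tmp:
    paths = save_images("car", [b"fake-bytes-0", b"fake-bytes-1"], base_dir=tmp)
    print([os.path.basename(p) for p in paths])
```

In a full crawler, the blobs would come from response.content for each URL; keeping the write step in its own function lets it be exercised without touching the network.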
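Finally, in the program's `if __name__ == "__main__":` block, the search term is set to "汽车" ("car") and download_images is called to download the matching images.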
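Because the search term is non-ASCII, it has to be percent-encoded before it goes into the query string. A small sketch of that step using the standard library follows; build_search_url and BASE_URL are hypothetical names, and the tn and word parameters are simply the ones used in the script above.

```python
from urllib.parse import urlencode

BASE_URL = "https://image.baidu.com/search/index"


def build_search_url(search_term):
    """Build the search URL with the term percent-encoded as UTF-8."""
    query = urlencode({"tn": "baiduimage", "word": search_term})
    return f"{BASE_URL}?{query}"


print(build_search_url("汽车"))
# prints https://image.baidu.com/search/index?tn=baiduimage&word=%E6%B1%BD%E8%BD%A6
```

Passing a params dictionary to requests.get achieves the same encoding without building the string by hand.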
Original source: https://www.cveoy.top/t/topic/9Y9. Copyright belongs to the author. Please do not repost or scrape.