Below is example Python code that filters working proxy IPs from a proxy-list website and writes them to a file:

import requests
from bs4 import BeautifulSoup

# Fetch the proxy IP list
url = 'https://www.xicidaili.com/nn/'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()
soup = BeautifulSoup(response.text, 'html.parser')
table = soup.find('table', {'id': 'ip_list'})
trs = table.find_all('tr')[1:]  # the first row is the table header, so skip it
ips = []
for tr in trs:
    tds = tr.find_all('td')
    ip = tds[1].text + ':' + tds[2].text  # columns: IP address, port
    ips.append(ip)

# Filter for working proxy IPs
valid_ips = []
for ip in ips:
    try:
        # Most free proxies are plain HTTP proxies; HTTPS traffic is tunneled
        # through them as well, so both entries use the http:// scheme.
        proxies = {'http': 'http://' + ip, 'https': 'http://' + ip}
        response = requests.get('https://www.baidu.com', proxies=proxies, timeout=5)
        if response.status_code == 200:
            valid_ips.append(ip)
    except requests.RequestException:  # connection errors, timeouts, dead proxies
        pass

# Write the working proxy IPs to a file
with open('valid_ips.txt', 'w') as f:
    for ip in valid_ips:
        f.write(ip + '\n')

The code above first fetches the proxy IP list, then iterates over each IP, builds a proxy dictionary from it, and sends a test request; if the request succeeds, the IP is added to the list of working proxies. Finally, the working proxy IPs are written to a file.
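Checking proxies one at a time spends up to the full timeout on every dead proxy, so validating a long list is slow. A common speed-up is to run the checks concurrently. The sketch below uses `concurrent.futures.ThreadPoolExecutor`; the helper names `check_proxy` and `filter_proxies` are illustrative, not part of the original code.

```python
from concurrent.futures import ThreadPoolExecutor

import requests


def check_proxy(ip, test_url='https://www.baidu.com', timeout=5):
    """Return True if the proxy at ip (e.g. '1.2.3.4:8080') answers a test request."""
    proxies = {'http': 'http://' + ip, 'https': 'http://' + ip}
    try:
        return requests.get(test_url, proxies=proxies, timeout=timeout).status_code == 200
    except requests.RequestException:
        return False


def filter_proxies(ips, checker=check_proxy, max_workers=20):
    """Check all proxies concurrently, preserving input order of the survivors."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() keeps results in the same order as the input list
        results = list(pool.map(checker, ips))
    return [ip for ip, ok in zip(ips, results) if ok]
```

The `checker` parameter is injectable so the filtering logic can be exercised without network access; in the script above you would simply replace the sequential loop with `valid_ips = filter_proxies(ips)`.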


