Python Program to Extract All Links from a Website and Save Them to a Text File
Here's a Python program that extracts all link URLs from a website's HTML and saves them to a text file:
import requests
from bs4 import BeautifulSoup

# Set the URL of the website to be scraped
url = 'https://example.com'

# Send a request to the URL and get the HTML content
response = requests.get(url)
response.raise_for_status()  # fail early on HTTP errors
html = response.content

# Parse the HTML content with BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')

# Find all the <a> tags in the HTML content
links = soup.find_all('a')

# Loop through the links and collect their URLs
link_urls = []
for link in links:
    link_url = link.get('href')
    # Skip <a> tags with no href, and keep only absolute URLs
    if link_url and (link_url.startswith('http') or link_url.startswith('www')):
        link_urls.append(link_url)

# Write the collected link URLs to 'links.txt', one per line
with open('links.txt', 'w') as f:
    for link_url in link_urls:
        f.write(link_url + '\n')

print('All links extracted and saved to links.txt')
This program uses the requests library to fetch the page's HTML content. BeautifulSoup then parses that HTML and finds all the &lt;a&gt; tags. The program extracts each tag's href value, skipping tags that have none and keeping only absolute URLs, and stores the results in a list. Finally, it writes the collected URLs to a file named 'links.txt', one per line.
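Note that the filter above discards relative links such as '/about', which are common in real pages. As a sketch of how to keep them, the standard library's urllib.parse.urljoin can resolve each href against the page's base URL; the snippet below demonstrates this on a small in-memory HTML string (hypothetical markup) using only the standard-library HTMLParser, so it runs without network access:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkParser(HTMLParser):
    """Collects href values from <a> tags, resolved against a base URL."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value:
                    # urljoin turns relative paths into absolute URLs
                    # and leaves already-absolute URLs unchanged
                    self.links.append(urljoin(self.base_url, value))

# Hypothetical page fragment with one relative and one absolute link
html = '<a href="/about">About</a> <a href="https://other.org/page">Other</a>'
parser = LinkParser('https://example.com')
parser.feed(html)
print(parser.links)
# → ['https://example.com/about', 'https://other.org/page']
```

The same urljoin call can be dropped into the BeautifulSoup loop above (urljoin(url, link_url)) to capture relative links there as well.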