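1. Use the requests library to send a GET request to the problem URL, authenticating with a cookie and routing the request through a proxy.
import requests

url = "problem URL"  # placeholder: the page that lists the problems
proxies = {"http": "proxy address", "https": "proxy address"}  # cover both schemes
# requests expects cookie name/value pairs, not a single "Cookie" key
cookies = {"cookie name": "cookie value"}

response = requests.get(url, proxies=proxies, cookies=cookies, timeout=10)
response.raise_for_status()  # stop early on 4xx/5xx responses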
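As a minimal sketch of the request setup above, the proxy and cookie can be attached once to a `requests.Session` so that later requests reuse them. The helper name `build_session`, the proxy address, and the cookie name are assumptions for illustration and do not come from the original answer.

```python
import requests


def build_session(proxy_url, cookie_name, cookie_value):
    """Return a Session configured with the given proxy and cookie."""
    session = requests.Session()
    # Use the same proxy for plain HTTP and for HTTPS traffic.
    session.proxies.update({"http": proxy_url, "https": proxy_url})
    # Cookies are stored as name/value pairs, not as one raw "Cookie" string.
    session.cookies.set(cookie_name, cookie_value)
    return session


# Hypothetical values; replace them with the real proxy address and login cookie.
session = build_session("http://127.0.0.1:8080", "sessionid", "your-cookie-value")
```

With a session like this, `session.get(url, timeout=10)` would take the place of the repeated `proxies=` and `cookies=` arguments.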
2. Parse the response with the BeautifulSoup library and collect all <ul> elements.
from bs4 import BeautifulSoup

soup = BeautifulSoup(response.text, "html.parser")
ul_elements = soup.find_all("ul")
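3. Iterate over each <ul> element and get its second-level heading and its <li> elements.
for ul_element in ul_elements:
    h2_element = ul_element.find("h2")  # None if the list contains no <h2>
    li_elements = ul_element.find_all("li")

    # Process the heading and the <li> elements here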
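The parsing steps above can be tried offline against a small HTML string. This is only a sketch: the markup in `SAMPLE_HTML` and the helper name `collect_sections` are hypothetical, and the fallback to the nearest preceding `<h2>` is an assumption about how the real page might be laid out.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real problem-list page.
SAMPLE_HTML = """
<h2>Arrays</h2>
<ul>
  <li><a href="/problems/101">Two Sum</a></li>
  <li><a href="/problems/102">Rotate Array</a></li>
</ul>
<h2>Strings</h2>
<ul>
  <li>No link here</li>
</ul>
"""


def collect_sections(html):
    """Return one (heading, [(title, href), ...]) tuple per <ul> element."""
    soup = BeautifulSoup(html, "html.parser")
    sections = []
    for ul in soup.find_all("ul"):
        # Look inside the list first, then fall back to the nearest preceding <h2>.
        h2 = ul.find("h2")
        if h2 is None:
            h2 = ul.find_previous("h2")
        heading = h2.get_text(strip=True) if h2 is not None else None
        links = []
        for li in ul.find_all("li"):
            a = li.find("a", href=True)
            if a is None:
                continue  # skip list items without a link
            links.append((a.get_text(strip=True), a["href"]))
        sections.append((heading, links))
    return sections


sections = collect_sections(SAMPLE_HTML)
```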
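4. Parse each <li> element to get the question's link, title, and ID, and create the corresponding directory.
import os
from urllib.parse import urljoin

for li_element in li_elements:
    a_element = li_element.find("a", href=True)
    if a_element is None:
        continue  # skip list items without a link
    question_link = urljoin(url, a_element["href"])  # resolve relative links
    question_title = a_element.get_text(strip=True)
    question_id = question_link.rstrip("/").split("/")[-1]

    # Create the corresponding directory
    os.makedirs(question_id, exist_ok=True)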
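Taking the last path segment as the ID can go wrong when a link has a trailing slash or a query string. The sketch below shows one way to handle both using only the standard library. The helper names and the example URL are hypothetical, and it assumes the ID really is the final path segment of the link.

```python
import os
import tempfile
from urllib.parse import urljoin, urlparse


def question_id_from_link(base_url, href):
    """Resolve href against base_url and return (absolute_url, last path segment)."""
    absolute = urljoin(base_url, href)
    # Use only the path, so a query string or trailing slash cannot end up in the ID.
    path = urlparse(absolute).path.rstrip("/")
    return absolute, path.split("/")[-1]


def make_question_dir(root, question_id):
    """Create root/question_id if it does not exist yet and return its path."""
    path = os.path.join(root, question_id)
    os.makedirs(path, exist_ok=True)
    return path


# Hypothetical link with both a trailing slash and a query string.
link, qid = question_id_from_link("https://example.com/list", "/problems/101/?lang=en")

# A temporary directory keeps the demonstration from touching real files.
with tempfile.TemporaryDirectory() as root:
    question_dir = make_question_dir(root, qid)
    created = os.path.isdir(question_dir)
```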
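5. Send another GET request to fetch the question's content and save it to a content.md file in the question's directory.
import os

response = requests.get(question_link, proxies=proxies, cookies=cookies, timeout=10)
response.raise_for_status()
content = response.text  # raw HTML of the question page, saved as-is

os.makedirs(question_id, exist_ok=True)  # no-op if the directory already exists
with open(os.path.join(question_id, "content.md"), "w", encoding="utf-8") as file:
    file.write(content)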
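6. Get the input and output of each test case and save them to the corresponding files.
for index, li_element in enumerate(li_elements, start=1):
    # "input" and "output" are placeholder tag names; adjust them to the page's real markup
    input_element = li_element.find("input")
    output_element = li_element.find("output")
    if input_element is None or output_element is None:
        continue  # skip items that do not hold a test case

    input_text = input_element.text
    output_text = output_element.text

    # Number the files so later test cases do not overwrite earlier ones
    with open(f"input_{index}.txt", "w", encoding="utf-8") as file:
        file.write(input_text)

    with open(f"output_{index}.txt", "w", encoding="utf-8") as file:
        file.write(output_text)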
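Separating the file writing from the HTML parsing makes the last step easier to check on its own. The following is a sketch under assumptions: the helper `save_test_cases` and the `input_N.txt` / `output_N.txt` naming scheme are hypothetical choices, not something the original answer specifies.

```python
import os
import tempfile


def save_test_cases(directory, cases):
    """Write each (input_text, output_text) pair to input_N.txt and output_N.txt."""
    os.makedirs(directory, exist_ok=True)
    written = []
    for index, (input_text, output_text) in enumerate(cases, start=1):
        in_path = os.path.join(directory, f"input_{index}.txt")
        out_path = os.path.join(directory, f"output_{index}.txt")
        with open(in_path, "w", encoding="utf-8") as file:
            file.write(input_text)
        with open(out_path, "w", encoding="utf-8") as file:
            file.write(output_text)
        written.append((in_path, out_path))
    return written


# Two made-up test cases written into a temporary directory.
with tempfile.TemporaryDirectory() as root:
    paths = save_test_cases(root, [("1 2\n", "3\n"), ("5 7\n", "12\n")])
    names = sorted(os.listdir(root))
    with open(paths[1][1], encoding="utf-8") as file:
        second_output = file.read()
```

In the scraper itself, `directory` would be the per-question directory created in step 4, and `cases` would come from the text of the parsed elements.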

