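Write Python code that crawls data from the NetEase Cloud Music website. Main task: design a windowed (GUI) application with the following features: 1. Load the required third-party libraries, such as requests, BeautifulSoup4, lxml, sqlite3, jieba, WordCloud, and openpyxl. Save the information to an Excel sheet and display the processed information.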
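Because this task involves crawling website data, we first need the site's API endpoint. The NetEase Cloud Music endpoint for playlist details is:
https://music.163.com/api/playlist/detail?id=<playlist ID>
The playlist ID can be found on the NetEase Cloud Music website. For example, in https://music.163.com/#/playlist?id=2193993710 the playlist ID is 2193993710.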
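The ID can also be pulled out of a playlist URL programmatically. The following is a minimal sketch using only the standard library; the helper names `extract_playlist_id` and `build_api_url` are hypothetical and introduced here purely for illustration.

```python
from urllib.parse import urlparse, parse_qs


def extract_playlist_id(url: str) -> str:
    """Return the `id` query parameter from a playlist URL."""
    parsed = urlparse(url)
    # Web URLs keep the route after '#', e.g. '#/playlist?id=123', so the
    # query string may sit inside the fragment instead of parsed.query.
    query = parsed.query
    if not query and "?" in parsed.fragment:
        query = parsed.fragment.split("?", 1)[1]
    ids = parse_qs(query).get("id")
    if not ids:
        raise ValueError(f"no playlist id found in {url!r}")
    return ids[0]


def build_api_url(playlist_id: str) -> str:
    """Fill the playlist ID into the API endpoint shown above."""
    return f"https://music.163.com/api/playlist/detail?id={playlist_id}"


playlist_id = extract_playlist_id("https://music.163.com/#/playlist?id=2193993710")
print(build_api_url(playlist_id))
```

Handling the fragment separately matters because `urlparse` treats everything after `#` as the fragment, so the `id` parameter would otherwise be missed.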
Next, we use the requests library to send a request to the API endpoint and fetch the playlist information. The code is as follows:
import requests

url = "https://music.163.com/api/playlist/detail?id=2193993710"
# Send a browser-like User-Agent; the default requests one may be rejected.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}
response = requests.get(url, headers=headers)
print(response.text)
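Since an API endpoint like this normally answers with JSON rather than HTML, the body can be decoded directly. The sketch below assumes a response layout of `result` → `tracks`, with each track carrying a `name` and a list of `artists`; that layout is an assumption, not something the source confirms, so check it against the real output of `response.json()`. The function name `extract_tracks` and the sample payload are made up for illustration.

```python
import json


def extract_tracks(payload: dict) -> list:
    """Return (title, artists) tuples from a decoded playlist payload.

    Assumed layout (not confirmed by the source):
    {"result": {"tracks": [{"name": ..., "artists": [{"name": ...}]}]}}
    """
    result = payload.get("result") or {}
    tracks = result.get("tracks") or []
    rows = []
    for track in tracks:
        artists = ", ".join(a.get("name", "") for a in track.get("artists", []))
        rows.append((track.get("name", ""), artists))
    return rows


# Hand-written sample standing in for response.json(); not real API output.
sample = json.loads(
    '{"code": 200, "result": {"tracks": ['
    '{"name": "Song A", "artists": [{"name": "Artist 1"}, {"name": "Artist 2"}]}'
    ']}}'
)
print(extract_tracks(sample))
```

With a live response, `extract_tracks(response.json())` would take the place of the sample. The `.get(...)` lookups keep the function from raising if a key is missing.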
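The API response is JSON, not HTML, so BeautifulSoup will not find any song links in it. To extract the songs with BeautifulSoup, request the playlist page itself (the playlist URL without the `#`). This assumes the page still lists its songs in a hidden <ul class="f-hide"> element. The code is as follows:
from bs4 import BeautifulSoup

page_url = "https://music.163.com/playlist?id=2193993710"
page = requests.get(page_url, headers=headers)
soup = BeautifulSoup(page.text, 'lxml')
# Guard against a missing list so the lookup does not raise on layout changes.
song_list = soup.find('ul', {'class': 'f-hide'})
songs = song_list.find_all('a') if song_list else []
for song in songs:
    print(song.text)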
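Finally, we save the song information to an Excel sheet, which can be done with the openpyxl library. The code is as follows:
import openpyxl

wb = openpyxl.Workbook()
sheet = wb.active
sheet.title = "歌单"
# Write one song title per row in column A (openpyxl rows are 1-based).
for i, song in enumerate(songs):
    sheet.cell(row=i+1, column=1, value=song.text)
wb.save("歌单.xlsx")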
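The task also lists sqlite3 among the libraries to load. As a rough sketch of how the same titles could be stored in a database alongside the Excel file, the example below uses an in-memory SQLite database; the table name `songs`, the helper `save_songs`, and the sample titles are assumptions made for illustration.

```python
import sqlite3


def save_songs(conn: sqlite3.Connection, titles: list) -> int:
    """Insert song titles into a `songs` table and return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS songs ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "title TEXT NOT NULL)"
    )
    # Parameterized inserts avoid quoting problems in song titles.
    conn.executemany("INSERT INTO songs (title) VALUES (?)", [(t,) for t in titles])
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM songs").fetchone()[0]


# ":memory:" keeps the demo self-contained; use a file path to persist data.
conn = sqlite3.connect(":memory:")
count = save_songs(conn, ["Song A", "Song B"])
print(count)
```

In the scraper, `[song.text for song in songs]` would be passed in place of the sample titles.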
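The complete code is as follows. It requests the playlist page rather than the JSON API, because BeautifulSoup needs HTML, and it assumes the page lists its songs in a hidden <ul class="f-hide"> element:
import requests
from bs4 import BeautifulSoup
import openpyxl

url = "https://music.163.com/playlist?id=2193993710"
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}
response = requests.get(url, headers=headers)

# Collect the song links; fall back to an empty list if the layout changed.
soup = BeautifulSoup(response.text, 'lxml')
song_list = soup.find('ul', {'class': 'f-hide'})
songs = song_list.find_all('a') if song_list else []

# Write one song title per row in column A and save the workbook.
wb = openpyxl.Workbook()
sheet = wb.active
sheet.title = "歌单"
for i, song in enumerate(songs):
    sheet.cell(row=i+1, column=1, value=song.text)
wb.save("歌单.xlsx")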
Original source: https://www.cveoy.top/t/topic/hnCp. Copyright belongs to the author. Please do not repost or scrape.