Extracting and Printing User Review Data from an Excel File with Pandas

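The following code shows how to use Python's Pandas library to read user review data from an Excel file and, for each user, print the sentences found in each review field.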

import pandas as pd

# Read the Excel file (raw string, so the backslashes in the Windows path are not treated as escape sequences)
df = pd.read_excel(r'C:\Users\86186\Desktop\汽车之家_秦plus_评论.xls')

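# Initialize the list of dictionaries (one per user)
sentences = []

# Review fields are the columns from '最满意' onward
columns = df.columns.tolist()
start_index = columns.index('最满意')

# Iterate over every row
for index, row in df.iterrows():
    if pd.notnull(row['用户昵称']) and pd.notnull(row['最满意']):
        sentence_dict = {}
        # Guard against reading past the last row when looking one row ahead
        has_next_row = index + 1 < len(df)
        for column in columns[start_index:]:
            if column == '智能化' and has_next_row and pd.notnull(df.iloc[index + 1]['智能化']):
                # '智能化' is stored on the following row
                value = df.iloc[index + 1][column]
            else:
                value = row[column]
            # Split on the full-width comma and trim whitespace
            sentence_dict[column] = [sentence.strip() for sentence in str(value).split('，')]
        sentences.append(sentence_dict)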

# Print the results
for i, sentence_dict in enumerate(sentences):
    print(f'User nickname #{i+1}:')
    for column, sentences_list in sentence_dict.items():
        print(column + ':')
        for sentence in sentences_list:
            print('- ' + sentence)
        print()

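This code prints one block per user and lists each field's sentences one per line. The counter i from enumerate numbers the users, so each block's header shows the user's ordinal position rather than the nickname itself.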
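To try the extraction logic without the Excel file, the same steps can be run on a small in-memory DataFrame. The following is only a sketch: the function name extract_sentences, its parameters, and the sample rows are invented for illustration and are not part of the original script.

```python
import pandas as pd

def extract_sentences(df, first_col='最满意', next_row_col='智能化', sep='，'):
    """Return one dict per user row, mapping each review column to a list of sentences."""
    columns = df.columns.tolist()
    review_cols = columns[columns.index(first_col):]
    result = []
    for pos in range(len(df)):
        row = df.iloc[pos]
        # A row starts a user entry only if both the nickname and first_col are present
        if pd.isnull(row['用户昵称']) or pd.isnull(row[first_col]):
            continue
        entry = {}
        for col in review_cols:
            value = row[col]
            # Take next_row_col from the following row when that row exists and has a value
            if col == next_row_col and pos + 1 < len(df):
                nxt = df.iloc[pos + 1][col]
                if pd.notnull(nxt):
                    value = nxt
            entry[col] = [s.strip() for s in str(value).split(sep)]
        result.append(entry)
    return result

# Invented sample data: the second row only carries the '智能化' text for the first user
sample = pd.DataFrame({
    '用户昵称': ['user_a', None],
    '最满意': ['外观好看，油耗低', None],
    '智能化': [None, '语音识别快， 车机流畅'],
})

for i, entry in enumerate(extract_sentences(sample), start=1):
    print(f'User {i}:')
    for col, parts in entry.items():
        print(f'{col}: {parts}')
```

Using positional access (range plus iloc) instead of iterrows keeps the "next row" lookup valid even when the DataFrame index is not 0..n-1, for example after filtering.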

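Notes:

Make sure the pandas and xlrd libraries are installed (pandas uses xlrd to read the legacy .xls format), and change the file path to your actual path.
The code assumes the Excel file's column names include '用户昵称' (user nickname) and '最满意' (most satisfied).
The code assumes each user nickname spans multiple rows, with the '智能化' (smart features) field taken from the following row.
The code splits every field, including '智能化', on the full-width comma '，', so it assumes the review text is delimited that way.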
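The assumptions listed above can be checked before any rows are processed. The sketch below uses two hypothetical helpers, missing_columns and split_sentences, that are not part of the original script. Unlike the original, split_sentences also accepts the ASCII comma; that is an added assumption for files with mixed punctuation.

```python
import re

# Column names the script relies on (taken from the notes above)
REQUIRED_COLUMNS = ('用户昵称', '最满意', '智能化')

def missing_columns(column_names, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from column_names."""
    present = set(column_names)
    return [name for name in required if name not in present]

def split_sentences(text):
    """Split on the full-width '，' or the ASCII ',' and drop empty fragments."""
    parts = re.split(r'[，,]', str(text))
    return [p.strip() for p in parts if p.strip()]

print(missing_columns(['用户昵称', '最满意', '空间']))
print(split_sentences('屏幕大, 反应快，语音好用'))
```

In the script, missing_columns(df.columns) could be called right after the file is read, stopping with a clear message if the returned list is not empty.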

Usage:

Save the code as a .py file.
Run the file from a terminal or command prompt.
The script prints each user's review information, grouped by field.

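This example shows how to read an Excel file with Pandas and then process and print its data. You can adapt the code to your own needs to implement more complex functionality.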
