Computing Medians with Python and Pandas and Saving Them to a Text File
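This tutorial shows how to use Python and the Pandas library to compute the median of each feature for the different categories in a dataset, and how to save the results to a text file.
Steps: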
- **Read the data:** Use the pd.read_csv() function to read the CSV file.
import pandas as pd
df = pd.read_csv('./2023_2_20No2/2023_2_20_19.csv', encoding='utf-8')
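- **Select columns:** Select the columns needed for the median calculation. Here iloc[:, 1:33] keeps the columns at positions 1 through 32 (zero-based), which must include the 'name' and 'property' columns used in the later steps.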
df = df.iloc[:, 1:33]
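Positional slices such as 1:33 break silently if the column order of the file changes. As a minimal sketch, assuming a hypothetical miniature CSV with columns id, name, property, f1 and f2 (none of which come from the real data file), the same selection can be written by label:

```python
import io
import pandas as pd

# Hypothetical miniature CSV standing in for the real data file.
csv_text = """id,name,property,f1,f2
0,tumor,A,1.0,10.0
1,peritumor,A,2.0,20.0
"""
sample = pd.read_csv(io.StringIO(csv_text))

# Positional selection, in the style of the tutorial: drop the first column.
by_position = sample.iloc[:, 1:]

# Label-based selection of the same columns (both end labels are included).
by_label = sample.loc[:, 'name':'f2']
```

Both expressions return the same frame here; the label-based form documents which columns are meant.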
- **Filter the data:** Filter the rows by category using the 'name' column, for example 'tumor' and 'peritumor'.
df_tumor = df[df['name'] == 'tumor']
df_peritumor = df[df['name'] == 'peritumor']
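A boolean filter returns an empty DataFrame, not an error, when the label does not match exactly. The sketch below uses a hypothetical three-row frame to show how value_counts() can confirm the spelling of the categories before filtering:

```python
import pandas as pd

# Hypothetical sample data; the real column values may differ.
sample = pd.DataFrame({
    'name': ['tumor', 'peritumor', 'tumor'],
    'property': ['A', 'A', 'B'],
    'f1': [1.0, 2.0, 3.0],
})

# Count how often each category label occurs.
counts = sample['name'].value_counts()

# An exact match keeps the two tumor rows.
tumor_rows = sample[sample['name'] == 'tumor']

# A label with different capitalization matches nothing.
missing = sample[sample['name'] == 'Tumor']
```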
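- **Create dictionaries:** Create dictionaries to store the data for each sub-category. The loop in the next step fills them with the filtered rows for each 'property' value.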
tumor_dict = {}
peritumor_dict = {}
- **Compute the medians:** Loop over each 'property' value and compute the median of every feature column.
for name in ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H']:
    # Rows of each group that belong to the current 'property' value
    df_tumor_name = df_tumor[df_tumor['property'] == name]
    df_peritumor_name = df_peritumor[df_peritumor['property'] == name]
    tumor_dict[name] = df_tumor_name
    peritumor_dict[name] = df_peritumor_name
    # Column-wise medians of the feature columns
    tumor_median = df_tumor_name.iloc[:, 2:33].median()
    peritumor_median = df_peritumor_name.iloc[:, 2:33].median()
    # Turn each Series of medians into a one-row DataFrame
    tumor_median_df = tumor_median.to_frame().transpose()
    peritumor_median_df = peritumor_median.to_frame().transpose()
    print(f'Tumor {name}: {tumor_median_df}')
    print(f'Peritumor {name}: {peritumor_median_df}')
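The explicit loop above can also be expressed with groupby, which is a different technique from the one the tutorial uses. The following is a minimal sketch on hypothetical sample data with two feature columns, f1 and f2:

```python
import pandas as pd

# Hypothetical sample data with the same 'name' and 'property' columns.
sample = pd.DataFrame({
    'name': ['tumor', 'tumor', 'tumor', 'peritumor', 'peritumor'],
    'property': ['A', 'A', 'B', 'A', 'A'],
    'f1': [1.0, 3.0, 5.0, 2.0, 4.0],
    'f2': [10.0, 30.0, 50.0, 20.0, 40.0],
})

# One row of medians per (name, property) pair; non-numeric columns are skipped.
medians = sample.groupby(['name', 'property']).median(numeric_only=True)

# Look up a single value through the two-level index.
tumor_a_f1 = medians.loc[('tumor', 'A'), 'f1']
```

The result is a single DataFrame indexed by (name, property), so no per-category dictionaries are needed.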
- **Save to a text file:** Use a with open() statement to open a text file and write the results to it.
with open('./2023_2_20No2/median_values.txt', 'w') as f:
    for name in ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H']:
        df_tumor_name = df_tumor[df_tumor['property'] == name]
        df_peritumor_name = df_peritumor[df_peritumor['property'] == name]
        tumor_dict[name] = df_tumor_name
        peritumor_dict[name] = df_peritumor_name
        tumor_median = df_tumor_name.iloc[:, 2:33].median()
        peritumor_median = df_peritumor_name.iloc[:, 2:33].median()
        tumor_median_df = tumor_median.to_frame().transpose()
        peritumor_median_df = peritumor_median.to_frame().transpose()
        # End each record with a newline so the entries stay separated
        f.write(f'Tumor {name}: {tumor_median_df}\n')
        f.write(f'Peritumor {name}: {peritumor_median_df}\n')
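Embedding a DataFrame in an f-string writes its printed representation, which pandas can abbreviate for wide frames depending on the display options. As a hedged sketch, DataFrame.to_string() renders every column; the frame below is hypothetical, and an in-memory buffer stands in for the open text file:

```python
import io
import pandas as pd

# Hypothetical one-row frame of medians with 40 feature columns.
median_df = pd.DataFrame([range(40)], columns=[f'f{i}' for i in range(40)])

buffer = io.StringIO()  # stands in for the file object returned by open()
buffer.write('Tumor A:\n')
# to_string() renders all columns regardless of the display settings.
buffer.write(median_df.to_string(index=False))
buffer.write('\n')
text = buffer.getvalue()
```

The same write calls work unchanged on a real file handle.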
Complete code:
import pandas as pd

df = pd.read_csv('./2023_2_20No2/2023_2_20_19.csv', encoding='utf-8')
df = df.iloc[:, 1:33]
df_tumor = df[df['name'] == 'tumor']
df_peritumor = df[df['name'] == 'peritumor']
tumor_dict = {}
peritumor_dict = {}
# Open the file once, before the loop, so each category is appended
# to the output and earlier results are not overwritten.
with open('./2023_2_20No2/median_values.txt', 'w') as f:
    for name in ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H']:
        df_tumor_name = df_tumor[df_tumor['property'] == name]
        df_peritumor_name = df_peritumor[df_peritumor['property'] == name]
        tumor_dict[name] = df_tumor_name
        peritumor_dict[name] = df_peritumor_name
        tumor_median = df_tumor_name.iloc[:, 2:33].median()
        peritumor_median = df_peritumor_name.iloc[:, 2:33].median()
        tumor_median_df = tumor_median.to_frame().transpose()
        peritumor_median_df = peritumor_median.to_frame().transpose()
        f.write(f'Tumor {name}: {tumor_median_df}\n')
        f.write(f'Peritumor {name}: {peritumor_median_df}\n')
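For reuse, the whole procedure can be wrapped in a function. The sketch below is one possible arrangement, not the tutorial's own code: the function name write_medians, its parameters, and the sample data are all hypothetical, and it uses groupby in place of the explicit category list.

```python
import os
import tempfile
import pandas as pd

def write_medians(df, path, group_col='name', sub_col='property'):
    """Write one line of medians per (group, sub-group) pair to a text file."""
    medians = df.groupby([group_col, sub_col]).median(numeric_only=True)
    # Open the file once so every pair ends up in the same output.
    with open(path, 'w', encoding='utf-8') as f:
        for (group, sub), row in medians.iterrows():
            values = ', '.join(f'{col}={float(val)}' for col, val in row.items())
            f.write(f'{group} {sub}: {values}\n')
    return medians

# Hypothetical sample data standing in for the real CSV contents.
sample = pd.DataFrame({
    'name': ['tumor', 'tumor', 'tumor', 'peritumor', 'peritumor'],
    'property': ['A', 'A', 'B', 'A', 'A'],
    'f1': [1.0, 3.0, 5.0, 2.0, 4.0],
    'f2': [10.0, 30.0, 50.0, 20.0, 40.0],
})

out_path = os.path.join(tempfile.mkdtemp(), 'median_values.txt')
write_medians(sample, out_path)

# Read the file back to inspect what was written.
with open(out_path, encoding='utf-8') as f:
    lines = f.read().splitlines()
```

Each (name, property) pair produces one line, in the sorted order that groupby returns.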
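Notes:
- Be sure to replace ./2023_2_20No2/2023_2_20_19.csv with the actual path to your CSV file.
- Replace ./2023_2_20No2/median_values.txt with the path of the text file you want to write.
- Adjust the column indices and category names used in the code to match your data.
- The code above is only an example; adapt the implementation to your own requirements.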