Python Pandas: Calculate and Save Median Values of Tumor and Peritumor Data in CSV Files
This code snippet demonstrates how to calculate median values of different properties in tumor and peritumor data using Python Pandas and save them as separate CSV files.
import pandas as pd

# Load the data and keep the columns at positions 1..32 (drops the first column)
df = pd.read_csv("./2023_2_20No2/2023_2_20_19.csv", encoding='utf-8')
df = df.iloc[:, 1:33]

# Split the rows by the value in the 'name' column
df_tumor = df[df['name'] == 'tumor']
df_peritumor = df[df['name'] == 'peritumor']

tumor_dict = {}
peritumor_dict = {}
for name in ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H']:
    # Rows for this property in each group
    df_tumor_name = df_tumor[df_tumor['property'] == name]
    df_peritumor_name = df_peritumor[df_peritumor['property'] == name]
    tumor_dict[name] = df_tumor_name
    peritumor_dict[name] = df_peritumor_name
    # Column-wise medians over the columns from position 2 onward
    tumor_median = df_tumor_name.iloc[:, 2:33].median()
    peritumor_median = df_peritumor_name.iloc[:, 2:33].median()
    # Turn each median Series into a one-row DataFrame
    tumor_median_df = tumor_median.to_frame().transpose()
    peritumor_median_df = peritumor_median.to_frame().transpose()
    # Save to CSV
    tumor_median_df.to_csv(f"./2023_2_20No2/tumor_median_{name}.csv", index=False)
    peritumor_median_df.to_csv(f"./2023_2_20No2/peritumor_median_{name}.csv", index=False)
Explanation:
- Read CSV: The code starts by reading a CSV file named '2023_2_20_19.csv' located in the './2023_2_20No2/' folder.
- Subset Data: It then keeps the columns at positions 1 through 32 (df.iloc[:, 1:33], which drops the first column) and splits the rows by the 'name' column into 'tumor' and 'peritumor' DataFrames.
- Calculate Median Values: The code iterates through the property values ('A', 'B', 'C', ..., 'H'), filters each group on the 'property' column, and calculates the column-wise median of the columns from position 2 onward for both the tumor and peritumor groups.
- Save to CSV: For each property and group, the median values are converted into DataFrames and saved as separate CSV files in the './2023_2_20No2/' folder, using names like 'tumor_median_A.csv' and 'peritumor_median_B.csv'.
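The filter-then-median steps above can also be written as a single grouped aggregation. The following is a minimal sketch of that alternative, not the original method: it uses groupby instead of the explicit loop, and the toy data and the measurement column names 'm1' and 'm2' are made up for illustration. Only the 'name' and 'property' columns come from the original code.

```python
import pandas as pd

# Hypothetical toy data standing in for the real CSV.
# 'name' and 'property' follow the original code; 'm1' and 'm2' are invented.
toy = pd.DataFrame({
    "name": ["tumor", "tumor", "tumor", "peritumor", "peritumor", "peritumor"],
    "property": ["A", "A", "A", "A", "A", "A"],
    "m1": [1.0, 3.0, 5.0, 2.0, 4.0, 6.0],
    "m2": [10.0, 20.0, 60.0, 5.0, 7.0, 9.0],
})

# One row of medians per (name, property) pair, computed in a single pass.
medians = toy.groupby(["name", "property"])[["m1", "m2"]].median()

# Look up the medians for one group and property.
tumor_a = medians.loc[("tumor", "A")]
print(medians)
```

The result is indexed by (name, property), so a single row can be pulled out with .loc and written to its own file if separate CSVs are still needed.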
Key Points:
- The loop writes one median file per property and group, which gives 16 files for properties 'A' through 'H' across the tumor and peritumor groups.
- Using 'to_csv' with index=False prevents the index column from being saved in the CSV files.
- The code is flexible and can be adapted to analyze different datasets and properties.
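To make the effect of index=False concrete, here is a small sketch that writes the same one-row DataFrame to a string with and without the index. The column names 'm1' and 'm2' and their values are hypothetical; calling to_csv without a path returns the CSV text instead of writing a file.

```python
import pandas as pd

# Hypothetical one-row frame, shaped like a median result.
row = pd.DataFrame({"m1": [3.0], "m2": [20.0]})

# With no path argument, to_csv returns the CSV content as a string.
with_index = row.to_csv()
without_index = row.to_csv(index=False)

# Compare the header lines: the default output has an extra,
# unnamed leading column that holds the row index.
header_with_index = with_index.splitlines()[0]
header_without_index = without_index.splitlines()[0]
print(repr(header_with_index))
print(repr(header_without_index))
```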
Example Output (CSV Files):
- tumor_median_A.csv: The file will contain a single row with the median values for property 'A' in the tumor group.
- peritumor_median_B.csv: The file will contain a single row with the median values for property 'B' in the peritumor group.
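The single-row shape of each output file follows from converting the median Series with to_frame().transpose(). A minimal sketch on made-up data (the columns 'm1' and 'm2' are hypothetical) shows the intermediate shapes:

```python
import pandas as pd

# Hypothetical measurements for one property in one group.
values = pd.DataFrame({"m1": [1.0, 3.0, 5.0], "m2": [10.0, 20.0, 60.0]})

# median() collapses the rows: the result is a Series indexed by column name.
median_series = values.median()

# to_frame() makes a single-column DataFrame; transpose() turns it into
# a single row whose columns are the original column names.
median_row = median_series.to_frame().transpose()

print(median_row.shape)
print(list(median_row.columns))
```

Written with index=False, such a frame produces a CSV with one header line and one data line.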
This code snippet shows a practical use of Pandas for filtering grouped data, summarizing each group with medians, and exporting the results for tumor and peritumor samples.