Continue outputting the remaining code comments
Import the required libraries
import pandas as pd
import matplotlib.pyplot as plt
Read the data
df = pd.read_csv('data.csv')
Preprocess the data
df.dropna(inplace=True)  # drop rows with missing values
df = df[df['age'] > 0]  # drop records whose age is less than or equal to 0
Plot a histogram
plt.hist(df['age'], bins=30)
plt.title('Age Distribution')
plt.xlabel('Age')
plt.ylabel('Frequency')
plt.show()
Compute basic statistics for age
age_mean = df['age'].mean()
age_median = df['age'].median()
age_mode = df['age'].mode()[0]  # mode() returns a Series; take its first entry
age_std = df['age'].std()  # sample standard deviation (pandas default, ddof=1)
print('Mean age: {:.2f}'.format(age_mean))
print('Median age: {:.2f}'.format(age_median))
print('Mode age: {:.2f}'.format(age_mode))
print('Standard deviation of age: {:.2f}'.format(age_std))
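To check what the four statistics above mean without needing data.csv, here is a minimal sketch that computes the same quantities with Python's standard-library statistics module. The ages list is a made-up sample used only for illustration, and statistics.stdev is used because it divides by n - 1, the same default as pandas' std().

```python
import statistics

# Hypothetical sample ages, for illustration only (not taken from data.csv)
ages = [23, 35, 35, 41, 58]

age_mean = statistics.mean(ages)      # arithmetic mean
age_median = statistics.median(ages)  # middle value of the sorted data
age_mode = statistics.mode(ages)      # most frequent value
age_std = statistics.stdev(ages)      # sample standard deviation (n - 1 denominator)

print('Mean age: {:.2f}'.format(age_mean))
print('Median age: {:.2f}'.format(age_median))
print('Mode age: {:.2f}'.format(age_mode))
print('Standard deviation of age: {:.2f}'.format(age_std))
```

On a small list like this the results can be verified by hand, which is a quick way to confirm that the pandas version reports the statistics you expect.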