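Because no supporting material was provided, a complete lab report cannot be written. Only PCA and K-Means code examples are given below for reference.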

PCA dimensionality reduction:

import pandas as pd
from sklearn.decomposition import PCA

# Load the data (first column is the label, the rest are pixel values)
data = pd.read_csv('MNIST_train.csv')

# Extract the pixel columns
pixels = data.iloc[:, 1:]

# Run PCA, keeping enough components to explain 85% of the variance
pca = PCA(n_components=0.85)
pixels_pca = pca.fit_transform(pixels)

# Save the labels together with the reduced features
data_pca = pd.DataFrame(data['label'])
data_pca = pd.concat([data_pca, pd.DataFrame(pixels_pca)], axis=1)
data_pca.to_csv('MNIST_train_pca.csv', index=False)
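
To make concrete what `n_components=0.85` asks for, here is a minimal NumPy sketch (not scikit-learn's actual implementation) that picks the smallest number of components whose cumulative explained-variance ratio exceeds a threshold. The function name `components_for_variance` and the synthetic data are assumptions made for illustration; they do not come from the original answer.

```python
import numpy as np


def components_for_variance(X, threshold=0.85):
    """Return (k, cumulative), where k is the smallest number of principal
    components whose cumulative explained-variance ratio exceeds `threshold`."""
    # PCA works on mean-centered data
    X_centered = X - X.mean(axis=0)
    # Squared singular values are proportional to the per-component variances
    singular_values = np.linalg.svd(X_centered, compute_uv=False)
    variances = singular_values ** 2
    cumulative = np.cumsum(variances / variances.sum())
    # First position where the cumulative ratio is strictly above the threshold
    k = int(np.searchsorted(cumulative, threshold, side="right")) + 1
    return min(k, len(cumulative)), cumulative


# Hypothetical stand-in for the pixel matrix: 5 independent features
# with very different spreads, so the first two dominate the variance.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5)) * np.array([10.0, 5.0, 1.0, 0.5, 0.1])

k, cumulative = components_for_variance(X, threshold=0.85)
print("components kept:", k)
print("cumulative explained variance:", np.round(cumulative, 3))
```

With a fitted scikit-learn `PCA` object, the same information should be available from `pca.n_components_` and `pca.explained_variance_ratio_`.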

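K-Means clustering:

import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Load the PCA-reduced data
data = pd.read_csv('MNIST_train_pca.csv')

# Extract the feature columns (everything after the label)
pixels = data.iloc[:, 1:]

# Choose the hyperparameter k by maximizing the silhouette score.
# Note: silhouette_score computes pairwise distances, so it is slow on the full training set.
best_k = 0
best_score = -1
for k in range(2, 11):
    # n_init is set explicitly because its default differs across scikit-learn versions
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=42)
    labels = kmeans.fit_predict(pixels)
    score = silhouette_score(pixels, labels)
    if score > best_score:
        best_score = score
        best_k = k

# Cluster again with the best k
kmeans = KMeans(n_clusters=best_k, n_init=10, random_state=42)
labels = kmeans.fit_predict(pixels)

# Save the true labels alongside the cluster assignments
data_clustered = pd.DataFrame(data['label'])
data_clustered['cluster'] = labels
data_clustered.to_csv('MNIST_train_clustered.csv', index=False)

# Report the evaluation metric for the final clustering
score = silhouette_score(pixels, labels)
print('Best k:', best_k)
print('Silhouette score:', score)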
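
Because the silhouette score is computed from pairwise distances, the search over k above can be expensive on tens of thousands of rows. One possible way to keep it tractable is the `sample_size` argument of `silhouette_score`, which scores a random subset. The sketch below uses synthetic blobs as a stand-in for the PCA features; the centers, sample sizes, and range of k are assumptions chosen for illustration, not values from the original answer.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Hypothetical stand-in data: four well-separated 2-D clusters
centers = [[0, 0], [10, 10], [-10, 10], [10, -10]]
X, _ = make_blobs(n_samples=400, centers=centers, cluster_std=0.5, random_state=0)

# Score each candidate k on a random subset of 200 points
scores = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    scores[k] = silhouette_score(X, labels, sample_size=200, random_state=42)

# Keep the k with the highest (subsampled) silhouette score
best_k = max(scores, key=scores.get)
print("Best k:", best_k)
for k, s in scores.items():
    print(k, round(s, 3))
```

The subsampled score is only an estimate, so a near tie between two values of k may be worth rechecking with a larger sample.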

MNIST handwritten digit recognition dataset: dimensionality reduction and clustering lab report

