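Given a set of 5 samples, (0,2), (0,0), (1,0), (5,0), (-1,-1), use the k-means clustering algorithm to group the samples into two clusters, with Manhattan distance as the distance metric. Provide a Python code example.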
Below is example code that implements the K-means clustering algorithm in Python:
import numpy as np

# Compute the Manhattan (L1) distance between two points
def manhattan_distance(x1, x2):
    return np.sum(np.abs(x1 - x2))

# Initialize the cluster centers by picking k distinct samples at random
def initialize_centers(data, k):
    # replace=False avoids duplicate centers, which would leave a cluster empty
    indices = np.random.choice(len(data), size=k, replace=False)
    return [data[i].astype(float) for i in indices]

# Update each cluster center to the mean of the samples assigned to it
def update_centers(data, labels, k):
    centers = []
    for i in range(k):
        center = np.mean(data[labels == i], axis=0)
        centers.append(center)
    return centers

# K-means clustering
def kmeans(data, k):
    centers = initialize_centers(data, k)
    labels = np.zeros(len(data))
    while True:
        # Assign each sample to its nearest cluster center
        for i, x in enumerate(data):
            distances = [manhattan_distance(x, center) for center in centers]
            labels[i] = np.argmin(distances)
        # Update the cluster centers
        new_centers = update_centers(data, labels, k)
        # Stop once the centers no longer change
        if np.allclose(centers, new_centers):
            break
        centers = new_centers
    return labels, centers

# Test
data = np.array([(0, 2), (0, 0), (1, 0), (5, 0), (-1, -1)])
k = 2
labels, centers = kmeans(data, k)
print("Labels:", labels)
print("Centers:", centers)
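Output from one run (the cluster numbering, and even the grouping itself, depends on the random initialization):
Labels: [1. 1. 1. 0. 1.]
Centers: [array([5., 0.]), array([0.  , 0.25])]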
Here, Labels gives the cluster index assigned to each sample, and Centers gives the final cluster centers.
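Because the initial centers are drawn at random, a single run is hard to reproduce. As a cross-check, the following is a minimal sketch of the same assign-then-average loop started from fixed initial centers. The function name `kmeans_manhattan`, the `init_centers` and `max_iter` parameters, and the chosen starting points are assumptions made for illustration and are not part of the original answer.

```python
import numpy as np

def kmeans_manhattan(data, init_centers, max_iter=100):
    """Assign samples by Manhattan distance and update centers by the mean,
    starting from caller-supplied initial centers (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    centers = np.asarray(init_centers, dtype=float)
    for _ in range(max_iter):
        # Pairwise Manhattan distances, shape (n_samples, k)
        dists = np.abs(data[:, None, :] - centers[None, :, :]).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Mean of each cluster; keep the old center if a cluster is empty
        new_centers = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

data = [(0, 2), (0, 0), (1, 0), (5, 0), (-1, -1)]
labels, centers = kmeans_manhattan(data, init_centers=[(5, 0), (0, 0)])
print(labels)   # [1 1 1 0 1]
print(centers)  # rows are (5, 0) and (0, 0.25)
```

Starting instead from `init_centers=[(0, 2), (0, 0)]`, the same loop converges to a different grouping, with (0,2) alone in one cluster and the remaining four samples in the other (center (1.25, -0.25)). This shows that the result depends on the initialization.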