K-Means Clustering Example: Grouping 5 Samples into Two Clusters

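This article implements K-Means clustering in Python to group five two-dimensional sample points into two clusters. The example code measures distance with the Manhattan (L1) metric, still updates each centroid as the mean of its cluster, and prints the resulting clusters.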
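For readers unfamiliar with the metric, the Manhattan distance between two points is the sum of the absolute differences of their coordinates. The short sketch below illustrates it on two pairs of sample points; the helper name `manhattan` is chosen only for this illustration.

```python
import numpy as np

def manhattan(p, q):
    """Sum of absolute coordinate differences (L1 distance)."""
    return float(np.abs(np.asarray(p) - np.asarray(q)).sum())

print(manhattan((0, 2), (1, 0)))    # 3.0  (|0-1| + |2-0|)
print(manhattan((5, 0), (-1, -1)))  # 7.0  (|5+1| + |0+1|)
```

Unlike the Euclidean distance, this measure only moves along the coordinate axes, so it works for points of any dimension without a square root.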

Dataset:

(0,2), (0,0), (1,0), (5,0), (-1,-1)

Code example:

import numpy as np

# Manhattan distance between two 2-D points
def manhattan_distance(x, y):
    return abs(x[0]-y[0]) + abs(x[1]-y[1])

# Initialize the dataset
data = np.array([(0,2),(0,0),(1,0),(5,0),(-1,-1)])

# Use the first and last samples as the initial centroids
centroid1 = data[0]
centroid2 = data[4]

while True:
    # Reset the cluster membership for this pass
    cluster1 = []
    cluster2 = []
    
    # Assign each sample to the nearer centroid (ties go to cluster 2)
    for d in data:
        dist1 = manhattan_distance(d, centroid1)
        dist2 = manhattan_distance(d, centroid2)
        if dist1 < dist2:
            cluster1.append(d)
        else:
            cluster2.append(d)
    
    # Update each centroid to the mean of its cluster
    new_centroid1 = np.mean(cluster1, axis=0)
    new_centroid2 = np.mean(cluster2, axis=0)
    
    # Converged when neither centroid moved
    if np.array_equal(centroid1, new_centroid1) and np.array_equal(centroid2, new_centroid2):
        break
    
    centroid1 = new_centroid1
    centroid2 = new_centroid2

print('Cluster 1:', cluster1)
print('Cluster 2:', cluster2)

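Output:

Cluster 1: [array([0, 2])]
Cluster 2: [array([0, 0]), array([1, 0]), array([5, 0]), array([-1, -1])]

The output shows that the algorithm splits the samples into two clusters: the first contains only (0, 2), and the second contains the other four samples. With the initial centroids (0, 2) and (-1, -1), the points (0, 0), (1, 0) and (5, 0) are each equidistant from both centroids in the first pass (Manhattan distances 2, 3 and 7), and the else branch sends ties to the second cluster. The second centroid then moves to (1.25, -0.25), the second pass leaves the assignment unchanged, and the loop stops.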
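As a cross-check, the same procedure can be sketched in vectorized form. The helper name `kmeans_manhattan` and its parameters `init_idx` and `max_iter` are hypothetical, introduced only for this illustration. The sketch assumes the tie rule of the loop above (a tie goes to the later cluster) and, as an added assumption not present in the original code, keeps the previous centroid when a cluster ends up empty.

```python
import numpy as np

def kmeans_manhattan(data, init_idx, max_iter=100):
    """Hypothetical helper: K-Means-style loop with Manhattan distance and mean updates."""
    data = np.asarray(data, dtype=float)
    centroids = data[list(init_idx)].copy()
    k = len(centroids)
    for _ in range(max_iter):
        # Manhattan distance from every sample to every centroid, shape (n_samples, k).
        dists = np.abs(data[:, None, :] - centroids[None, :, :]).sum(axis=2)
        # Pick the LAST minimum so ties go to the later cluster,
        # mirroring the `else` branch of the original loop.
        labels = k - 1 - np.argmin(dists[:, ::-1], axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:  # assumption: an empty cluster keeps its old centroid
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

points = [(0, 2), (0, 0), (1, 0), (5, 0), (-1, -1)]
labels, centroids = kmeans_manhattan(points, init_idx=(0, 4))
print(labels.tolist())     # [0, 1, 1, 1, 1]
print(centroids.tolist())  # [[0.0, 2.0], [1.25, -0.25]]
```

Label 0 corresponds to cluster 1 and label 1 to cluster 2. Because three of the five samples tie in the first pass, the tie rule largely decides the final partition for this dataset; a different rule or different starting centroids may give a different split.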