Write a density-based clustering algorithm in Python
Below is example code for a density-based clustering algorithm written in Python:
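import numpy as np
from sklearn.neighbors import NearestNeighbors


def density_cluster(data, eps, min_samples):
    """
    Density-based clustering algorithm.

    :param data: dataset, one sample per row
    :param eps: radius that defines the neighborhood used to identify a core point
    :param min_samples: minimum number of samples in the neighborhood (the point
        itself included) for a point to count as a core point
    :return: clustering result, one label per sample (-1 marks noise)
    """
    n_samples = data.shape[0]
    visited = np.zeros(n_samples, dtype=bool)
    labels = np.full(n_samples, -1, dtype=int)  # every sample starts as noise
    cluster_id = 1

    # Indices of all samples within eps of each sample (each sample includes itself)
    nbrs = NearestNeighbors(radius=eps).fit(data)
    indices = nbrs.radius_neighbors(data, return_distance=False)

    for i in range(n_samples):
        if visited[i]:
            continue
        visited[i] = True
        if len(indices[i]) < min_samples:
            continue  # not a core point: stays noise unless a cluster reaches it later
        labels[i] = cluster_id
        # Expand the cluster through the neighborhoods of its core points
        neighbors = indices[i].tolist()
        j = 0
        while j < len(neighbors):
            neighbor = neighbors[j]
            if labels[neighbor] == -1:
                # Unlabeled samples and former noise points join the cluster
                labels[neighbor] = cluster_id
            if not visited[neighbor]:
                visited[neighbor] = True
                if len(indices[neighbor]) >= min_samples:
                    neighbors += indices[neighbor].tolist()
            j += 1
        cluster_id += 1
    return labels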
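To make the roles of eps and min_samples concrete, here is a minimal brute-force sketch of the core-point test using NumPy only. The helper names eps_neighborhoods and core_point_mask and the toy data are assumptions made for illustration, and the sketch counts each point as a member of its own neighborhood.

```python
import numpy as np


def eps_neighborhoods(points, eps):
    """Return, for each point, the indices of all points within distance eps (itself included)."""
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances via broadcasting, shape (n, n)
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    return [np.flatnonzero(row <= eps) for row in dists]


def core_point_mask(points, eps, min_samples):
    """Boolean mask that is True where the eps-neighborhood holds at least min_samples points."""
    sizes = np.array([len(nb) for nb in eps_neighborhoods(points, eps)])
    return sizes >= min_samples


# Hypothetical toy data: three points close together and one far away
toy = [[0.0, 0.0], [0.0, 0.1], [0.1, 0.0], [3.0, 3.0]]
print(core_point_mask(toy, eps=0.2, min_samples=3))
```

With these parameters the three nearby points each have three samples in their neighborhood and qualify as core points, while the isolated point does not. The brute-force distance matrix needs O(n^2) memory, so this is only meant for small examples.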
Usage example:
import numpy as np

# Generate sample data
np.random.seed(0)
samples = np.random.randn(100, 2)

# Run density clustering
labels = density_cluster(samples, eps=0.3, min_samples=5)

# Print the clustering result
for i, label in enumerate(labels):
    print(f"Sample {i+1} has cluster label: {label}")
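Printing one line per sample becomes hard to read for larger datasets. As a small sketch, the labels can be condensed into a summary instead. The helper summarize_labels and the example label list are hypothetical; the only assumption carried over from the text is that noise is marked with -1.

```python
from collections import Counter


def summarize_labels(labels, noise_label=-1):
    """Count samples per cluster and report noise separately."""
    counts = Counter(int(label) for label in labels)
    n_noise = counts.pop(noise_label, 0)  # noise is not a cluster
    return {
        "n_clusters": len(counts),
        "n_noise": n_noise,
        "cluster_sizes": dict(sorted(counts.items())),
    }


# Hypothetical label array, in the same format the clustering function returns
example_labels = [1, 1, 1, 2, 2, -1, 1, -1]
print(summarize_labels(example_labels))
```

The same helper accepts a NumPy array, so it can be called directly on the labels returned by the clustering function.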
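This is a simplified implementation of density-based clustering that follows the core idea of the approach. It uses NearestNeighbors from the sklearn library to compute the distances and neighborhood relationships between samples. The algorithm assigns a cluster label to every sample; a sample that is considered noise receives the label -1.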
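Because the implementation is simplified, it can be useful to cross-check its output against an established one. The sketch below runs scikit-learn's DBSCAN, a widely used density-based clustering algorithm, on a small toy dataset. The data and parameter values are assumptions chosen for illustration, and DBSCAN stands in here as a reference, not as the method the code above is claimed to reproduce exactly.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Toy data (an assumption for illustration): two tight groups and one isolated point
X = np.array([
    [0.0, 0.0], [0.0, 0.1], [0.1, 0.0], [0.1, 0.1],
    [5.0, 5.0], [5.0, 5.1], [5.1, 5.0], [5.1, 5.1],
    [10.0, 10.0],
])

# eps is the neighborhood radius; min_samples counts the point itself
reference_labels = DBSCAN(eps=0.3, min_samples=3).fit_predict(X)
print(reference_labels)
```

DBSCAN also marks noise with -1 but numbers clusters from 0, whereas the function above starts at 1, so a comparison should look at which samples are grouped together and not at the raw label values.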
Original source: http://www.cveoy.top/t/topic/hCMC. Copyright belongs to the author. Do not repost or scrape.