The following is sample code that implements the DBSCAN algorithm with numpy:

import numpy as np

def dbscan(X, eps, min_samples):
    """
    DBSCAN clustering algorithm implementation using numpy
    
    Parameters:
        X: numpy array, shape (n_samples, n_features)
            The input data to be clustered
        eps: float
            The maximum distance between two samples for them to be considered
            as in the same neighborhood
        min_samples: int
            The minimum number of samples in a neighborhood for a point to be
            considered as a core point
    
    Returns:
        labels: numpy array, shape (n_samples,)
            The cluster labels for each sample. Noise points are labeled as -1.
        centers: numpy array, shape (n_clusters, n_features)
            The cluster centers. Only available if at least one cluster is found.
    """
    n_samples, n_features = X.shape
    
    # Compute pairwise distances between all samples
    dists = np.sqrt(np.sum((X[:, np.newaxis, :] - X[np.newaxis, :, :]) ** 2, axis=-1))
    
    # Initialize cluster labels and core point flags
    labels = -np.ones(n_samples, dtype=np.int32)
    is_core = np.zeros(n_samples, dtype=bool)  # the np.bool alias was removed in NumPy 1.24
    
    # Find core points
    for i in range(n_samples):
        neighbors = np.where(dists[i] <= eps)[0]
        if len(neighbors) >= min_samples:
            is_core[i] = True
    
    # Grow a cluster outward from each core point that has no label yet
    curr_label = 0
    for i in range(n_samples):
        if is_core[i] and labels[i] == -1:
            labels[i] = curr_label
            stack = [i]
            while stack:
                curr = stack.pop()
                neighbors = np.where(dists[curr] <= eps)[0]
                for j in neighbors:
                    if labels[j] != -1:
                        continue  # already assigned to a cluster
                    # Core and border points both join the current cluster
                    labels[j] = curr_label
                    if is_core[j]:
                        # Only core points extend the cluster further
                        stack.append(j)
            curr_label += 1
    
    # Points still labeled -1 are within eps of no core point and remain noise
    
    # Compute cluster centers. Noise points keep the label -1 in `labels`
    # and are left out of the centers.
    n_clusters = max(int(labels.max()) + 1, 0)
    centers = np.zeros((n_clusters, n_features))
    for k in range(n_clusters):
        centers[k] = X[labels == k].mean(axis=0)
    
    return labels, centers

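The function takes the input data X, the neighborhood radius eps, and the minimum neighbor count min_samples, and returns the cluster label of each sample together with the cluster centers. The labels are a numpy array with one entry per sample, where noise points are labeled -1. The centers are a numpy array in which each row is the mean of the points in one cluster.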
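To make the roles of eps and min_samples concrete, here is a minimal sketch of the core-point test on its own, written without the Python loop. The helper name `find_core_points` and the four-point dataset are assumptions made for illustration; they are not part of the code above. As in the function above, a point counts as its own neighbor.

```python
import numpy as np

def find_core_points(X, eps, min_samples):
    """Return a boolean mask marking the core points of X (hypothetical helper)."""
    # Pairwise Euclidean distances via broadcasting, shape (n_samples, n_samples)
    diffs = X[:, np.newaxis, :] - X[np.newaxis, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Each point is counted as its own neighbor because dists[i, i] == 0
    neighbor_counts = (dists <= eps).sum(axis=1)
    return neighbor_counts >= min_samples

# Three points packed closely together plus one far-away point
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]])
core_mask = find_core_points(X, eps=0.2, min_samples=3)
print(core_mask)  # [ True  True  True False]
```

Raising min_samples to 4 in this toy dataset leaves no core points at all, so every point would end up as noise.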

The following is a usage example:

X = np.random.rand(100, 2)
labels, centers = dbscan(X, eps=0.3, min_samples=5)
print(labels)
print(centers)

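This example generates a random two-dimensional dataset, clusters it with the DBSCAN function, and prints the resulting labels and cluster centers.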
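The printed label array can be hard to read at a glance. One possible way to summarize it is sketched below, assuming the convention that noise is labeled -1. The helper `summarize_clusters` and the hand-written label array are hypothetical and serve only to illustrate the idea.

```python
import numpy as np

def summarize_clusters(labels):
    """Return ({cluster_id: size}, number_of_noise_points) for a label array."""
    labels = np.asarray(labels)
    # Noise points carry the label -1 by convention
    n_noise = int(np.sum(labels == -1))
    # Count the members of every real cluster (labels 0, 1, 2, ...)
    ids, counts = np.unique(labels[labels >= 0], return_counts=True)
    sizes = {int(k): int(c) for k, c in zip(ids, counts)}
    return sizes, n_noise

# Hand-written labels standing in for the output of a DBSCAN run
labels = np.array([0, 0, 1, -1, 1, 0, -1])
sizes, n_noise = summarize_clusters(labels)
print(sizes)    # {0: 3, 1: 2}
print(n_noise)  # 2
```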


Original source: http://www.cveoy.top/t/topic/hlAk. Copyright belongs to the author. Do not repost or scrape.
