Reverse-Nearest-Neighbor-Based Clustering by Fast Search and Find of Density Peaks
-
Abstract
Clustering by fast search and find of density peaks (CFSFDP) has the advantages of a novel idea, easy implementation, and efficient clustering. It has been widely recognized in various fields since it was proposed in Science in 2014. The CFSFDP algorithm also has certain limitations, such as non-unified sample density metrics defined by cutoff distance, the domino effect for the assignment of remaining samples triggered by unstable assignment strategy, and the phenomenon of picking wrong density peaks as cluster centers. We propose reverse-nearest-neighbor-based clustering by fast search and find of density peaks (RNN-CFSFDP) to avoid these shortcomings. We redesign and unify the sample density metric by introducing reverse nearest neighbor. The newly defined local density metric and the K-nearest neighbors of each sample are combined to make the assignment process more robust and alleviate the domino effect. A cluster fusion algorithm is proposed, which further alleviates the domino effect and effectively avoids the phenomenon of picking wrong density peaks as cluster centers. Experimental results on publicly available synthetic data sets and real-world data sets show that in most cases, the proposed algorithm is superior to or at least equivalent to the comparative methods in clustering performance. The proposed algorithm works better on manifold data sets and uneven density data sets.
-
-