WANG Shuliang, LI Qi, YUAN Hanning, GENG Jing, DAI Tianru, DENG Chenwei. Robust Clustering with Topological Graph Partition[J]. Chinese Journal of Electronics, 2019, 28(1): 76-84. doi: 10.1049/cje.2018.09.005
Citation: WANG Shuliang, LI Qi, YUAN Hanning, GENG Jing, DAI Tianru, DENG Chenwei. Robust Clustering with Topological Graph Partition[J]. Chinese Journal of Electronics, 2019, 28(1): 76-84. doi: 10.1049/cje.2018.09.005

Robust Clustering with Topological Graph Partition

doi: 10.1049/cje.2018.09.005
Funds:  This work is supported by National Key Research and Development Plan of China (No.2016YFC0803000, No.2016YFB0502604), National Natural Science Fund of China (No.61472039), and Frontier and interdisciplinary innovation program of Beijing Institute of Technology (No.2016CX11006)
More Information
  • Corresponding author: LI Qi (corresponding author), Ph.D. candidate in Beijing Institute of Technology. His major interest is data mining and mathematics. (Email:liqi_bitss@163.com)
  • Received Date: 2017-05-16
  • Rev Recd Date: 2018-01-22
  • Publish Date: 2019-01-10
  • Clustering is fundamental in many fields with big data. In this paper, a novel method based on Topological graph partition (TGP) is proposed to group objects. A topological graph is created for a data set with many objects, in which an object is connected to k nearest neighbors. By computing the weight of each object, a decision graph under probability comes into being. A cut threshold is conveniently selected where the probability of weight anomalously becomes large. With the threshold, the topological graph is cut apart into several sub-graphs after the noise edges are cut off, in which a connected subgraph is treated as a cluster. The compared experiments demonstrate that the proposed method is more robust to cluster the data sets with high dimensions, complex distribution, and hidden noises. It is not sensitive to input parameter, we need not more priori knowledge.
  • loading
  • G. Karypis, E.H. Han and V. Kumar, "Chameleon:A hierarchical clustering algorithm using dynamic modeling", Computer, Vol.32, No.8, pp.68-75, 1999.
    B.J. Frey and D. Dueck, "Clustering by passing messages between data points", Science, Vol.315, No.5814, pp.972-976, 2007.
    G. Dong and M. Xie, "Color clustering and learning for image segmentation based on neural networks", IEEE Transactions on Neural Networks, Vol.16, No.4, pp.925-936, 2005.
    Kanungo, Tapas, Mount, et al., "An efficient k-means clustering algorithm:Analysis and implementation", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.24, No.7, pp.881-892, 2002.
    V. Estivill-Castro and I. Lee, "Argument free clustering for large spatial point-data sets via boundary extraction from Delaunay graph", Computers Environment & Urban Systems, Vol.26, No.4, pp.315-334, 2002.
    D. Liu and O. Sourina, "Free-parameters clustering of spatial data with non-uniform density", Proc. of IEEE Conference on Cybernetics and Intelligent Systems, pp.387-392, 2008.
    J. MacQueen, "Some Methods for Classification and analysis of multivariate observations", Proc. of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, pp.281-297, 1966.
    L. Kaufman and P.J. Rousseeuw, "Finding groups in data:An introduction to cluster analysis", DBLP, 1990.
    F. Hoppner, F. Klawonn, R. Kruse, et al., "Fuzzy cluster analysis-methods for classification", Journal of the Operational Research Society, Vol.51, No.6, pp.769-770, 1998.
    A.K. Jain, "Data clustering:50 years beyond K-means", Pattern Recognition Letters, Vol.31, No.8, pp.651-666, 2008.
    Y. Avrithis, Y. Kalantidis, E. Anagnostopoulos, et al., "Web-scale image clustering revisited", IEEE Proc. of IEEE International Conference on Computer Vision, pp.1502-1510, 2015.
    Y. Gong, M. Pawlowski, F. Yang, et al., "Web scale photo hash clustering on a single machine", Proc. of IEEE Computer Vision and Pattern Recognition, pp.19-27, 2015.
    Y. Matsui, K. Ogaki, T. Yamasaki, et al., "PQk-means:Billion-scale Clustering for Product-quantized Codes", 2017.
    M. Ester, H. Kriegel, J. Sander, et al., "A density-based algorithm for discovering clusters in large spatial databases with noise", Proc. of the 2nd International Conference on Knowledge Discovery and Data Mining, pp.226-231, 1996.
    A. Rodriguez and A. Laio, "Clustering by fast search and find of density peaks", Science, Vol.344, No.6191, pp.1492-1496, 2014.
    M. Ankerst, M. Breuing, H.P. Kriegel, et al., "OPTICS:Ordering points to identify the clustering structure", Proc. of the ACM SIGMOD International Conference on Management of Data (SIG-MOD), pp.49-60, 1999.
    A. Hinneburg and D.A. Keim, "An efficient approach to clustering in large multimedia databases with nose", Proc. of the 4th International Conference on Knowledge Discovery and Data Mining (KDD), pp.58-65, 1998.
    T. Zhang, R. Ramakrishman and M. Livny, "BIRCH:An efficient data clustering method for very large databases", Proc. of the ACM SIGMOD International Conference on Management of Data (SIGMOD), pp.103-114, 1996.
    S. Guha, R. Rastogi and K. Shim, "ROCK:A robust clustering algorithm for categorical attributes", Proc. of the 15th International Conference on Data Engineering (ICDE), pp.512-521, 1999.
    A.P. Dempster, "Maximum likelihood estimation from incomplete data via the EM algorithm with discussion", Journal of the Royal Statistical Society, Vol.39, No.1, pp.1-38, 1977.
    W. Chen, Y. Song, H. Bai, et al., "Parallel spectral clustering in distributed systems", IEEE Trans Pattern Anal Mach Intell, Vol.33, No.3, pp.568-586, 2011.
    S. Wang, D. Wang, C.Y. Li, et al., "Clustering by fast search and find of density peaks with data field", Chinese Journal of Electronics, Vol.25, No.3, pp.397-402, 2016.
    G. Chen, "Deep learning with nonparametric clustering", arXiv:Learning, 2015.
    P. Ji, T. Zhang, H. Li, et al., "Deep subspace clustering networks", 2017.
    C.J. Veenman, M.J.T. Reinders and E. Backer, "A maximum variance cluster algorithm", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.24, No.9, pp.1273-1280, 2002.
    P. Fränti and O. Virmajoki, "Iterative shrinking method for clustering problems", Pattern Recognition, Vol.39, No.5, pp.761-765, 2006.
    H. Chang and D.Y. Yeung, "Robust path-based spectral clustering", Pattern Recognition, Vol.41, No.1, pp.191-203, 2008.
    F.S. Samaria and A.C. Harter, "Parameterisation of a stochastic model for human face identification", Proc. of 1994 IEEE Workshop on Applications of Computer Vision (IEEE, New York, 1994), pp.138-142, 1994.
    M.P. Sampat, Z. Wang, S. Gupta, et al., "Complex wavelet structural similarity:A new image similarity index", IEEE Transactions on Image Processing A Publication of the IEEE Signal Processing Society, Vol.18, No.11, pp.2385-2401, 2009.
  • 加载中

Catalog

    通讯作者: 陈斌, bchen63@163.com
    • 1. 

      沈阳化工大学材料科学与工程学院 沈阳 110142

    1. 本站搜索
    2. 百度学术搜索
    3. 万方数据库搜索
    4. CNKI搜索

    Article Metrics

    Article views (211) PDF downloads(194) Cited by()
    Proportional views
    Related

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return