YANG Zhen, WANG Laitao, FAN Kefeng, LAI Yingxu. Exemplar-Based Clustering Analysis Optimized by Genetic Algorithm[J]. Chinese Journal of Electronics, 2013, 22(4): 735-740.
Citation: YANG Zhen, WANG Laitao, FAN Kefeng, LAI Yingxu. Exemplar-Based Clustering Analysis Optimized by Genetic Algorithm[J]. Chinese Journal of Electronics, 2013, 22(4): 735-740.

Exemplar-Based Clustering Analysis Optimized by Genetic Algorithm

Funds:  This work is supported by the National Natural Science Foundation of China (No.61001178, No.61175004, No.61002029, No.61202266, No.6117205), National Soft Science Research Program (No.2010GXQ5D317), Beijing Natural Science Foundation (No.4102012, No.4112009), Scientific Research Common Program of Beijing Municipal Commission of Education (No.KM201210005024), Funding Project for Academic Human Resources Development in Institutions of Higher Learning under the Jurisdiction of Beijing Municipality (No.PHR201108016), the National High Technology Research and Development Program of China (863 Program) (No.2012AA011706), Special Project of Public Industry R&D of aqsiq, China (No.201110233, No.201310302).
More Information
  • Corresponding author: YANG Zhen, WANG Laitao, FAN Kefeng, LAI Yingxu
  • Received Date: 2012-07-01
  • Rev Recd Date: 2012-10-01
  • Publish Date: 2013-09-25
  • Exemplar-based clustering algorithm is very efficient to handle large scale and high dimensional data, while it does not require the user to specify many parameters. For current algorithms, however, are the inabilities to identify the optimal results or specify the number of clusters automatically. To remedy these, in this work, we propose and explore the idea of exemplar-based clustering analysis optimized by genetic algorithms, abbreviated as ECGA framework, which use genetic algorithms for optimizing and combining the results. First, an exemplarbased clustering framework based on canonical genetic algorithm is introduced. Then the framework is optimized with three new genetic operators: (1) Geometry operator which limits the typology distribution of exemplars based on pair-wise distances, (2) EM operator which apply EM (Expectation maximization) algorithmto generate children from previous population and (3) Vertex substitution operator which is initialized with genetic algorithm and select exemplars by using the variable neighborhood search meta-heuristic framework. Theoretical analysis proves the ECGA can achieve better chance t ofind the optimal clustering results. Experimental results on several synthetic and real data sets show our ECGA provide comparable or better results at the cost of slightly longer CPU time.
  • loading
  • A.K. Jain, et al., “Data clustering: A review”, ACM ComputingS urveys, Vol.31, No.3, pp.264-323, 1999.
    J. Lei, H. Zhang, C. Hou, L. Lin, “Segmentation-based adaptivev ergence control for parallel multiview stereoscopic images”,O ptik International Journal for Light and Electron Optics,http://dx.doi.org/10.1016/j.ijleo.2012.06.021, 2012.
    S. Theodoridis, K. Koutroumbas, Pattern Recognition, 4rd Edition,A cademic Press, Burlington, USA, 2009.
    C.R. Lin, K.H. Liu, M. and S. Chen, “Dual clustering: integratingd ata clustering over optimization and constraint domains”,I EEE Transactions on Knowledge and Data Engineering,Vol.17, No.5, pp.628-637, 2005.
    Z. Yang, K. Fan, J. Lei, J. Guo, “Text manifold based on semantica nalysis”, Acta Electronic Sinica, Vol.37, No.3, pp.557-561,2 009. (in Chinese)
    A.L.N. Fred, A.K. Jain, “Combining multiple clusterings usinge vidence accumulation”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.27, No.6, pp.835-850, 2005.
    D. Dueck, “Affinity propagation: Clustering data by passingm essages”, Ph.D. Thesis, University of Toronto, Canada, 2009.
    P. Hansen, N. Mladenovic, “Variable neighborhood search for the p-median”, Location Science, Vol.5, No.4, pp.207-226, 1997.
    B.J. Frey, D. Dueck, “Clustering by passing messages betweend ata points”, Science, Vol.315, No.5814, pp.972-976, 2007.
    J. Holland, Adaptation in Natural and Artificial Systems: AnI ntroductory Analysis with Applications to Biology, Control, and Artificial Intelligence, MIT press, Cambridge, USA, 1992.
    G. Mclachlan and T. Krishnan, The EM Algorithm and Extensions,J ohn Wiley & Sons, New York, USA, 1996.
    A.E. Eiben, E.H.L. Aarts, K.M. Van Hee, “Global convergence of genetic algorithms: A Markov chain analysis”, Proc. of 1stW orkshop on Parallel Problem Solving from Nature, Dortmund,B erlin, Germany, pp.4-12, 1991.
    G. Rudolph, “Convergence analysis of canonical genetic algorithms”,I EEE Transaction on Neural Networks, Vol.5, No.1,p p.96-101, 1994.
    K. Krishna, M.N. Murty, “Genetic k-means algorithm”, IEEET ransaction on Systems, Man, and Cybernetics-Part B: Cybernetics,Vol.29, No.3, pp.433-439, 1999.
    H.Mühlenbeina, M. Schomischa, J. Bornb, “The parallel genetica lgorithm as function optimizer”, Parallel Computing, Vol.17,N o.6-7, pp.619-632, 1991.
    V. Perlibaksa, “Distance measures for PCA-based face recognition”, Pattern Recognition Letters, Vol.25, No.6, pp.711-724,2 004.
  • 加载中

Catalog

    通讯作者: 陈斌, bchen63@163.com
    • 1. 

      沈阳化工大学材料科学与工程学院 沈阳 110142

    1. 本站搜索
    2. 百度学术搜索
    3. 万方数据库搜索
    4. CNKI搜索

    Article Metrics

    Article views (270) PDF downloads(1231) Cited by()
    Proportional views
    Related

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return