Exemplar-Based Clustering Analysis Optimized by Genetic Algorithm
Abstract
Exemplar-based clustering algorithms handle large-scale, high-dimensional data efficiently and require the user to specify few parameters. Current algorithms, however, cannot identify the optimal result or determine the number of clusters automatically. To remedy these shortcomings, we propose and explore exemplar-based clustering analysis optimized by genetic algorithms, abbreviated as the ECGA framework, which uses genetic algorithms to optimize and combine clustering results. First, an exemplar-based clustering framework based on the canonical genetic algorithm is introduced. The framework is then optimized with three new genetic operators: (1) a geometry operator, which constrains the topological distribution of exemplars based on pairwise distances; (2) an EM operator, which applies the EM (expectation-maximization) algorithm to generate children from the previous population; and (3) a vertex substitution operator, which is initialized with the genetic algorithm and selects exemplars using the variable neighborhood search metaheuristic framework. Theoretical analysis proves that ECGA has a better chance of finding the optimal clustering result. Experimental results on several synthetic and real data sets show that ECGA provides comparable or better results at the cost of slightly longer CPU time.
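To make the first step concrete, the following is a minimal sketch of exemplar selection driven by a simple genetic algorithm. It is an illustration under stated assumptions, not the ECGA framework itself: the number of exemplars `k` is fixed in advance, the fitness is the sum of squared Euclidean distances to the nearest exemplar, and the tournament selection, union-based crossover, swap mutation, and elitism are all choices made for this example. None of the three operators named in the abstract is implemented, and every function and parameter name is hypothetical.

```python
import random


def clustering_cost(points, exemplars):
    """Sum of squared distances from each point to its nearest exemplar."""
    return sum(
        min(sum((a - b) ** 2 for a, b in zip(p, points[e])) for e in exemplars)
        for p in points
    )


def crossover(parent_a, parent_b, k, rng):
    """Child draws k exemplar indices from the union of both parents."""
    pool = sorted(set(parent_a) | set(parent_b))
    return tuple(sorted(rng.sample(pool, k)))


def mutate(individual, n_points, rate, rng):
    """With probability `rate`, swap one exemplar for a non-exemplar point."""
    if rng.random() >= rate:
        return individual
    chosen = set(individual)
    outside = [i for i in range(n_points) if i not in chosen]
    if not outside:
        return individual
    chosen.remove(rng.choice(sorted(chosen)))
    chosen.add(rng.choice(outside))
    return tuple(sorted(chosen))


def ga_exemplars(points, k, pop_size=30, generations=60,
                 mutation_rate=0.3, seed=0):
    """Search for k exemplar indices with a simple GA (assumed design)."""
    rng = random.Random(seed)
    n = len(points)

    def cost(ind):
        return clustering_cost(points, ind)

    def tournament(population):
        # Binary tournament: the cheaper of two random individuals wins.
        a, b = rng.sample(population, 2)
        return a if cost(a) <= cost(b) else b

    # Each individual is a sorted tuple of k distinct point indices.
    population = [tuple(sorted(rng.sample(range(n), k)))
                  for _ in range(pop_size)]
    best = min(population, key=cost)
    for _ in range(generations):
        children = [best]  # elitism: the best cost never gets worse
        while len(children) < pop_size:
            child = crossover(tournament(population),
                              tournament(population), k, rng)
            children.append(mutate(child, n, mutation_rate, rng))
        population = children
        best = min(population, key=cost)
    return best, cost(best)


if __name__ == "__main__":
    data_rng = random.Random(1)
    centers = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
    data = [(cx + data_rng.gauss(0, 1), cy + data_rng.gauss(0, 1))
            for cx, cy in centers for _ in range(20)]
    exemplars, total_cost = ga_exemplars(data, k=3)
    print("exemplar indices:", exemplars, "cost:", total_cost)
```

Representing an individual as a set of point indices keeps every candidate solution a valid exemplar set, so crossover and mutation never need a repair step. The fitness is recomputed on every comparison for brevity; caching it per individual would be the first optimization for larger data sets.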