YANG Zhen, YAO Fei, FAN Kefeng, HUANG Jian. Text Dimensionality Reduction with Mutual Information Preserving Mapping[J]. Chinese Journal of Electronics, 2017, 26(5): 919-925. doi: 10.1049/cje.2017.08.020

Text Dimensionality Reduction with Mutual Information Preserving Mapping

doi: 10.1049/cje.2017.08.020
Funds:  This work is supported by the National Natural Science Foundation of China (No.61671030), the Excellent Talents Foundation of Beijing, the Importation and Development of High-Caliber Talents Project of Beijing Municipal Institutions (No.CIT&TCD201404052), and the Guangxi Colleges and Universities Key Laboratory of Cloud Computing and Complex Systems (No.15205).
  • Corresponding author: FAN Kefeng was born in 1978. He received the Ph.D. degree in test signal processing from Xidian University, Shaanxi, China, in 2007. From 2008 to 2010, he was a postdoctoral researcher at the State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications. He is now deputy director of the research center of CESI. His current research interests include signal processing and cyberspace security. (Email: fankf@cesi.ac.cn)
  • Received Date: 2015-07-24
  • Rev Recd Date: 2016-04-08
  • Publish Date: 2017-09-10
  • Abstract: With the explosion of information, it is becoming increasingly difficult to find the information one actually needs. Dimensionality reduction is the first step in processing large data efficiently. Although dimensionality can be reduced in many ways, little work has been done on reducing dimensionality without changing the inner semantic relationships among high-dimensional data. To remedy this problem, we introduce a manifold-learning-based method, named Mutual information preserving mapping (MIPM), to explore low-dimensional embeddings of high-dimensional inputs that preserve neighborhoods and mutual information. Experimental results show that the proposed method is effective for the text dimensionality reduction task. MIPM was also used to develop a temporal summarization system for efficiently monitoring the information associated with an event over time. Compared with established baselines, the experimental results show that our method is effective for temporal summarization.