WANG Junhua, ZUO Wanli, PENG Tao. Hyponymy Graph Model for Word Semantic Similarity Measurement[J]. Chinese Journal of Electronics, 2015, 24(1): 96-101.
Hyponymy Graph Model for Word Semantic Similarity Measurement

Funds:  This work is supported by the National Natural Science Foundation of China (No.60973040, No.61300148, No.60903098), and the Key Scientific and Technological Break-through Program of Jilin Province (No.20130206051GX).
  • Corresponding author: ZUO Wanli was born in 1957 and received the B.E., M.S. and Ph.D. degrees from Jilin University in 1982, 1985, and 2005, respectively. He is currently a professor and doctoral supervisor at the College of Computer Science and Technology, Jilin University. He is also an ACM professional member, a CCF distinguished member, and a member of the System Software Disciplinary Committee of CCF. His research interests include databases, Web mining, information retrieval, machine learning, and natural language processing. (Email:
  • Received Date: 2013-05-01
  • Rev Recd Date: 2013-09-01
  • Publish Date: 2015-01-10
  • Measuring word semantic similarity is a generic problem with a broad range of applications, such as ontology mapping, computational linguistics and artificial intelligence. Previous approaches to computing word semantic similarity did not consider concept occurrence frequency or the number of senses a word has. This paper introduces the hyponymy graph and, based on it, proposes a novel word semantic similarity model. For two words to be compared, we first retrieve their related concepts, then produce the lowest common ancestor matrix and the distance matrix between those concepts, and finally calculate a distance-based similarity and an information-based similarity, which are integrated to obtain the final semantic similarity. The main contribution of our method is that both concept occurrence frequency and the word's sense number are taken into account. The resulting similarity measure fits human ratings more closely and effectively simulates the human thinking process. Experimental results on the benchmark datasets M&C and R&G, with WordNet 2.1 as the platform, show improvements of roughly 0.9%-1.2% over the best existing approaches.
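
The pipeline outlined in the abstract (retrieve concepts, build the lowest common ancestor and distance matrices, then combine a distance-based and an information-based score) can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: the toy taxonomy `PARENT`, the sense table `SENSES`, the counts `FREQ`, the `1 / (1 + d)` distance score, the Lin-style information score, the weight `alpha`, and taking the maximum over sense pairs. It is not the authors' model and does not use WordNet.

```python
import math
from itertools import product

# Hypothetical toy hyponymy (is-a) graph: concept -> parent. Not WordNet data.
PARENT = {
    "entity": None,
    "vehicle": "entity", "car": "vehicle", "bicycle": "vehicle",
    "animal": "entity", "bird": "animal", "crane_bird": "bird",
    "machine": "entity", "crane_machine": "machine",
}
# Hypothetical word -> concepts (senses); "crane" has two senses.
SENSES = {"car": ["car"], "bicycle": ["bicycle"],
          "crane": ["crane_bird", "crane_machine"]}
# Hypothetical occurrence counts per concept (every concept occurs at least once).
FREQ = {"entity": 1, "vehicle": 5, "car": 20, "bicycle": 8, "animal": 6,
        "bird": 7, "crane_bird": 2, "machine": 4, "crane_machine": 3}

def ancestors(c):
    """Return the chain [c, parent(c), ..., root]."""
    path = []
    while c is not None:
        path.append(c)
        c = PARENT[c]
    return path

def depth(c):
    return len(ancestors(c)) - 1

def lca(a, b):
    """Lowest common ancestor of two concepts in the tree."""
    seen = set(ancestors(b))
    return next(x for x in ancestors(a) if x in seen)

def distance(a, b):
    """Number of edges on the path between a and b through their LCA."""
    return depth(a) + depth(b) - 2 * depth(lca(a, b))

# Cumulative frequency: a concept's own count plus the counts of all its hyponyms.
CUM = {c: 0 for c in PARENT}
for concept, n in FREQ.items():
    for x in ancestors(concept):
        CUM[x] += n
TOTAL = CUM["entity"]

def ic(c):
    """Information content, log(1 / p(c)); zero for the root."""
    return math.log(TOTAL / CUM[c])

def concept_matrices(w1, w2):
    """LCA matrix and distance matrix over all sense pairs of the two words."""
    s1, s2 = SENSES[w1], SENSES[w2]
    lca_m = [[lca(a, b) for b in s2] for a in s1]
    dist_m = [[distance(a, b) for b in s2] for a in s1]
    return lca_m, dist_m

def word_similarity(w1, w2, alpha=0.5):
    """Assumed combination: alpha * distance score + (1 - alpha) * Lin-style score."""
    best = 0.0
    for a, b in product(SENSES[w1], SENSES[w2]):
        dist_sim = 1.0 / (1.0 + distance(a, b))
        denom = ic(a) + ic(b)
        info_sim = 2.0 * ic(lca(a, b)) / denom if denom > 0 else 1.0
        best = max(best, alpha * dist_sim + (1 - alpha) * info_sim)
    return best
```

With this toy data, `word_similarity("car", "bicycle")` scores higher than `word_similarity("car", "crane")`, because the first pair shares the specific ancestor `vehicle` while the second only meets at the root. How the paper actually weights sense number and concept frequency would have to be taken from the full text.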
