ZHOU Junsheng, QU Weiguang, ZHANG Fen. Chinese Named Entity Recognition via Joint Identification and Categorization[J]. Chinese Journal of Electronics, 2013, 22(2): 225-230.

Chinese Named Entity Recognition via Joint Identification and Categorization

Funds: This work is supported by the National Natural Science Foundation of China (No. 61073119) and the Jiangsu Natural Science Foundation of China (No. BK2010547).
  • Received Date: 2012-04-01
  • Revised Date: 2012-07-01
  • Publish Date: 2013-04-25
  • Chinese named entity recognition (NER) is an important task for Chinese information processing. Traditional sequence labeling approaches to Chinese NER cannot treat a string of continuous characters globally as a named entity candidate, so entity-level features cannot be exploited in a natural way. To deal with this problem, we formulate Chinese NER as a joint identification and categorization task that performs two subtasks simultaneously, boundary identification and entity categorization, together with segmentation. The proposed approach provides a natural formulation that treats pieces of continuous characters as named entity candidates, which allows for more accurate prediction by examining both the internal evidence and the contextual information of the candidates. Within this framework, we explored a variety of effective feature representations for Chinese NER. Closed tests on two quite different corpora from the third SIGHAN bakeoff show that our approach significantly outperforms the best results reported in the literature, achieving state-of-the-art performance.
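The joint formulation described in the abstract, in which boundaries and entity types are chosen together over spans of characters, can be illustrated with a small segment-level dynamic program. This is only a sketch under stated assumptions, not the authors' model: the abstract does not specify the decoder, the features, or the training procedure. The label set `LABELS`, the span length limit `max_len`, the function names, and the hand-set `toy_score` with its two-entry gazetteer are all hypothetical stand-ins for a learned scoring function.

```python
from typing import Callable, List, Tuple

Segment = Tuple[int, int, str]  # (start, end_exclusive, label)

# Hypothetical label set; "O" marks a non-entity segment.
LABELS = ["O", "PER", "LOC", "ORG"]


def joint_decode(
    chars: str,
    score: Callable[[str, int, int, str], float],
    max_len: int = 4,
) -> List[Segment]:
    """Pick the best-scoring segmentation and labeling in one pass.

    best[end] holds the best total score of any labeled segmentation
    of chars[:end]; back[end] remembers the last segment used.
    """
    n = len(chars)
    best = [float("-inf")] * (n + 1)
    back: List[Tuple[int, str]] = [(0, "O")] * (n + 1)
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            for label in LABELS:
                # The whole span chars[start:end] is scored as one
                # candidate, so span-level evidence is available.
                total = best[start] + score(chars, start, end, label)
                if total > best[end]:
                    best[end] = total
                    back[end] = (start, label)
    # Follow back-pointers from the end of the sentence.
    segments: List[Segment] = []
    end = n
    while end > 0:
        start, label = back[end]
        segments.append((start, end, label))
        end = start
    segments.reverse()
    return segments


def toy_score(chars: str, start: int, end: int, label: str) -> float:
    """Hand-set scores (an assumption) standing in for a learned model."""
    span = chars[start:end]
    gazetteer = {"北京": "LOC", "张三": "PER"}  # hypothetical entries
    if gazetteer.get(span) == label:
        return 2.0 * len(span)  # reward a span matching a known entity
    if label == "O" and len(span) == 1:
        return 0.5  # mild preference for single non-entity characters
    return -1.0


result = joint_decode("张三去北京", toy_score)
```

Because every candidate span is scored as a unit, a scoring function in this setting can look at the full candidate string and its neighbours at once, which a per-character tagger cannot do directly. The cost is that decoding considers every span up to `max_len` characters rather than one decision per character.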
