YANG Li, ZHOU Yanhong, ZHENG Ying. Annotating the Literature with Disease Ontology[J]. Chinese Journal of Electronics, 2017, 26(6): 1261-1268. doi: 10.1049/cje.2017.09.020

Annotating the Literature with Disease Ontology

doi: 10.1049/cje.2017.09.020
Funds:  This work is supported by the National Natural Science Foundation of China (No.61602060).
  • Received Date: 2014-12-01
  • Rev Recd Date: 2015-09-26
  • Publish Date: 2017-11-10
  • With the rapid growth of biomedical research on diseases, recognizing disease mentions in the literature has become especially important. Recognition alone is not enough, however: annotating and normalizing the recognized concepts against a standardized Metathesaurus matters even more. This paper proposes a system that annotates the literature with such a normalized vocabulary. First, a two-phase Conditional Random Fields (CRFs) model recognizes disease mentions, covering both their location and their identification. Then, the Disease Ontology (DO) is used to normalize the recognized mentions by computing the similarity between each mention and the DO concepts. Based on these similarities, each mention is labeled either as a disease concept or as an instance of one. Experiments on the Arizona Disease Corpus show that the system performs well and outperforms previous approaches.
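
The normalization step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure the paper uses, so token-level Jaccard similarity is swapped in here purely as an assumption. The `Concept` class, the `annotate` function, the 0.5 threshold, the rule that an exact match yields a concept while a partial match yields an instance, and the `EX:` identifiers are all hypothetical and are not taken from the paper or from the Disease Ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    concept_id: str  # hypothetical placeholder ID, not a real DOID
    name: str


def tokens(text):
    """Lowercase a string and split it into a set of word tokens."""
    return set(text.lower().replace("-", " ").split())


def jaccard(a, b):
    """Token-level Jaccard similarity (an assumed stand-in measure)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def annotate(mention, concepts, threshold=0.5):
    """Map a disease mention to its most similar concept.

    Returns (label, concept_id, score), where label is:
      "concept"  - the mention matches a concept name exactly (score 1.0)
      "instance" - the mention is similar enough to be treated as an
                   instance of the concept (assumed threshold rule)
      "unmapped" - no concept is similar enough
    """
    if not concepts:
        return ("unmapped", None, 0.0)
    best = max(concepts, key=lambda c: jaccard(mention, c.name))
    score = jaccard(mention, best.name)
    if score == 1.0:
        return ("concept", best.concept_id, score)
    if score >= threshold:
        return ("instance", best.concept_id, score)
    return ("unmapped", None, score)


# Toy vocabulary with placeholder IDs, for illustration only.
VOCAB = [
    Concept("EX:0001", "breast cancer"),
    Concept("EX:0002", "diabetes mellitus"),
]

if __name__ == "__main__":
    for m in ["Breast cancer", "hereditary breast cancer", "fever"]:
        print(m, "->", annotate(m, VOCAB))
```

In this sketch, "hereditary breast cancer" shares two of its three tokens with "breast cancer", so it would be treated as an instance of that concept rather than as the concept itself. A real system would likely also need synonym handling and a more robust similarity measure.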