ZHANG Yajun, LIU Zongtian, ZHOU Wen. Biomedical Named Entity Recognition Based on Self-supervised Deep Belief Network[J]. Chinese Journal of Electronics, 2020, 29(3): 455-462. doi: 10.1049/cje.2020.03.001

Biomedical Named Entity Recognition Based on Self-supervised Deep Belief Network

doi: 10.1049/cje.2020.03.001
Funds: This work is supported by the National Natural Science Foundation of China (No. 61305053, No. 61273328, No. 71203135).
  • Received Date: 2018-10-08
  • Revision Received Date: 2019-01-02
  • Publish Date: 2020-05-10
  • Named entity recognition is a fundamental task in biomedical data mining. To address it, we propose a novel approach based on a Deep belief network (DBN). We select nine entity features and construct feature vector mapping tables according to the recognition contribution degree of each feature's different values. Using the mapping tables, we transform words in biomedical texts into feature vectors. The DBN then identifies entities by reducing the dimensionality of the vector data. Extensive experiments show that the approach achieves a maximum F-measure of 69.96% on the GENIA 3.02 testing corpus. We further propose a self-supervised DBN, which decides whether to add supervised fine-tuning according to the recognition performance of each layer and can overcome the error propagation problem while limiting model complexity. Test analysis shows that the new DBN improves recognition performance, raising the F-measure to 72.12%.
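The layer-wise decision described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the exact criterion, so the rule used here (fine-tune a layer when its score drops below the best score seen so far), the `tolerance` parameter, and the function names `f_measure` and `plan_fine_tuning` are all hypothetical. The per-layer scores are supplied by the caller rather than computed by a real DBN.

```python
from typing import List, Sequence


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


def plan_fine_tuning(layer_scores: Sequence[float],
                     tolerance: float = 0.0) -> List[bool]:
    """Decide, layer by layer, whether supervised fine-tuning would be added.

    Assumed rule (not taken from the paper): a layer is flagged for
    fine-tuning only when its recognition score falls below the best
    score seen in earlier layers by more than `tolerance`.
    """
    decisions: List[bool] = []
    best = float("-inf")
    for score in layer_scores:
        decisions.append(score < best - tolerance)
        best = max(best, score)
    return decisions


# Hypothetical per-layer F-measures, for illustration only.
scores = [f_measure(0.60, 0.60), f_measure(0.55, 0.61), f_measure(0.66, 0.64)]
print(plan_fine_tuning(scores))
```

Flagging only the layers whose score degrades is one way to keep the number of supervised passes small, which is in the spirit of the abstract's remark about limiting model complexity.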