A Cross-Domain Ontology Semantic Representation Based on NCBI-BlueBERT Embedding

ZHAO Lingling; WANG Junjie; WANG Chunyu; GUO Maozu

doi:10.1049/cje.2020.00.326

ZHAO Lingling, WANG Junjie, WANG Chunyu, GUO Maozu. A Cross-Domain Ontology Semantic Representation Based on NCBI-BlueBERT Embedding[J]. Chinese Journal of Electronics, 2022, 31(5): 860-869. DOI: 10.1049/cje.2020.00.326

Abstract

A common but critical task in the analysis of biological ontology data is comparing ontologies. Numerous ontology-based semantic-similarity measures have been proposed for specific ontology domains, but comparing ontologies across domains remains a challenge. An ontology contains scientific natural-language descriptions of the corresponding biological aspect. We therefore develop a new method, based on the natural language processing (NLP) representation model bidirectional encoder representations from transformers (BERT), for cross-domain semantic representation of biological ontologies. The method uses the BERT model to represent ontologies at the word level as a set of vectors, which facilitates semantic analysis and comparison of the biomedical entities named in an ontology or associated with ontology terms. We evaluated the method in two experiments: calculating pairwise similarities between disease ontology and human phenotype ontology terms, and predicting pairwise protein interactions. The experimental results demonstrated comparable performance, which is promising for the development of NLP methods in biological data analysis.
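As a rough illustration of how word-level vectors can support term comparison, the sketch below averages the word vectors of each ontology term and compares the resulting term vectors by cosine similarity. Both the mean pooling and the cosine measure are assumptions made for this example, since the abstract does not state how the vectors are aggregated or compared. The three-dimensional vectors are made-up values, not real BERT output, and the names `mean_pool` and `cosine_similarity` are hypothetical helpers.

```python
import math


def mean_pool(word_vectors):
    """Average a list of word-level vectors into one term-level vector."""
    n = len(word_vectors)
    dim = len(word_vectors[0])
    return [sum(vec[i] for vec in word_vectors) / n for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy word vectors (made-up values standing in for BERT embeddings).
disease_term_words = [[0.9, 0.1, 0.0], [0.7, 0.3, 0.0]]
phenotype_term_words = [[0.8, 0.2, 0.0], [0.6, 0.2, 0.2]]
unrelated_term_words = [[0.0, 0.1, 0.9]]

# One vector per ontology term, regardless of which ontology it comes from.
disease_vec = mean_pool(disease_term_words)
phenotype_vec = mean_pool(phenotype_term_words)
unrelated_vec = mean_pool(unrelated_term_words)

sim_related = cosine_similarity(disease_vec, phenotype_vec)
sim_unrelated = cosine_similarity(disease_vec, unrelated_vec)
print(f"related pair:   {sim_related:.3f}")
print(f"unrelated pair: {sim_unrelated:.3f}")
```

Because every term ends up in the same vector space, a disease ontology term and a human phenotype ontology term can be compared directly, which is the cross-domain property the abstract describes. In practice the toy vectors would be replaced by embeddings from a pretrained model.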
