Citation: | ZHAO Lingling, WANG Junjie, WANG Chunyu, et al., “A Cross-Domain Ontology Semantic Representation Based on NCBI-BlueBERT Embedding,” Chinese Journal of Electronics, vol. 31, no. 5, pp. 860-869, 2022, doi: 10.1049/cje.2020.00.326 |
[1] |
T. R. Gruber, “A translation approach to portable ontology specifications,” Knowledge Acquisition, vol.5, no.2, pp.199–220, 1993. doi: 10.1006/knac.1993.1008
|
[2] |
M. A. Rodríguez and M. J. Egenhofer, “Determining semantic similarity among entity classes from different ontologies,” IEEE Transactions on Knowledge and Data Engineering, vol.15, no.2, pp.442–456, 2003. doi: 10.1109/TKDE.2003.1185844
|
[3] |
B. Smith, M. Ashburner, C. Rosse, et al., “The OBO Foundry: Coordinated evolution of ontologies to support biomedical data integration,” Nature Biotechnology, vol.25, no.11, pp.1251–1255, 2007. doi: 10.1038/nbt1346
|
[4] |
G. K. Mazandu and N. J. Mulder, “A topology-based metric for measuring term similarity in the gene ontology,” Advances in Bioinformatics, vol.2012, article no.975783, 2012.
|
[5] |
L. Cheng, Y. Jiang, H. Ju, et al., “InfAcrOnt: Calculating cross-ontology term similarities using information flow by a random walk,” BMC Genomics, vol.19, no.1, pp.125–134, 2018. doi: 10.1186/s12864-018-4500-9
|
[6] |
R. Rada, H. Mili, E. Bicknell, and M. Blettner, “Development and application of a metric on semantic nets,” IEEE Transactions on Systems, Man, and Cybernetics, vol.19, no.1, pp.17–30, 1989. doi: 10.1109/21.24528
|
[7] |
Z. Wu and M. Palmer, “Verb semantics and lexical selection,” in Proceedings of the 32nd Annual Meeting on Association for Computational Linguistics, Las Cruces, New Mexico, USA, pp.133–138, 1994.
|
[8] |
C. Pesquita, D. Faria, A. O. Falcao, et al., “Semantic similarity in biomedical ontologies,” PLOS Computational Biology, vol.5, no.7, article no.e1000443, 2009. doi: 10.1371/journal.pcbi.1000443
|
[9] |
P. Resnik, “Semantic similarity in a taxonomy: An information-based measure and its application to problems of ambiguity in natural language,” Journal of Artificial Intelligence Research, vol.11, pp.95–130, 1999. doi: 10.1613/jair.514
|
[10] |
D. Lin, “An information-theoretic definition of similarity,” in Proceedings of the Fifteenth International Conference on Machine Learning, San Francisco, USA, pp.296–304, 1998.
|
[11] |
J. Z. Wang, Z. Du, R. Payattakool, et al., “A new method to measure the semantic similarity of GO terms,” Bioinformatics, vol.23, no.10, pp.1274–1281, 2007. doi: 10.1093/bioinformatics/btm087
|
[12] |
F. Z. Smaili, X. Gao, and R. Hoehndorf, “OPA2Vec: Combining formal and informal content of biomedical ontologies to improve similarity-based prediction,” Bioinformatics, vol.35, no.12, pp.2133–2140, 2019. doi: 10.1093/bioinformatics/bty933
|
[13] |
F. Z. Smaili, X. Gao, and R. Hoehndorf, “Onto2Vec: Joint vector-based representation of biological entities and their ontology-based annotations,” Bioinformatics, vol.34, no.13, pp.i52–i60, 2018. doi: 10.1093/bioinformatics/bty259
|
[14] |
D. Duong, W. U. Ahmad, E. Eskin, et al., “Word and sentence embedding tools to measure semantic similarity of gene ontology terms by their definitions,” Journal of Computational Biology, vol.26, no.1, pp.38–52, 2019. doi: 10.1089/cmb.2018.0093
|
[15] |
J. Lafferty, A. McCallum, and F. C. Pereira, “Conditional random fields: Probabilistic models for segmenting and labeling sequence data,” in Proceedings of the Eighteenth International Conference on Machine Learning, San Francisco, USA, pp.282–289, 2001.
|
[16] |
J. Zhang, Y. Song, C. Zhang, and S. Liu, “Evolutionary hierarchical dirichlet processes for multiple correlated time-varying corpora,” in Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, pp.1–10, 2010.
|
[17] |
T. Mikolov, I. Sutskever, K. Chen, et al., “Distributed representations of words and phrases and their compositionality,” in Proceedings of the 26th International Conference on Neural Information Processing Systems, Red Hook, USA, vol.2, pp.3111–3119, 2013.
|
[18] |
J. Pennington, R. Socher, and C. D. Manning, “GloVe: Global vectors for word representation,” in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, Doha, Qatar, pp.1532–1543, 2014.
|
[19] |
A. Joulin, E. Grave, P. Bojanowski, et al., “FastText.zip: Compressing text classification models,” arXiv preprint, arXiv: 1612.03651, 2016.
|
[20] |
F. Shen, S. Peng, Y. Fan, et al., “HPO2Vec+: Leveraging heterogeneous knowledge resources to enrich node embeddings for the Human Phenotype Ontology,” Journal of Biomedical Informatics, vol.96, article no.103246, 2019. doi: 10.1016/j.jbi.2019.103246
|
[21] |
M. E. Peters, M. Neumann, M. Iyyer, et al., “Deep contextualized word representations,” in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, New Orleans, USA, vol.1, pp.2227–2237, 2018.
|
[22] |
J. Lee, W. Yoon, S. Kim, et al., “BioBERT: A pre-trained biomedical language representation model for biomedical text mining,” Bioinformatics, vol.36, no.4, pp.1234–1240, 2020.
|
[23] |
I. Beltagy, K. Lo, and A. Cohan, “SciBERT: A pretrained language model for scientific text,” arXiv preprint, arXiv: 1903.10676, 2019.
|
[24] |
Y. Peng, S. Yan, and Z. Lu, “Transfer learning in biomedical natural language processing: An evaluation of BERT and ELMo on ten benchmarking datasets,” in Proceedings of the Workshop on Biomedical Natural Language Processing (BioNLP), Florence, Italy, pp.58–65, 2019.
|
[25] |
A. Conneau, D. Kiela, H. Schwenk, et al., “Supervised learning of universal sentence representations from natural language inference data,” in Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, ACL, Copenhagen, Denmark, pp.670–680, 2017.
|
[26] |
R. Kiros, Y. Zhu, R. R. Salakhutdinov, et al., “Skip-thought vectors,” in Proceedings of the 28th International Conference on Neural Information Processing Systems, Montreal, Canada, vol.2, pp.3294–3302, 2015.
|
[27] |
D. Cer, Y. Yang, S.-Y. Kong, et al., “Universal sentence encoder for English,” in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, ACL, Brussels, Belgium, pp.169–174, 2018.
|
[28] |
H. Al-Mubaid and H. A. Nguyen, “A cluster-based approach for semantic similarity in the biomedical domain,” in Proceedings of 2006 International Conference of the IEEE Engineering in Medicine and Biology Society, New York, NY, USA, pp.2713–2717, 2006.
|
[29] |
G. Pirró, “A semantic similarity metric combining features and intrinsic information content,” Data & Knowledge Engineering, vol.68, no.11, pp.1289–1308, 2009.
|
[30] |
D. Bollegala, Y. Matsuo, and M. Ishizuka, “A relational model of semantic similarity between words using automatically extracted lexical pattern clusters from the web,” in Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Stroudsburg, PA, USA, pp.803–812, 2009.
|
[31] |
E. G. Petrakis, G. Varelas, A. Hliaoutakis, et al., “X-similarity: Computing semantic similarity between concepts from different ontologies,” Journal of Digital Information Management, vol.4, no.4, pp.233–237, 2006.
|
[32] |
L. Ding, T. Finin, A. Joshi, et al., “Swoogle: A search and metadata engine for the semantic web,” in Proceedings of the Thirteenth ACM International Conference on Information and Knowledge Management, Washington, DC, USA, pp.652–659, 2004.
|
[33] |
D. Sánchez, D. Isern, and M. Millan, “Content annotation for the semantic web: An automatic web-based approach,” Knowledge and Information Systems, vol.27, no.3, pp.393–418, 2011. doi: 10.1007/s10115-010-0302-3
|
[34] |
D. Duong, A. Uppunda, L. Gai, et al., “Evaluating representations for gene ontology terms,” bioRxiv preprint, doi: 10.1101/765644, 2020.
|
[35] |
The Gene Ontology Consortium, “Expansion of the gene ontology knowledgebase and resources,” Nucleic Acids Research, vol.45, no.D1, pp.D331–D338, 2017. doi: 10.1093/nar/gkw1108
|
[36] |
G. K. Mazandu, E. R. Chimusa, and N. J. Mulder, “Gene ontology semantic similarity tools: Survey on features and challenges for biological knowledge discovery,” Briefings in Bioinformatics, vol.18, no.5, pp.886–901, 2017.
|
[37] |
A. Pesaranghader, S. Matwin, M. Sokolova, and R. G. Beiko, “simDEF: Definition-based semantic similarity measure of gene ontology terms for functional similarity analysis of genes,” Bioinformatics, vol.32, no.9, pp.1380–1387, 2016. doi: 10.1093/bioinformatics/btv755
|
[38] |
L. van der Maaten and G. Hinton, “Visualizing data using t-SNE,” Journal of Machine Learning Research, vol.9, no.86, pp.2579–2605, 2008.
|
[39] |
C. Szegedy, V. Vanhoucke, S. Ioffe, et al., “Rethinking the inception architecture for computer vision,” in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, pp.2818–2826, 2016.
|
[40] |
J. Peng, J. Chen, and Y. Wang, “Identifying cross-category relations in gene ontology and constructing genome-specific term association networks,” BMC Bioinformatics, vol.14, no.Suppl 2, article no.S15, 2013. doi: 10.1186/1471-2105-14-S2-S15
|
[41] |
A. Bellandi, B. Furletti, V. Grossi, et al., “Ontology-driven association rule extraction: A case study,” in Proceedings of the International Workshop on Contexts and Ontologies: Representation and Reasoning (C&O:RR) Collocated with the 6th International and Interdisciplinary Conference on Modelling and Using Context, Roskilde, Denmark, available at: http://ceur-ws.org/Vol-298/paper1.pdf, 2007.
|
[42] |
O. Bodenreider, M. Aubry, and A. Burgun, “Non-lexical approaches to identifying associative relations in the gene ontology,” in Proceedings of the Pacific Symposium on Biocomputing 2005, World Scientific Publishing Co. Pte. Ltd, pp.91–102, 2005.
|
[43] |
J. Peng, H. Wang, J. Lu, et al., “Identifying term relations cross different gene ontology categories,” BMC Bioinformatics, vol.18, no.16, article no.573, 2017.
|
[44] |
G. Salton, A. Wong, and C.-S. Yang, “A vector space model for automatic indexing,” Communications of the ACM, vol.18, no.11, pp.613–620, 1975. doi: 10.1145/361219.361220
|
[45] |
A. Kumar, B. Smith, and C. Borgelt, “Dependence relationships between Gene Ontology terms based on TIGR gene product annotations,” in Proceedings of CompuTerm 2004: 3rd International Workshop on Computational Terminology, COLING, Geneva, Switzerland, pp.31–38, 2004.
|
[46] |
K.-H. Chen, T.-F. Wang, and Y.-J. Hu, “Protein-protein interaction prediction using a hybrid feature representation and a stacked generalization scheme,” BMC Bioinformatics, vol.20, no.1, article no.308, 2019. doi: 10.1186/s12859-019-2907-1
|