Citation: ZHAO Lingling, WANG Junjie, WANG Chunyu, GUO Maozu. A Cross-Domain Ontology Semantic Representation Based on NCBI-blueBERT Embedding[J]. Chinese Journal of Electronics. doi: 10.1049/cje.2020.00.326