Citation: YAN Wenjing, ZHANG Baoyu, ZUO Min, et al., “AttentionSplice: An Interpretable Multi-Head Self-Attention Based Hybrid Deep Learning Model in Splice Site Prediction,” Chinese Journal of Electronics, vol. 31, no. 5, pp. 870-887, 2022, doi: 10.1049/cje.2021.00.221
[1] |
X. Jian, E. Boerwinkle, and X. Liu, “In silico tools for splicing defect prediction: A survey from the viewpoint of end users,” Genetics in Medicine, vol.16, no.7, pp.497–503, 2014.
|
[2] |
N. Sheth, X. Roca, M. L. Hastings, T. Roeder, A. R. Krainer, and R. Sachidanandam, “Comprehensive splice-site analysis using comparative genomics,” Nucleic Acids Research, vol.34, no.14, pp.3955–3967, 2006.
|
[3] |
M. Burset, I. A. Seledtsov, and V. V. Solovyev, “Analysis of canonical and non-canonical splice sites in mammalian genomes,” Nucleic Acids Research, vol.28, no.21, pp.4364–4375, 2000.
|
[4] |
T. Karuppusamy and M. Sivasubramanian, “Biological gene sequence structure analysis using hidden Markov model,” Turkish Journal of Computer and Mathematics Education (TURCOMAT), vol.12, no.4, pp.1652–1666, 2021.
|
[5] |
P. K. Meher, T. K. Sahu, and A. R. Rao, “Prediction of donor splice sites using random forest with a new sequence encoding approach,” BioData Mining, vol.9, no.1, pp.1–25, 2016.
|
[6] |
E. Pashaei, M. Ozen, and N. Aydin, “Splice site identification in human genome using random forest,” Health and Technology, vol.7, no.1, pp.141–152, 2017.
|
[7] |
T. Lee and S. Yoon, “Boosted categorical restricted Boltzmann machine for computational prediction of splice junctions,” in Proceedings of the 32nd International Conference on Machine Learning, Lille, France, vol.37, pp.2483–2492, 2015.
|
[8] |
E. Pashaei and N. Aydin, “Markovian encoding models in human splice site recognition using SVM,” Computational Biology and Chemistry, vol.73, pp.159–170, 2018.
|
[9] |
A. T. M. Golam Bari, M. Rokeya Reaz, and B. S. Jeong, “Effective DNA encoding for splice site prediction using SVM,” MATCH Communications in Mathematical and in Computer Chemistry, vol.71, no.1, pp.241–258, 2013.
|
[10] |
Y. Zhang, X. Liu, J. N. Macleod, and J. Liu, “DeepSplice: Deep classification of novel splice junctions revealed by RNA-seq,” in Proceedings of 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Shenzhen, China, pp.330–333, 2016.
|
[11] |
S. Moosa, P. A. Amira, and D. S. Boughorbel, “DASSI: Differential architecture search for splice identification from DNA sequences,” BioData Mining, vol.14, no.1, article no.15, 2021. doi: 10.1186/s13040-021-00237-y
|
[12] |
D. Quang and X. Xie, “DanQ: A hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences,” Nucleic Acids Research, vol.44, no.11, pp.e107–e107, 2016.
|
[13] |
J. Zuallaert, F. Godin, M. Kim, A. Soete, et al., “SpliceRover: Interpretable convolutional neural networks for improved splice site prediction,” Bioinformatics, vol.34, no.24, pp.4180–4188, 2018.
|
[14] |
C. M. Dasari and R. Bhukya, “InterSSPP: Investigating patterns through interpretable deep neural networks for accurate splice signal prediction,” Chemometrics and Intelligent Laboratory Systems, vol.206, article no.104144, 2020.
|
[15] |
J. Qin, W. Pan, X. Xiang, Y. Tan, and G. Hou, “A biological image classification method based on improved CNN,” Ecological Informatics, vol.58, article no.101093, 2020.
|
[16] |
H. Sharma and A. S. Jalal, “Incorporating external knowledge for image captioning using CNN and LSTM,” Modern Physics Letters B, vol.34, no.28, article no.2050315, 2020.
|
[17] |
Z. Q. Geng, G. F. Chen, Y. M. Han, et al., “Semantic relation extraction using sequential and tree-structured LSTM with attention,” Information Sciences, vol.509, pp.183–192, 2020.
|
[18] |
Q. Li, L. Li, W. Wang, et al., “A comprehensive exploration of semantic relation extraction via pre-trained CNNs,” Knowledge-Based Systems, vol.194, article no.105488, 2020.
|
[19] |
R. A. Al-Zaidy, C. Caragea, and C. Lee Giles, “Bi-LSTM-CRF sequence labeling for keyphrase extraction from scholarly documents,” in Proceedings of the World Wide Web Conference, San Francisco, CA, USA, pp.2551–2557, 2019.
|
[20] |
Z. Yang, D. Yang, C. Dyer, et al., “Hierarchical attention networks for document classification,” in Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, CA, USA, pp.1480–1489, 2016.
|
[21] |
S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol.9, no.8, pp.1735–1780, 1997. doi: 10.1162/neco.1997.9.8.1735
|
[22] |
X. Li and X. Wu, “Constructing long short-term memory based deep recurrent neural networks for large vocabulary speech recognition,” in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, South Brisbane, Queensland, Australia, pp.4520–4524, 2015.
|
[23] |
A. Graves, N. Jaitly, and A. R. Mohamed, “Hybrid speech recognition with deep bidirectional LSTM,” in Proceedings of 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, Olomouc, Czech Republic, pp.273–278, 2013.
|
[24] |
A. Vaswani, N. Shazeer, N. Parmar, et al., “Attention is all you need,” in Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS’17), Long Beach, CA, USA, pp.6000–6010, 2017.
|
[25] |
X. Glorot, A. Bordes, and Y. Bengio, “Deep sparse rectifier neural networks,” in Proceedings of the 14th International Conference on Artificial Intelligence and Statistics, PMLR, Fort Lauderdale, FL, USA, vol.15, pp.315–323, 2011.
|
[26] |
P. Pollastro and S. Rampone, “HS3D, a dataset of Homo sapiens splice regions, and its extraction procedure from a major public database,” International Journal of Modern Physics C, vol.13, no.8, pp.1105–1117, 2002.
|
[27] |
W. Chen, P. M. Feng, H. Lin, and K. C. Chou, “ISS-PseDNC: Identifying splicing sites using pseudo dinucleotide composition,” BioMed Research International, vol.2014, no.623149, pp.1–12, 2014.
|
[28] |
T. Fawcett, “An introduction to ROC analysis,” Pattern Recognition Letters, vol.27, no.8, pp.861–874, 2006.
|
[29] |
S. Sonnenburg, G. Schweikert, P. Philips, et al., “Accurate splice site prediction using support vector machine,” BMC Bioinformatics, vol.8, no.Suppl 10, article no.S7, 2007. doi: 10.1186/1471-2105-8-S10-S7
|
[30] |
H. Tayara, M. Tahir, and K. T. Chong, “iSS-CNN: Identifying splicing sites using convolution neural network,” Chemometrics and Intelligent Laboratory Systems, vol.188, pp.63–69, 2019.
|
[31] |
X. Du, Y. Yao, Y. Diao, et al., “DeepSS: Exploring splice site motif through convolutional neural network directly from DNA sequence,” IEEE Access, vol.6, pp.32958–32987, 2018.
|
[32] |
I. V. Kulakovskiy, I. E. Vorontsov, I. S. Yevshin, et al., “HOCOMOCO: Towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP-Seq analysis,” Nucleic Acids Research, vol.46, no.D1, pp.D252–D259, 2018.
|
[33] |
C. A. Grove, F. De Masi, M. I. Barrasa, et al., “A multiparameter network reveals extensive divergence between C. elegans bHLH transcription factors,” Cell, vol.138, no.2, pp.314–327, 2009.
|
[34] |
S. Gupta, J. A. Stamatoyannopoulos, T. L. Bailey, and W. S. Noble, “Quantifying similarity between motifs,” Genome Biology, vol.8, no.2, article no.R24, 2007. doi: 10.1186/gb-2007-8-2-r24
|
[35] |
T. L. Bailey, M. Boden, F. A. Buske, et al., “MEME Suite: Tools for motif discovery and searching,” Nucleic Acids Research, vol.37, no.SUPPL.2, pp.W202–W208, 2009.
|