AttentionSplice: An Interpretable Multi-Head Self-Attention Based Hybrid Deep Learning Model in Splice Site Prediction

YAN Wenjing; ZHANG Baoyu; ZUO Min; ZHANG Qingchuan; WANG Hong; MAO Da

doi:10.1049/cje.2021.00.221

Volume 31 Issue 5

Sep. 2022



YAN Wenjing, ZHANG Baoyu, ZUO Min, et al., “AttentionSplice: An Interpretable Multi-Head Self-Attention Based Hybrid Deep Learning Model in Splice Site Prediction,” Chinese Journal of Electronics, vol. 31, no. 5, pp. 870-887, 2022, doi: 10.1049/cje.2021.00.221


1. National Engineering Laboratory for Agri-Product Quality Traceability, Beijing Technology and Business University, Beijing 100048, China
2. Division of Chemical Metrology and Analytical Science, National Institute of Metrology, Beijing 100029, China

Funds: This work was supported by Beijing Natural Science Foundation (4202014), National Natural Science Foundation of China (61873027), and Humanity and Social Science Youth Foundation of Ministry of Education of China (20YJCZH229)


Author Bio:
YAN Wenjing was born in 1982. She received the Ph.D. degree in electrical engineering from Auburn University, Alabama, USA. She is now a Lecturer at Beijing Technology and Business University. Her research interests include intelligent computing and biological information processing. (Email: yanwenjing0423@163.com)

ZUO Min (corresponding author) was born in 1973. He received the Ph.D. degree in computer application from the University of Science and Technology, Beijing, China. He is now a Professor and a Doctoral Supervisor at Beijing Technology and Business University. His research interests include intelligent management and artificial intelligence. (Email: zuomin1234@163.com)

ZHANG Qingchuan was born in 1982. He received the Ph.D. degree in computer science and technology from the University of Science and Technology, Beijing, China. He is now an Associate Professor at Beijing Technology and Business University. His research interests include intelligent software, distributed computing, and artificial intelligence. (Email: zqc1982@126.com)
Received Date: 2021-06-26
Accepted Date: 2022-03-17

Available Online: 2022-03-28

Publish Date: 2022-09-05

Abstract


Pre-mRNA splicing is an essential step in gene expression. By cutting out introns and joining exons, a DNA sequence can be decoded into different proteins with different biological functions. The cutting boundaries are defined by the donor and acceptor splice sites. Characterizing the nucleotide patterns that mark splice sites is complex and challenges conventional methods. Recently, deep learning frameworks have been introduced for splice site prediction and have shown high performance. They extract high-dimensional features from the DNA sequence automatically, rather than inferring splice sites from prior knowledge of the relationships, dependencies, and characteristics of nucleotides in the sequence. This paper proposes the AttentionSplice model, a hybrid architecture that combines multi-head self-attention, a convolutional neural network, and a bidirectional long short-term memory network. The performance of AttentionSplice is evaluated on the Homo sapiens (human) and Caenorhabditis elegans (worm) datasets. Our model outperforms state-of-the-art models in the classification of splice sites. To make AttentionSplice interpretable, we use the attention learned by the model to extract important positions and key motifs that could be essential for splice site detection. Our results could offer novel insights into the underlying biological roles and molecular mechanisms of gene expression.
Keywords:
- Splice sites
- Multi-head self-attention
- Bioinformatics
- Deep learning
- Long short-term memory (LSTM)
- Convolutional neural network (CNN)
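
The abstract describes reading important sequence positions from learned multi-head self-attention. The following is a minimal NumPy sketch of that general idea, not the authors' implementation: the projection weights are random rather than trained, the CNN and BiLSTM stages and the usual output projection are omitted, and all function names (`one_hot`, `multi_head_self_attention`, `position_importance`) as well as the example sequence are hypothetical. Averaging attention over heads and query positions is one simple aggregation; the procedure used in the paper may differ.

```python
import numpy as np

NUCLEOTIDES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as an (L, 4) matrix; unknown bases (e.g. N) become all-zero rows."""
    x = np.zeros((len(seq), 4))
    for i, base in enumerate(seq.upper()):
        j = NUCLEOTIDES.find(base)
        if j >= 0:
            x[i, j] = 1.0
    return x

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, num_heads=2, d_head=8, seed=0):
    """Scaled dot-product self-attention with several heads.

    x has shape (L, d_in). Returns (output, attention) with shapes
    (L, num_heads * d_head) and (num_heads, L, L).
    The weights are random placeholders (an assumption for illustration);
    a trained model would learn them.
    """
    rng = np.random.default_rng(seed)
    d_in = x.shape[1]
    outputs, maps = [], []
    for _ in range(num_heads):
        wq, wk, wv = (rng.normal(size=(d_in, d_head)) for _ in range(3))
        q, k, v = x @ wq, x @ wk, x @ wv
        attn = softmax(q @ k.T / np.sqrt(d_head))  # (L, L), each row sums to 1
        outputs.append(attn @ v)
        maps.append(attn)
    return np.concatenate(outputs, axis=-1), np.stack(maps)

def position_importance(attn):
    """Average attention each position receives, over all heads and query positions."""
    return attn.mean(axis=(0, 1))

seq = "CAGGTAAGTN"  # hypothetical example sequence
x = one_hot(seq)
out, attn = multi_head_self_attention(x)
importance = position_importance(attn)
```

With trained weights, positions with high `importance` would be candidates for the "important positions" mentioned in the abstract; with the random weights used here the scores carry no biological meaning.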
