Citation: | CHENG Gaofeng, LI Xin, YAN Yonghong, “Using Highway Connections to Enable Deep Small-footprint LSTM-RNNs for Speech Recognition,” Chinese Journal of Electronics, vol. 28, no. 1, pp. 107-112, 2019, doi: 10.1049/cje.2018.11.008 |
G. Hinton, L. Deng, D. Yu, et al., "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups", Signal Processing Magazine, IEEE, Vol.29, No.6, pp.82-97, 2012.
|
H. A. Bourlard and N. Morgan, "Connectionist speech recognition: A hybrid approach", Springer Science and Business Media, 2012.
|
G. E. Dahl, D. Yu, L. Deng, et al., "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition", IEEE Transactions on Audio, Speech and Language Processing, Vol.20, No.1, pp.30-42, 2012.
|
F. Seide, G. Li, and D. Yu, "Conversational speech transcription using context-dependent deep neural networks", Proc. Annual Conference of International Speech Communication Association (Interspeech), pp.437-440, 2011.
|
P. Swietojanski, A. Ghoshal, and S. Renals, "Convolutional neural networks for distant speech recognition," Signal Processing Letters, IEEE, Vol.21, No.9, pp.1120-1124, 2014.
|
J. Xu, J. Pan, and Y. Yan, "Agglutinative language speech recognition using automatic allophone deriving", Chinese Journal of Electronics, Vol.25, No.2, pp.328-333, 2016.
|
W. Jiang, P. Liu, and F. Wen, "Speech magnitude spectrum reconstruction from MFCCs using deep neural network", Chinese Journal of Electronics, Vol.27, No.2, pp.393-398, 2018.
|
H. Zhang, Q. Fu, and Y. Yan, "Speech Enhancement Using Compact Microphone Array and Applications in Distant Speech Acquisition," Chinese Journal of Electronics, Vol.18, No.3, pp.481-486, 2009.
|
Y. Xie, J. Huang, and Y. He, "One Dictionary vs. Two Dictionaries in Sparse Coding Based Denoising", Chinese Journal of Electronics, Vol.26, No.2, pp.367-371, 2017.
|
A. Graves, A. Mohamed, and G. Hinton, "Speech recognition with deep recurrent neural networks," Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2013.
|
H. Zen and H. Sak, "Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis," Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2015.
|
H. Sak, A. Senior, and F. Beaufays, "Long short-term memory recurrent neural network architectures for large scale acoustic modeling," Annual Conference of the International Speech Communication Association (Interspeech), 2014.
|
Y. Zhang, G. Chen, D. Yu, et al., "Highway long short-term memory RNNs for distant speech recognition," Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016.
|
Y. Bengio, P. Simard, and P. Frasconi, "Learning long-term dependencies with gradient descent is difficult", IEEE Transactions on Neural Networks, Vol.5, No.2, pp.157-166, 1994.
|
L. Lu and S. Renals, "Small-footprint deep neural networks with highway connections for speech recognition", IEEE Transactions on Audio, Speech and Language Processing, Vol.25, No.7, pp.1502-1511, 2017.
|
S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, Vol.9, No.8, pp.1735-1780, 1997.
|
H. Sak, A. Senior, and F. Beaufays, "Long short-term memory based recurrent neural network architectures for large vocabulary speech recognition," Feb. 2014. Available: http://arxiv.org/abs/1402.1128.
|
C. Y. Lee, S. Xie, P. Gallagher, et al., "Deeply-supervised nets," Artificial Intelligence and Statistics, 2015.
|
Y. Bengio, P. Lamblin, D. Popovici, et al., "Greedy layer-wise training of deep networks," Proc. NIPS, 2007, Vol.19, pp.153.
|
G. E. Hinton and R. R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, Vol.313, No.5786, pp.504-507, 2006.
|
R. K. Srivastava, K. Greff, and J. Schmidhuber, "Training very deep networks," Proc. NIPS, 2015.
|
D. Povey, V. Peddinti, D. Galvez, et al., "Purely sequence-trained neural networks for ASR based on lattice-free MMI", Annual Conference of International Speech Communication Association (Interspeech), 2016.
|
K. Vesely, A. Ghoshal, L. Burget, et al., "Sequence-discriminative training of deep neural networks," Annual Conference of International Speech Communication Association (Interspeech), pp.2345-2349, 2013.
|
G. Saon, H. Soltau, D. Nahamoo, et al., "Speaker adaptation of neural network acoustic models using i-vectors," Proc. IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp.55-59, 2013.
|