Citation: | WANG Xuyang, ZHANG Pengyuan, NA Xingyu, et al., “Handling OOV Words in Mandarin Spoken Term Detection with an Hierarchical n-Gram Language Model,” Chinese Journal of Electronics, vol. 26, no. 6, pp. 1239-1244, 2017, doi: 10.1049/cje.2017.07.004 |
L. Mangu, E. Brill and A. Stolcke, "Finding consensus in speech recognition: Word error minimization and other applications of confusion networks", Computer Speech and Language, Vol.14, No.4, pp.373-400, 2000.
|
M. Mohri, F. Pereira and M. Riley, "Speech recognition with weighted finite-state transducers", Springer Handbook of Speech Processing, Springer, Berlin, Germany, pp.559-584, 2008.
|
D. Can and M. Saraclar, "Lattice indexing for spoken term detection", IEEE Transactions on Audio, Speech, and Language Processing, Vol.19, No.8, pp.2338-2347, 2011.
|
I. Szoke, L. Burget, J. Cernocky, et al., "Sub-word modeling of out of vocabulary words in spoken term detection", Spoken Language Technology Workshop (SLT 2008), Goa, India, pp.273-276, 2008.
|
M. Akbacak, D. Vergyri and A. Stolcke, "Open-vocabulary spoken term detection using graphone-based hybrid recognition systems", Acoustics, Speech and Signal Processing (ICASSP 2008), Las Vegas, Nevada, USA, pp.5240-5243, 2008.
|
I. Bulyko, J. Herrero, C. Mihelich, et al., "Subword speech recognition for detection of unseen words", Thirteenth Annual Conference of the International Speech Communication Association, Portland, Oregon, USA, pp.2446-2449, 2012.
|
T. Mertens and D. Schneider, "Efficient subword lattice retrieval for German spoken term detection", Acoustics, Speech and Signal Processing, Taipei, Taiwan, China, pp.4885-4888, 2009.
|
J. Gao, J. Shao, Q. Zhao, et al., "Efficient system combination for Chinese spoken term detection", Chinese Journal of Electronics, Vol.19, No.3, pp.457-462, 2010.
|
W. Naptali, M. Tsuchiya and S. Nakagawa, "Class-based n-gram language model for new words using out-of-vocabulary to in-vocabulary similarity", IEICE Transactions on Information and Systems, Vol.95, No.9, pp.2308-2317, 2012.
|
B. Réveil, K. Demuynck and J. Martens, "An improved two-stage mixed language model approach for handling out-of-vocabulary words in large vocabulary continuous speech recognition", Computer Speech and Language, Vol.28, No.1, pp.141-162, 2014.
|
X. Liu, J.L. Hieronymus, M.J.F. Gales, et al., "Syllable language models for Mandarin speech recognition: Exploiting character language models", The Journal of the Acoustical Society of America, Vol.133, No.1, pp.519-528, 2013.
|
I. Chen, C. Ni, B.P. Lim, et al., "A keyword-aware grammar framework for LVCSR-based spoken keyword search", International Conference on Acoustics, Speech and Signal Processing (ICASSP 2015), Brisbane, Australia, pp.5196-5200, 2015.
|
P. Zhang, J. Shao, J. Han, et al., "Keyword spotting based on phoneme confusion matrix", International Symposium on Chinese Spoken Language Processing (ISCSLP 2006), Singapore, Vol.2, pp.408-419, 2006.
|
P.F. Brown, P.V. deSouza, R.L. Mercer, et al., "Class-based n-gram models of natural language", Computational Linguistics, Vol.18, No.4, pp.467-479, 1992.
|
J. Shao, T. Li, Q. Zhang, et al., "A one-pass real-time decoder using memory-efficient state network", IEICE Transactions on Information and Systems, Vol.91, No.3, pp.529-537, 2008.
|
H. Ney and S. Ortmanns, "Progress in dynamic programming search for LVCSR", Proceedings of the IEEE, Vol.88, No.8, pp.1224-1240, 2000.
|
B.G. Secrest and G.R. Doddington, "An integrated pitch tracking algorithm for speech systems", International Conference on Acoustics, Speech and Signal Processing (ICASSP 1983), Boston, Massachusetts, USA, Vol.8, pp.1352-1355, 1983.
|
S. Young, J. Odell and P. Woodland, "Tree-based state tying for high accuracy acoustic modelling", Proceedings of the Workshop on Human Language Technology, Plainsboro, New Jersey, USA, pp.307-312, 1994.
|
A. Stolcke, "SRILM - an extensible language modeling toolkit", INTERSPEECH 2002, Denver, Colorado, USA, pp.257-286, 2002.
|
H. Zhang, H. Yu, D. Xiong, et al., "HHMM-based Chinese lexical analyzer ICTCLAS", Proceedings of the Second SIGHAN Workshop on Chinese Language Processing, Sapporo, Japan, Vol.17, pp.184-187, 2003.
|
J.G. Fiscus, J. Ajot, J.S. Garofolo, et al., "Results of the 2006 spoken term detection evaluation", Proceedings of SIGIR 2007, Amsterdam, Netherlands, Vol.7, pp.51-57, 2007.
|