XU Ji, PAN Jielin, YAN Yonghong. Agglutinative Language Speech Recognition Using Automatic Allophone Deriving[J]. Chinese Journal of Electronics, 2016, 25(2): 328-333. doi: 10.1049/cje.2016.03.020
Agglutinative Language Speech Recognition Using Automatic Allophone Deriving

Funds: This work is supported by the National Natural Science Foundation of China (No.10925419, No.90920302, No.61072124, No.11074275, No.11161140319, No.91120001, No.61271426), the Strategic Priority Research Program of the Chinese Academy of Sciences (No.XDA06030100, No.XDA06030500), the National High Technology Research and Development Program of China (863 Program) (No.2012AA012503), and the CAS Priority Deployment Project (No.KGZD-EW-103-2).
  • Received Date: 2014-02-27
  • Rev Recd Date: 2014-05-13
  • Publish Date: 2016-03-10
  • Agglutinative languages make extensive use of agglutination, which causes significant pronunciation variation across contexts. Phoneme sets derived from the written form are therefore problematic as basic units for acoustic modeling in Large-vocabulary continuous speech recognition (LVCSR), because they cannot capture these variations. This paper presents a novel approach called Automatic allophone deriving (AAD), which creates allophone candidates without any prior linguistic knowledge. An enhanced approach, AAD Long-time (AAD-LT), is also proposed, in which long-time features are used within the AAD approach. Experiments are conducted on three languages: two agglutinative ones and an analytic one. The results suggest that AAD-LT is very effective for the agglutinative languages, for which a relative CER reduction of more than 10% is obtained.
