Agglutinative Language Speech Recognition Using Automatic Allophone Deriving
-
Abstract
Agglutinative languages rely heavily on agglutination, which causes significant pronunciation variation across contexts. Phoneme sets derived from written forms are therefore problematic as basic units for acoustic modeling in large-vocabulary continuous speech recognition (LVCSR), because they cannot capture this variation. This paper presents a novel approach, Automatic Allophone Deriving (AAD), which creates allophone candidates without any prior linguistic knowledge. An enhanced approach, AAD Long-Time (AAD-LT), is also proposed, in which long-time features are used within AAD. Experiments are conducted on three languages: two agglutinative and one analytic. The results suggest that AAD-LT is very effective for the agglutinative languages, yielding a relative CER reduction of more than 10%.
-
-