ZHANG Jian, YUAN Qingsheng, BAO Xiuguo, ZHOU Ruohua, YAN Yonghong. PLF Optimization for Target Language Detection[J]. Chinese Journal of Electronics, 2017, 26(1): 118-121. doi: 10.1049/cje.2016.11.014
PLF Optimization for Target Language Detection

Funds:  This work is supported by the National Natural Science Foundation of China (No.11161140319, No.91120001, No.61271426), the Strategic Priority Research Program of the Chinese Academy of Sciences (No.XDA06030100, No.XDA06030500), the National High Technology Research and Development Program of China (No.2012AA012503), and the Chinese Academy of Sciences Priority Deployment Project (No.KGZD-EW-103-2).
  ZHOU Ruohua (corresponding author) received the B.S. degree from the Electronics Engineering Department, Beijing Institute of Technology, Beijing, China, in 1994, the M.S. degree in microelectronics and semiconductor devices from the Microelectronics R&D Center, CAS, Beijing, in 1997, and the Ph.D. degree from the Signal Processing Laboratory (LTS), Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland. He is currently a professor at the Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, CAS. (Email: zhouruohua@hccl.ioa.ac.cn)
  • Received Date: 2015-04-09
  • Rev Recd Date: 2015-06-24
  • Publish Date: 2017-01-10
  • Traditional feature studies in Spoken language recognition (SLR) aim to extract the linguistic information that discriminates every language from every other. Security applications, however, are usually interested in one particular language, which requires features that best reflect the differences between the target language and all other languages. To address this problem, the frame-level Phone log-posteriors feature (PLF), recently introduced as a novel and effective feature for SLR, is optimized for better performance on the Target language detection (TLD) task. F-Ratio analysis is used to measure the contribution of each dimension of the feature vector to TLD. In this work, frame-level phone posterior probabilities are estimated by a phone recognizer and converted to the log domain. The feature is then optimized by weighting each dimension according to its F-Ratio value. Finally, Principal component analysis (PCA) is used to decorrelate the feature and reduce the vector size. Experiments on the NIST LRE 2007 dataset demonstrate the effectiveness of the optimized feature, which yields significant relative improvements in terms of Equal error rate (EER) over the Gaussian mixture models-Support vector machines (GMM-SVM) system based on the original feature.
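The processing chain described in the abstract (log-posteriors, F-Ratio weighting, PCA) can be illustrated with a minimal NumPy sketch. The abstract does not give the exact F-Ratio formula or weighting rule, so the choices here are assumptions: a two-class F-Ratio (variance of the class means divided by the mean within-class variance, target versus non-target frames), weights scaled so the largest equals 1, and PCA computed by SVD. All function names and the posterior floor are hypothetical and are not taken from the paper.

```python
import numpy as np

def log_posteriors(post, floor=1e-10):
    """Convert frame-level phone posteriors (frames x phones) to log-posteriors.
    The floor avoids log(0); its value is an assumption."""
    return np.log(np.maximum(post, floor))

def f_ratio(feats, is_target):
    """Per-dimension F-Ratio for two classes (assumed definition):
    variance of the class means over the mean within-class variance."""
    groups = [feats[is_target], feats[~is_target]]
    means = np.stack([g.mean(axis=0) for g in groups])
    between = means.var(axis=0)
    within = np.mean([g.var(axis=0) for g in groups], axis=0)
    return between / (within + 1e-12)

def fit_pca(feats, n_components):
    """Return the feature mean and the top principal directions (via SVD)."""
    mean = feats.mean(axis=0)
    _, _, vt = np.linalg.svd(feats - mean, full_matrices=False)
    return mean, vt[:n_components].T

def optimize_plf(post, is_target, n_components):
    """Log-transform, weight each dimension by its F-Ratio, then apply PCA."""
    plf = log_posteriors(post)
    weights = f_ratio(plf, is_target)
    weights = weights / weights.max()   # assumed normalization of the weights
    weighted = plf * weights            # emphasize discriminative dimensions
    mean, basis = fit_pca(weighted, n_components)
    return (weighted - mean) @ basis, weights
```

Calling `optimize_plf(post, is_target, n_components)` with a `(frames, phones)` posterior matrix and a boolean target-language mask returns the reduced, decorrelated features together with the per-dimension weights. In a real system the weights and PCA basis would be estimated on training data and then reused unchanged on test utterances.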
