CHEN Chen, HAN Jiqing. Partial Least Squares Based Total Variability Space Modeling for I-Vector Speaker Verification[J]. Chinese Journal of Electronics, 2018, 27(6): 1229-1233. doi: 10.1049/cje.2018.06.001

Partial Least Squares Based Total Variability Space Modeling for I-Vector Speaker Verification

doi: 10.1049/cje.2018.06.001
Funds: This work is supported by the National Natural Science Foundation of China (No. 61471145, No. 91120303).
  • Corresponding author: HAN Jiqing received the Ph.D. degree in computer science and technology from Harbin Institute of Technology. He is a professor and Ph.D. supervisor in the School of Computer Science and Technology. He is a committee member of Automatic Discipline, National Natural Science Foundation of China; a committee member of the National Science and Technology Awards of China; and the vice chairman of the Society of Speech Processing, Association for Chinese Information Processing, among other roles. His main research fields are speech signal processing and audio information retrieval. He has won three second-prize and two third-prize science and technology awards at the ministry or province level, published more than 200 papers and 4 books, and obtained 9 national invention patents. (Email: jqhan@hit.edu.cn)
  • Received Date: 2016-07-13
  • Rev Recd Date: 2017-04-05
  • Publish Date: 2018-11-10
  • As an effective, low-dimensional representation for speech utterances of different lengths, the i-vector method has drawn considerable attention in speaker verification. Training a Total variability space (TVS) is one of the key steps in the i-vector method. However, the traditional training method only explores the relationship between different mean supervectors and ignores the prior category information of speakers, which results in a lack of discrimination. The proposed method instead estimates a discriminative TVS based on Partial least squares (PLS). Because speaker labels are used, both the intra-class correlation and the inter-class distinction are exploited, and the proposed method can achieve better performance than the traditional training method.
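The idea of using speaker labels to learn a discriminative subspace can be illustrated with a minimal sketch. This is not the authors' algorithm: it uses the simple PLS-SVD variant (an SVD of the cross-covariance between features and one-hot speaker labels), and the function names, the synthetic "supervectors", and all dimensions are assumptions made for illustration only.

```python
import numpy as np


def one_hot(labels):
    """Encode integer or string speaker labels as a one-hot matrix."""
    labels = np.asarray(labels).ravel()
    _, idx = np.unique(labels, return_inverse=True)
    Y = np.zeros((labels.shape[0], idx.max() + 1))
    Y[np.arange(labels.shape[0]), idx] = 1.0
    return Y


def pls_svd_projection(X, labels, n_components):
    """Estimate a label-aware projection with PLS-SVD (illustrative only).

    X: (n_utterances, dim) matrix, standing in for mean supervectors.
    labels: speaker label per utterance.
    Returns the weight matrix W (dim, n_components) and the feature mean.
    """
    x_mean = X.mean(axis=0)
    Xc = X - x_mean
    Y = one_hot(labels)
    Yc = Y - Y.mean(axis=0)
    # Cross-covariance between features and speaker labels (up to scaling).
    C = Xc.T @ Yc
    # Left singular vectors are the directions in feature space that
    # covary most strongly with the speaker labels.
    U, _, _ = np.linalg.svd(C, full_matrices=False)
    return U[:, :n_components], x_mean


def project(X, W, x_mean):
    """Map features into the low-dimensional, label-aware subspace."""
    return (X - x_mean) @ W


# Synthetic demo: 3 hypothetical speakers, 10 utterances each, 20-dim features.
rng = np.random.default_rng(0)
demo_labels = np.repeat(np.arange(3), 10)
speaker_means = rng.normal(scale=5.0, size=(3, 20))
X_demo = speaker_means[demo_labels] + rng.normal(size=(30, 20))

# Centered one-hot labels have rank n_speakers - 1, so 2 components here.
W_demo, mu_demo = pls_svd_projection(X_demo, demo_labels, n_components=2)
Z_demo = project(X_demo, W_demo, mu_demo)
```

Because the centered label matrix has rank at most the number of speakers minus one, this variant yields at most that many informative directions. A real system would operate on GMM mean supervectors and would likely need an iterative PLS formulation to obtain a higher-dimensional space.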
