Multi-Class Maximum A Posteriori Linear Regression for Speaker Verification
-
Abstract
Maximum likelihood linear regression (MLLR) transforms have proven useful for text-independent speaker recognition systems. These systems use the parameters of MLLR transforms as features for SVM modeling and classification. In this paper, we focus on calculating affine transforms based on a GMM universal background model (UBM). Rather than estimating the transforms with the maximum likelihood criterion, we propose to use maximum a posteriori linear regression (MAPLR) for feature extraction. This work is extended with a multi-class technique, which clusters the Gaussian mixtures into regression classes and estimates a different transform for each class. The transforms of all classes are concatenated into a supervector for SVM classification. In addition, a further accuracy boost is obtained by combining the supervectors derived from the female and male UBMs into a larger supervector. Experiments on a NIST 2008 SRE corpus show that the MAPLR system outperforms MLLR and that the multi-class approaches also bring significant gains.
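As an illustration of the supervector construction described above, the following minimal sketch concatenates one affine transform per regression class into a single vector and then joins the vectors obtained from the female and male UBMs. It does not show MAPLR estimation itself: the transforms are taken as given. The function names, the `(A, b)` representation of each transform, and the row-major `[A | b]` layout are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def transform_supervector(transforms):
    """Concatenate per-class affine transforms (A, b) into one vector.

    `transforms` is a list with one (A, b) pair per regression class,
    where A has shape (d, d) and b has shape (d,). Hypothetical layout.
    """
    parts = []
    for A, b in transforms:
        # Append the bias as an extra column: W = [A | b], shape (d, d + 1).
        W = np.hstack([A, b.reshape(-1, 1)])
        # Flatten row by row (assumed ordering, any fixed ordering works).
        parts.append(W.ravel())
    return np.concatenate(parts)


def combined_supervector(female_transforms, male_transforms):
    """Join the supervectors derived from the female and male UBMs."""
    return np.concatenate([
        transform_supervector(female_transforms),
        transform_supervector(male_transforms),
    ])


# Toy example: feature dimension 3, two regression classes per UBM.
d, n_classes = 3, 2
female = [(np.eye(d), np.zeros(d)) for _ in range(n_classes)]
male = [(2.0 * np.eye(d), np.ones(d)) for _ in range(n_classes)]
sv = combined_supervector(female, male)
```

With feature dimension d and C regression classes, each UBM contributes C * d * (d + 1) values, so the combined vector in this toy case has 2 * 2 * 3 * 4 = 48 entries. Such a vector would then be passed to the SVM as the feature for one utterance.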
-
-