Citation: WANG Wenchao, XU Ji, YAN Yonghong, “Identity Vector Extraction Using Shared Mixture of PLDA for Short-Time Speaker Recognition,” Chinese Journal of Electronics, vol. 28, no. 2, pp. 357-363, 2019, doi: 10.1049/cje.2018.06.005
N. Dehak, P.J. Kenny, R. Dehak, et al., “Front-end factor analysis for speaker verification”, IEEE Transactions on Audio, Speech, and Language Processing, Vol.19, No.4, pp.788-798, 2011.
S.J.D. Prince and J.H. Elder, “Probabilistic linear discriminant analysis for inferences about identity”, Proc. of IEEE 11th International Conference on Computer Vision, pp.1-8, 2007.
D.A. Reynolds, T.F. Quatieri and R.B. Dunn, “Speaker verification using adapted Gaussian mixture models”, Digital Signal Processing, Vol.10, No.1, pp.19-41, 2000.
P. Kenny, “Joint factor analysis of speaker and session variability: Theory and algorithms”, CRIM, Report, CRIM-06/08-13, 2005.
N. Dehak, Z.N. Karam, D.A. Reynolds, et al., “A channel-blind system for speaker verification”, Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing, Prague, Czech Republic, pp.4536-4539, 2011.
P. Kenny, “Bayesian speaker verification with heavy-tailed priors”, Proc. of The Speaker and Language Recognition Workshop, Brno, Czech Republic, 2010.
L.T. Xu, Z. Yang and L. Sun, “Simplification of I-Vector Extraction for Speaker Identification”, Chinese Journal of Electronics, Vol.25, No.6, pp.1121-1126, 2016.
Y.F. Xu, H. Yang, L. Yang, et al., “A general Bayesian model for speaker verification”, Chinese Journal of Electronics, Vol.25, No.6, pp.1045-1051, 2016.
Y. Lei, N. Scheffer, L. Ferrer, et al., “A novel scheme for speaker recognition using a phonetically-aware deep neural network”, Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing, Florence, Italy, pp.1695-1699, 2014.
P. Kenny, T. Stafylakis, P. Ouellet, et al., “PLDA for speaker verification with utterances of arbitrary duration”, Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, Canada, pp.7649-7653, 2013.
S. Cumani, O. Plchot and P. Laface, “Probabilistic linear discriminant analysis of i-vector posterior distributions”, Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, Canada, pp.7644-7648, 2013.
S. Cumani, “Fast scoring of full posterior PLDA models”, IEEE Transactions on Audio, Speech, and Language Processing, Vol.23, No.11, pp.2036-2045, 2015.
S. Cumani, O. Plchot and P. Laface, “On the use of i-vector posterior distributions in probabilistic linear discriminant analysis”, IEEE Transactions on Audio, Speech, and Language Processing, Vol.22, No.4, pp.846-857, 2014.
Q.Y. Hong, L. Li, M. Li, et al., “Modified-prior PLDA and score calibration for duration mismatch compensation in speaker recognition system”, Proc. of Conference of the International Speech Communication Association (INTERSPEECH), Dresden, Germany, pp.1037-1041, 2015.
M.I. Mandasari, R. Saeidi, M. McLaren, et al., “Quality measure functions for calibration of speaker recognition systems in various duration conditions”, IEEE Transactions on Audio, Speech, and Language Processing, Vol.21, No.11, pp.2425-2438, 2013.
M.I. Mandasari, R. Saeidi and D.A. van Leeuwen, “Quality measures based calibration with duration and noise dependency for speaker recognition”, Speech Communication, Vol.72, pp.126-137, 2015.
Z. Ghahramani and G.E. Hinton, “The EM algorithm for mixtures of factor analyzers”, Technical Report, CRG-TR-96-1, 1996.
M. Senoussaoui, P. Kenny, N. Brummer, et al., “Mixture of PLDA models in i-vector space for gender-independent speaker recognition”, Proc. of Conference of the International Speech Communication Association (INTERSPEECH), Florence, Italy, pp.25-28, 2011.
A.P. Dempster, N.M. Laird and D.B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm”, Journal of the Royal Statistical Society, Series B, Vol.39, No.1, pp.1-38, 1977.
D. Garcia-Romero and C.Y. Espy-Wilson, “Analysis of i-vector length normalization in speaker recognition systems”, Proc. of Conference of the International Speech Communication Association (INTERSPEECH), Florence, Italy, pp.249-252, 2011.