OU Shifeng, SONG Peng, GAO Ying. Soft Decision Based Gaussian-Laplacian Combination Model for Noisy Speech Enhancement[J]. Chinese Journal of Electronics, 2018, 27(4): 827-834. doi: 10.1049/cje.2018.05.015

Soft Decision Based Gaussian-Laplacian Combination Model for Noisy Speech Enhancement

doi: 10.1049/cje.2018.05.015
Funds:  This work is supported by the National Natural Science Foundation of China (No.61703360, No.61005021, No.61201457), Shandong Province Higher Educational Science and Technology Program (No.J12LN27), and the Natural Science Foundation of Shandong Province (No.ZR2017MF008, No.ZR2014FQ016).
  • Corresponding author: GAO Ying was born in Liaoning Province, China, in 1978. She received the Ph.D. degree in geodetection and information technology from Jilin University, China, in 2008. She is currently an associate professor at Yantai University, China. Her research interests focus on signal and information processing. (Email: claragaoying@126.com)
  • Received Date: 2015-03-20
  • Rev Recd Date: 2015-07-21
  • Publish Date: 2018-07-10
  • A key issue in noisy speech enhancement is choosing statistical distributions that model the clean speech and noise signals accurately. Most existing algorithms assume a single model in the transform domain, an assumption that has been shown to contradict the actual data. To address this problem, the statistical properties of clean speech and of several noise signals are analyzed using real data in the Discrete cosine transform (DCT) domain. The study indicates that the statistics of clean speech DCT coefficients tend to fall somewhere between the Gaussian and Laplacian distributions. Based on these results, a novel speech enhancement algorithm is proposed using a Gaussian-Laplacian combination model, whose core is a linear combination of Gaussian and Laplacian distributions that models the statistics of clean speech DCT coefficients. The weight of each distribution in the combination model is adaptively adjusted according to the probability of the corresponding hypothesis, which is estimated with a soft decision technique based on Bayes' theorem. In a number of objective and subjective tests, we compare the proposed algorithm with other recent model-based approaches and find that it outperforms them in all tested environments.
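
The combination idea in the abstract can be illustrated with a minimal sketch. This is not the authors' estimator: it assumes a zero-mean coefficient with a known standard deviation, equal prior probabilities for the two hypotheses by default, and it evaluates the posterior directly on a coefficient value rather than on noisy observations. All function and parameter names are hypothetical.

```python
import math


def gaussian_logpdf(x, sigma):
    """Log density of a zero-mean Gaussian with standard deviation sigma."""
    return -x * x / (2.0 * sigma * sigma) - math.log(sigma * math.sqrt(2.0 * math.pi))


def laplacian_logpdf(x, sigma):
    """Log density of a zero-mean Laplacian with the same standard deviation."""
    b = sigma / math.sqrt(2.0)  # a Laplacian with scale b has variance 2 * b**2
    return -abs(x) / b - math.log(2.0 * b)


def hypothesis_posteriors(x, sigma, prior_gaussian=0.5):
    """Posterior probability of the Gaussian and Laplacian hypotheses (Bayes' theorem).

    Assumption: the prior is supplied by the caller and lies strictly in (0, 1).
    """
    logit = (math.log(prior_gaussian) + gaussian_logpdf(x, sigma)
             - math.log(1.0 - prior_gaussian) - laplacian_logpdf(x, sigma))
    # Numerically stable logistic function of the log posterior odds.
    if logit >= 0.0:
        p_gaussian = 1.0 / (1.0 + math.exp(-logit))
    else:
        e = math.exp(logit)
        p_gaussian = e / (1.0 + e)
    return p_gaussian, 1.0 - p_gaussian


def combination_pdf(x, sigma, w_gaussian):
    """Linear combination of the two densities with weights w and 1 - w."""
    return (w_gaussian * math.exp(gaussian_logpdf(x, sigma))
            + (1.0 - w_gaussian) * math.exp(laplacian_logpdf(x, sigma)))


if __name__ == "__main__":
    sigma = 1.0
    for x in (0.0, 1.5, 5.0):
        w_g, w_l = hypothesis_posteriors(x, sigma)
        print(x, round(w_g, 4), round(w_l, 4), combination_pdf(x, sigma, w_g))
```

Under these assumptions the Laplacian hypothesis receives more weight near zero and in the tails, where its density exceeds the Gaussian one, while the Gaussian hypothesis is favored at moderate magnitudes. The posteriors are computed from log densities so that very large coefficient values do not cause a division by zero.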