Robust Feature Extraction for Speech Recognition Based on Perceptually Motivated MUSIC and CCBC
Abstract
A novel feature extraction algorithm is proposed to improve the robustness of speech recognition. Its core is the incorporation of perceptual information into the multiple signal classification (MUSIC) spectrum, which provides improved robustness and computational efficiency compared with the Mel-frequency cepstral coefficient (MFCC) technique; cepstral coefficients are then extracted from this spectrum as feature parameters. The effectiveness of the parameters is examined in terms of class separability and speaker variability. To further improve robustness, canonical correlation based compensation (CCBC) is incorporated to cope with the mismatch between the training and test sets. The technique is evaluated with improved back-propagation neural networks (BPNN) on three tasks: different speakers, different recording channels, and different noisy environments. The experimental results show that the novel feature is more robust and effective than MFCC, and that the CCBC algorithm makes the speech recognition system robust to all three kinds of mismatch.
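As background for the MUSIC spectrum mentioned above, the sketch below shows a plain (non-perceptual) MUSIC pseudospectrum estimator for a single signal frame. It is a minimal illustration under assumed parameters (correlation-matrix size, frequency grid, and model order are all hypothetical choices, not taken from the paper), and it omits the perceptual weighting and cepstrum extraction that the proposed algorithm adds on top.

```python
import numpy as np

def music_pseudospectrum(x, order, dim=16, n_freqs=256):
    """MUSIC pseudospectrum of a real 1-D signal frame.

    order   : assumed number of real sinusoidal components
              (signal-subspace dimension is 2*order)
    dim     : size of the correlation matrix (assumption: 16)
    n_freqs : points on the frequency grid over [0, pi)
    """
    # Estimate the correlation matrix from overlapping snapshots.
    n_snap = len(x) - dim + 1
    X = np.stack([x[i:i + dim] for i in range(n_snap)])
    R = X.T @ X / n_snap
    # Eigendecomposition of the symmetric matrix (eigenvalues ascending).
    w, V = np.linalg.eigh(R)
    # Noise subspace: eigenvectors of the smallest eigenvalues.
    En = V[:, : dim - 2 * order]
    # Pseudospectrum: reciprocal of the steering vector's projection
    # onto the noise subspace; peaks mark the sinusoid frequencies.
    freqs = np.linspace(0.0, np.pi, n_freqs, endpoint=False)
    k = np.arange(dim)
    A = np.exp(-1j * np.outer(k, freqs))  # steering vectors, one per grid point
    denom = np.sum(np.abs(En.conj().T @ A) ** 2, axis=0)
    return freqs, 1.0 / denom

# Synthetic check: two sinusoids in white noise should yield
# pseudospectrum peaks near 0.6 and 1.9 rad/sample.
rng = np.random.default_rng(0)
n = np.arange(512)
x = np.cos(0.6 * n) + 0.8 * np.cos(1.9 * n) + 0.1 * rng.standard_normal(512)
freqs, P = music_pseudospectrum(x, order=2)
```

In the paper's pipeline, a perceptually weighted variant of this spectrum would replace the Mel filter-bank stage, with cepstral coefficients then computed from its logarithm.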