EURASIP Journal on Applied Signal Processing
Volume 2003 (2003), Issue 8, Pages 814-823
doi:10.1155/S1110865703302070
Abstract
Limiting the decrease in performance due to acoustic
environment changes remains a major challenge for continuous
speech recognition (CSR) systems. We propose a novel approach
which combines the Karhunen-Loève transform (KLT) in the
mel-frequency domain with a genetic algorithm (GA) to enhance the data
representing corrupted speech. The idea is to project
noisy speech parameters onto the space spanned by the
genetically optimized principal axes obtained from the KLT. The
enhanced parameters increase the recognition rate for highly
interfering noise environments. The proposed hybrid technique,
when included in the front-end of an HTK-based CSR system,
outperforms the conventional recognition process in severe
interfering car noise environments for a wide range of
signal-to-noise ratios (SNRs) varying from 16 dB to
−4 dB. We also show the effectiveness of the KLT-GA
method in recognizing speech subject to telephone channel
degradations.
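
The projection step described in the abstract can be illustrated with a minimal sketch. It covers only the plain KLT part: the genetic optimization of the axes is not reproduced here. The function names `klt_axes` and `project`, the synthetic 12-dimensional vectors standing in for mel-frequency parameters, the choice to estimate the axes from clean data, and the number of retained axes `k` are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def klt_axes(features):
    """Estimate the KLT basis of a (frames x dims) feature matrix.

    Returns the mean vector, the principal axes as columns sorted by
    decreasing variance, and the corresponding variances.
    """
    mean = features.mean(axis=0)
    cov = np.cov(features - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]        # reorder to descending variance
    return mean, eigvecs[:, order], eigvals[order]

def project(features, mean, axes, k):
    """Project features onto the first k axes and map back to feature space."""
    basis = axes[:, :k]
    return (features - mean) @ basis @ basis.T + mean

# Synthetic stand-in data (hypothetical): clean vectors lie in a
# 3-dimensional subspace of a 12-dimensional feature space.
rng = np.random.default_rng(0)
clean = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 12))
noisy = clean + 0.1 * rng.normal(size=clean.shape)

mean, axes, variances = klt_axes(clean)
enhanced = project(noisy, mean, axes, k=3)
```

In this toy setting the projection discards the noise components that fall outside the retained subspace, so `enhanced` is closer to `clean` than `noisy` is. In the approach summarized above, a GA would additionally adjust the axes rather than using the KLT axes directly.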