Copyright © 2002 Elsevier Science Ltd. All rights reserved.
Regular Article
Hidden Markov model training with contaminated speech material for distant-talking speech recognition
Received 9 November 2000;
Abstract
A challenging scenario is addressed in which a distant-talking speech recognizer operates in a noisy office environment with model adaptation. The use of a single far microphone and of a microphone array input is investigated.
In addition to the benefits from the application of microphone array processing, system robustness is improved by training hidden Markov models (HMMs) with a contaminated version of a clean corpus. This artificial corpus is produced by exploiting information extracted from “real world” acoustic scenarios. The resulting models are then used as a starting point for unsupervised incremental adaptation.
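The abstract does not say how the clean corpus is contaminated. As a minimal sketch only, assuming that the information taken from the real acoustic scenario is a measured room impulse response and a recording of background noise, one common way to build such a corpus is to convolve each clean utterance with the impulse response and add noise scaled to a target signal-to-noise ratio. The function names, the toy impulse response and the 10 dB SNR below are hypothetical and are not taken from the article.

```python
import math
import random


def convolve(signal, impulse_response):
    """Full linear convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def power(samples):
    """Mean squared amplitude of a sample sequence."""
    return sum(x * x for x in samples) / len(samples)


def contaminate(clean, impulse_response, noise, snr_db):
    """Reverberate `clean`, then add `noise` scaled to `snr_db` (in dB).

    Assumption: contamination = convolution with a room impulse response
    plus additive background noise. The source abstract does not confirm
    that this is the exact procedure used.
    """
    reverberant = convolve(clean, impulse_response)
    if len(noise) < len(reverberant):
        raise ValueError("noise is shorter than the reverberant signal")
    noise = noise[:len(reverberant)]
    noise_power = power(noise)
    if noise_power == 0.0:
        raise ValueError("noise has zero power and cannot be scaled")
    # Choose the gain so that 10*log10(P_speech / P_scaled_noise) == snr_db.
    gain = math.sqrt(power(reverberant) / (noise_power * 10 ** (snr_db / 10)))
    return [r + gain * n for r, n in zip(reverberant, noise)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in for a clean utterance: 0.1 s of a 440 Hz tone at 8 kHz.
    clean = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(800)]
    rir = [1.0, 0.0, 0.0, 0.5, 0.0, 0.25]  # toy impulse response
    noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    noisy = contaminate(clean, rir, noise, snr_db=10.0)
    print(len(noisy), "contaminated samples")
```

In a sketch like this, the contaminated utterances would replace the clean ones as HMM training material, while the transcriptions stay unchanged.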
Experimental results show that improvements in recognition accuracy due to multiple microphones, HMM training on contaminated speech and incremental adaptation are additive on a connected digits task. Moreover, the results show that unsupervised incremental adaptation benefits from starting with models trained on contaminated speech. A final contribution of the paper concerns the influence of the accuracy of speech activity detection, which seems to be relevant when moving towards real applications.










