Computer Speech & Language
Volume 16, Issue 2, April 2002, Pages 205-223
 
doi:10.1006/csla.2002.0191
Copyright © 2002 Elsevier Science Ltd. All rights reserved.

Regular Article

Hidden Markov model training with contaminated speech material for distant-talking speech recognition

Marco Matassoni, Maurizio Omologo, Diego Giuliani and Piergiorgio Svaizer

ITC-irst—Centro per la Ricerca Scientifica e Tecnologica, 38050 Povo di Trento, Italy

Received 9 November 2000; accepted 11 February 2002. Available online 13 June 2002.


Abstract

A challenging scenario is addressed in which a distant-talking speech recognizer with model adaptation operates in a noisy office environment. Both a single far microphone input and a microphone array input are investigated.

In addition to the benefits from the application of microphone array processing, system robustness is improved by training hidden Markov models (HMMs) with a contaminated version of a clean corpus. This artificial corpus is produced by exploiting information extracted from “real world” acoustic scenarios. The resulting models are then used as a starting point for unsupervised incremental adaptation.
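
The abstract does not spell out how the clean corpus is contaminated. A common way to build such a corpus, assumed here purely for illustration, is to convolve each clean utterance with a room impulse response measured in the target environment and then add recorded background noise at a chosen signal-to-noise ratio. The sketch below follows that assumption; the function name `contaminate` and its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def contaminate(clean, impulse_response, noise, snr_db):
    """Illustrative contamination of a clean utterance (assumed procedure).

    clean            : 1-D array, clean close-talking speech samples
    impulse_response : 1-D array, room impulse response (assumed measured)
    noise            : 1-D array, background noise (assumed recorded, non-silent)
    snr_db           : target SNR in dB between reverberant speech and noise
    """
    clean = np.asarray(clean, dtype=float)
    impulse_response = np.asarray(impulse_response, dtype=float)
    noise = np.asarray(noise, dtype=float)

    # Simulate the talker-to-microphone path; keep the original length.
    reverberant = np.convolve(clean, impulse_response)[: len(clean)]

    # Repeat or trim the noise so it covers the whole utterance.
    repeats = int(np.ceil(len(reverberant) / len(noise)))
    noise = np.tile(noise, repeats)[: len(reverberant)]

    # Scale the noise so that the speech-to-noise power ratio equals snr_db.
    signal_power = np.mean(reverberant ** 2)
    noise_power = np.mean(noise ** 2)
    gain = np.sqrt(signal_power / (noise_power * 10.0 ** (snr_db / 10.0)))

    return reverberant + gain * noise
```

Applying such a function to every utterance of a clean corpus, with impulse responses and noise taken from the office of interest, would yield an artificial training set of the kind described above. The actual contamination procedure and its parameters are those given in the full article.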

Experimental results show that the improvements in recognition accuracy due to multiple microphones, HMM training on contaminated speech and incremental adaptation are additive on a connected digits task. Moreover, the results show that unsupervised incremental adaptation benefits from starting with models trained on contaminated speech. A final contribution of the paper concerns the influence of speech activity detection accuracy, which appears to be relevant when moving towards real applications.


 