IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences 2005 E88-A(7):1724-1731; doi:10.1093/ietfec/e88-a.7.1724

Copyright © 2005 The Institute of Electronics, Information and Communication Engineers

Special Section on Multi-channel Acoustic Signal Processing -- Papers -- Speech Enhancement

Harmonicity Based Dereverberation for Improving Automatic Speech Recognition Performance and Speech Intelligibility

Keisuke KINOSHITA1, Tomohiro NAKATANI1 and Masato MIYOSHI1

1 The authors are with the NTT Communication Science Laboratories, NTT Corporation, Kyoto-fu, 619-0237 Japan. E-mail: kinoshita@cslab.kecl.ntt.co.jp, nak@cslab.kecl.ntt.co.jp, miyo@cslab.kecl.ntt.co.jp

A speech signal captured by a distant microphone is generally smeared by reverberation, which severely degrades both speech intelligibility and Automatic Speech Recognition (ASR) performance. Previously, we proposed a single-microphone dereverberation method named "Harmonicity based dEReverBeration (HERB)." HERB estimates the inverse filter for an unknown room transfer function by exploiting an essential feature of speech, namely its harmonic structure. In previous studies, improvements in speech intelligibility were shown solely with spectrograms, and improvements in ASR performance were confirmed only with a matched-condition acoustic model. In this paper, we further investigate HERB's potential with regard to these two factors. First, we examined speech intelligibility by means of objective indices and found that HERB is capable of improving speech intelligibility to approximately that of clean speech. Second, since HERB alone could not improve ASR performance sufficiently, we analyzed the HERB mechanism with a view to achieving further improvements. Taking the analysis results into account, we proposed an appropriate ASR configuration and conducted experiments. The experimental results confirmed that, if HERB is used with an ASR adaptation scheme such as MLLR and a multicondition acoustic model, it is very effective for improving ASR performance even in unknown, severely reverberant environments.

Key Words: dereverberation, speech harmonicity, automatic speech recognition, speech intelligibility


Manuscript received October 29, 2004. Manuscript revised January 20, 2005. Final manuscript received March 11, 2005.

