Speech Communication
Volume 49, Issue 1, January 2007, Pages 59-70
 
doi:10.1016/j.specom.2006.10.006
Copyright © 2006 Elsevier B.V. All rights reserved.

Acoustic model adaptation based on pronunciation variability analysis for non-native speech recognition

Yoo Rhee Oh (a), Jae Sam Yoon (a) and Hong Kook Kim (a, corresponding author)

(a) Department of Information and Communications, Gwangju Institute of Science and Technology (GIST), 1 Oryong-dong, Buk-gu, Gwangju 500-712, Republic of Korea

Received 2 June 2006; revised 19 October 2006; accepted 23 October 2006. Available online 16 November 2006.


Abstract

In this paper, pronunciation variability between native and non-native speakers is investigated, and a novel acoustic model adaptation method based on pronunciation variability analysis is proposed in order to improve the performance of a speech recognition system for non-native speakers. The proposed method is performed in two steps: analysis of the pronunciation variability of non-native speech, and acoustic model adaptation based on that analysis. In order to obtain informative variant phonetic units, we analyze the pronunciation variability of non-native speech in two ways: a knowledge-based approach and a data-driven approach. Next, for each approach, the acoustic model corresponding to each informative variant phonetic unit is adapted such that the state-tying of the acoustic model for non-native speech reflects the phonetic variability. For further improvement, a conventional acoustic model adaptation method such as MLLR and/or MAP is combined with the proposed method. Continuous Korean–English speech recognition experiments show that the proposed method achieves an average word error rate reduction of 16.76% and 12.80% for the knowledge-based approach and the data-driven approach, respectively, when compared with the baseline speech recognition system trained on native speech. Moreover, applying MLLR and MAP adaptations to the acoustic models adapted by the proposed method yields a reduction in the average word error rate of 53.45% and 57.14% for the knowledge-based approach and the data-driven approach, respectively.
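
As a rough illustration of the metric behind these figures, the sketch below computes word error rate (WER) as the word-level edit distance divided by the reference length, and then the relative reduction between two systems. The function names and example sentences are hypothetical, and reading the reported percentages as relative reductions is an assumption: the abstract does not state whether they are relative or absolute.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Relative WER reduction in percent (assumed interpretation)."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - adapted_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical transcripts, not taken from the paper's database.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {wer:.3f}")
    # Hypothetical baseline and adapted WERs, in percent.
    print(f"Relative reduction: {relative_reduction(20.0, 15.0):.2f}%")
```

Under this reading, a system that lowers WER from 20% to 15% shows a 25% relative reduction, even though the absolute drop is only five points.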

Keywords: Speech recognition; Non-native speech; Knowledge-based pronunciation variability; Data-driven pronunciation variability; State-tying; State-clustering; Decision tree; Acoustic model adaptation

Article Outline

1. Introduction
2. Effect of non-native speech
2.1. Baseline English ASR
2.2. English speech database spoken by Koreans
2.3. Effect of native and non-native speech on the performance of the baseline ASR system
3. Acoustic model adaptation for non-native speech recognition
3.1. Analysis of the pronunciation variability
3.1.1. Knowledge-based approach
3.1.2. Data-driven approach
3.2. Proposed acoustic model adaptation for non-native speech recognition
4. Experiments
4.1. Pronunciation modeling adaptation
4.2. Acoustic modeling approach
4.2.1. Retraining method
4.2.2. Proposed acoustic model adaptation
4.2.3. Combining with MLLR/MAP
5. Conclusion
Acknowledgements
References





