Speech Communication
Volume 34, Issues 1-2, April 2001, Pages 93-114
 
doi:10.1016/S0167-6393(00)00048-0
Copyright © 2001 Elsevier Science B.V. All rights reserved.

Time and frequency filtering of filter-bank energies for robust HMM speech recognition

Climent Nadeu (corresponding author), Dušan Macho and Javier Hernando

TALP Research Center, Department of Signal Theory and Communications, Universitat Politècnica de Catalunya, J. Girona 1-3, Campus Nord, Edifici D5, E-08034 Barcelona, Spain

Available online 14 February 2001.


Abstract

Every speech recognition system requires a signal representation that parametrically models the temporal evolution of the speech spectral envelope. Current parameterizations involve, either explicitly or implicitly, a set of energies from frequency bands that are often distributed on a mel scale. Those energies are computed in diverse ways, but the computation always includes smoothing of basic spectral measurements and non-linear amplitude compression. Several linear transformations are then applied to the two-dimensional time-frequency sequence of energies before it enters the HMM pattern-matching stage. This paper presents a recently introduced technique that filters that sequence of energies along the frequency dimension, and compares the resulting parameters with the widely used cepstral coefficients. That frequency filtering transformation is then considered jointly with the time filtering transformation used to compute dynamic parameters, showing that the flexibility of this combined approach (tiffing) can be used to design a robust set of filters. Results of recognition experiments are reported that show the potential of tiffing for enhanced and more robust HMM speech recognition.

Author Keywords: Robust speech recognition; Time and frequency filtering; Modulation spectrum; Filter-bank energies
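
The processing chain summarized in the abstract (compressed filter-bank energies, a filter along the frequency axis, then a filter along the time axis) can be illustrated with a minimal sketch. The abstract does not give filter coefficients, so everything specific here is an assumption made for illustration: the three-tap frequency filter, the zero padding at the band edges, the regression-style time filter with edge replication, and the function names `frequency_filter`, `time_filter` and `tiff`. It should not be read as the authors' actual design.

```python
def frequency_filter(log_fbe_frame, taps=(-1.0, 0.0, 1.0)):
    """Filter one frame of log filter-bank energies along the band index.

    With the default taps (an assumption), output[k] = x[k+1] - x[k-1],
    where out-of-range bands are treated as zero.
    """
    n = len(log_fbe_frame)
    half = len(taps) // 2
    padded = [0.0] * half + list(log_fbe_frame) + [0.0] * half
    return [sum(t * padded[k + j] for j, t in enumerate(taps)) for k in range(n)]


def time_filter(frames, half_width=2):
    """Filter each band trajectory along time with a slope-estimating filter.

    This is the usual regression form of dynamic (delta) parameters, used
    here as an assumed example; frames beyond the ends are replicated.
    """
    n_frames = len(frames)
    n_bands = len(frames[0])
    norm = 2 * sum(m * m for m in range(1, half_width + 1))
    out = []
    for t in range(n_frames):
        row = []
        for k in range(n_bands):
            acc = 0.0
            for m in range(1, half_width + 1):
                ahead = frames[min(t + m, n_frames - 1)][k]
                behind = frames[max(t - m, 0)][k]
                acc += m * (ahead - behind)
            row.append(acc / norm)
        out.append(row)
    return out


def tiff(log_fbes, half_width=2):
    """Combine both steps: frequency-filter every frame, then append the
    time-filtered version of those parameters to each frame."""
    static = [frequency_filter(frame) for frame in log_fbes]
    dynamic = time_filter(static, half_width)
    return [s + d for s, d in zip(static, dynamic)]
```

Calling `tiff` on a list of frames, each holding one log energy per band, returns one feature vector per frame with twice as many entries as there are bands: the frequency-filtered parameters followed by their time-filtered counterparts.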

Article Outline

1. Introduction
2. Non-linearly compressed filter-bank energies
2.1. Spectral smoothing
2.2. Quasi-optimality of frequency averaging
2.3. Non-linear compression
2.4. Compressed FBEs assumed in this work
3. Linear transformation of the parameter vector
3.1. Disadvantages of cepstral coefficients for speech recognition
3.2. The frequency filtering technique
3.3. FF and decorrelation of FBEs
3.4. FF and discriminative liftering
3.5. Recognition tests with static parameters
3.5.1. Clean speech tests
3.5.2. Noisy speech tests
3.5.3. Conclusion
3.6. Alternative combination of FF and non-linearity
4. Temporal filtering
4.1. Modulation spectrum analysis
4.2. Temporal filters for robust speech recognition
5. Tiffing (time and frequency filtering)
5.1. The two-dimensional modulation spectrum (2D-MS)
5.2. 2D-MS-assisted design of the time and frequency filters for robust speech recognition
5.3. Tiffing versus cepstral-time matrices
5.4. Recognition tests with the Aurora database and recognition setup
5.5. Conclusion: advantage of time and frequency filtering
6. Optimal transformations of the whole set of features: PCA and LDA
7. Conclusions
Acknowledgements
References