Speech Communication
Volume 49, Issue 3, March 2007, Pages 159-176

doi:10.1016/j.specom.2006.12.004
Copyright © 2007 Elsevier B.V. All rights reserved.

Chirp group delay analysis of speech signals

Baris Bozkurt (a, corresponding author), Laurent Couvreur (a) and Thierry Dutoit (a)

(a) TCTS Lab., Faculté Polytechnique De Mons, Initialis Scientific Parc, B-7000 Mons, Belgium

Received 13 December 2005; 
revised 19 December 2006; 
accepted 20 December 2006. 
Available online 30 December 2006.

Abstract

This study proposes new group delay estimation techniques for analyzing the resonance patterns of short-term discrete-time signals, and more specifically speech signals. Phase processing, or equivalently group delay processing, of speech signals is known to be difficult because large spikes in the phase/group delay functions mask the formant structure. We first analyze in detail the zero patterns of the z-transform of short-term speech signals in the z-plane and discuss the source of the spikes in group delay functions, namely zeros located close to the unit circle. We show that windowing strongly influences these patterns, and therefore short-term phase processing. Through a systematic study, we then show that reliable phase/group delay estimation for speech signals can be achieved by appropriate windowing, and that group delay functions can reveal formant information as well as some characteristics of the glottal flow component of speech signals. However, such phase estimation is highly sensitive to noise, and robust extraction of group delay based parameters remains difficult in real acoustic conditions, even with appropriate windowing. As an alternative, we propose processing chirp group delay functions, i.e. group delay functions computed on a circle other than the unit circle in the z-plane, which can be guaranteed to be spike-free. Finally, we present one application to feature extraction for automatic speech recognition (ASR) and show that chirp group delay representations are potentially useful for improving ASR performance.
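As an illustration of the idea of computing a group delay on a circle other than the unit circle, the following is a minimal sketch and not the authors' implementation. It assumes the usual sign convention (group delay as the negative derivative of the phase with respect to frequency) and uses the identity that evaluating the z-transform of x[n] on a circle of radius rho equals the Fourier transform of x[n]·rho^(-n). The function name `chirp_group_delay`, the parameter `rho`, and the FFT size are hypothetical choices made for this example; the windowing and zero-phase steps listed in the article outline are not reproduced here.

```python
import numpy as np

def chirp_group_delay(x, rho=1.0, n_fft=512):
    """Group delay (in samples) of x evaluated on the circle |z| = rho.

    Hypothetical helper for illustration only. n_fft should be at least
    len(x), otherwise the FFT truncates the frame.
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x))
    # X(rho * e^{jw}) = sum_n (x[n] * rho^{-n}) e^{-jwn}, so scale the frame first.
    y = x * float(rho) ** (-n)
    Y = np.fft.rfft(y, n_fft)        # spectrum of the scaled frame
    N = np.fft.rfft(n * y, n_fft)    # spectrum of n * y[n]
    # -d(arg Y)/dw = Re{N / Y} = Re{N * conj(Y)} / |Y|^2
    power = np.maximum(np.abs(Y) ** 2, np.finfo(float).tiny)
    return np.real(N * np.conj(Y)) / power
```

With `rho=1.0` this reduces to the ordinary group delay on the unit circle. As a sanity check, a pure delay of three samples, `chirp_group_delay([0, 0, 0, 1.0], rho=1.1)`, gives a value of 3 at every frequency bin, whatever the radius.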

Keywords: Group delay processing; Phase processing; Windowing; Spectral analysis; Automatic speech recognition

Article Outline

1. Introduction
1.1. Motivations
1.2. Plan
2. Difficulties in group delay analysis and proposed solutions
2.1. Difficulties in group delay analysis
2.2. Recently proposed methods for group delay analysis
2.2.1. Modified group delay function (MODGDF)
2.2.2. Product spectrum (PS)
2.2.3. Our approach
3. ZZT representation
3.1. Definition
3.2. ZZT and the source–filter model of speech
3.3. ZZT of windowed speech signals
3.3.1. Effect of window location on ZZT patterns and group delay
3.3.2. Effect of window shape on ZZT patterns and group delay
3.3.3. Effect of window size on ZZT patterns and group delay
3.3.4. Group delay spectrogram
3.4. Appropriate windowing for group delay function computation
4. Chirp group delay processing
4.1. Definition
4.2. CGD of speech signals
4.2.1. Chirp group delay of GCI-synchronously windowed speech signals
4.2.2. Chirp group delay of the zero-phase version of speech signals
4.3. Discussion
5. Application to speech recognition
5.1. ASR feature extraction
5.2. ASR experiments
5.3. Discussion
6. Conclusions
Acknowledgements
References