Computer Speech & Language
Volume 20, Issues 2-3, April-July 2006, Pages 128-158
Odyssey 2004: The Speaker and Language Recognition Workshop - Odyssey-04
 
doi:10.1016/j.csl.2005.07.001
Copyright © 2005 Published by Elsevier Ltd.

NIST and NFI-TNO evaluations of automatic speaker recognition

David A. van Leeuwen (a, corresponding author), Alvin F. Martin (b), Mark A. Przybocki (b) and Jos S. Bouten (c)

(a) TNO Human Factors, Postbus 23, 3769 ZG Soesterberg, Utrecht, The Netherlands
(b) National Institute of Standards and Technology, Gaithersburg, USA
(c) Netherlands Forensic Institute, The Hague, The Netherlands

Received 1 November 2004; 
revised 1 June 2005; 
accepted 18 July 2005. 
Available online 15 August 2005.


Abstract

In recent years, several text-independent speaker recognition evaluation campaigns have taken place. This paper reports on the results of the NIST evaluation of 2004 and the NFI-TNO forensic speaker recognition evaluation held in 2003, and reflects on the history of the evaluation campaigns. The effects of speech duration, training handsets, transmission type, and gender mix show the expected behaviour on the detection error trade-off (DET) curves. New results on the influence of language show an interesting dependence of the DET curves on the accent of speakers. We also report on a number of statistical analysis techniques that have recently been introduced in the speaker recognition community, as well as a new application of analysis of deviance. These techniques are used to determine that the two evaluations held in 2003, by NIST and NFI-TNO, differ with statistical significance in their difficulty for the speaker recognition systems.
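
For readers unfamiliar with DET curves, the following is a minimal sketch of how the points on such a curve can be computed from detection scores. It is not the evaluation code used by NIST or NFI-TNO. The function names `det_points` and `probit`, the convention that a trial is accepted when its score is at or above the threshold, and the clipping constant `eps` are all assumptions made for illustration.

```python
from statistics import NormalDist


def det_points(target_scores, nontarget_scores):
    """Return (threshold, p_miss, p_fa) triples, one per distinct score.

    Assumed convention: a trial is accepted when score >= threshold.
    p_miss is the fraction of target trials rejected at that threshold,
    p_fa is the fraction of non-target trials accepted.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    points = []
    for t in thresholds:
        p_miss = sum(s < t for s in target_scores) / len(target_scores)
        p_fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        points.append((t, p_miss, p_fa))
    return points


def probit(p, eps=1e-6):
    """Map a probability to the normal-deviate scale used on DET axes.

    The probability is clipped to [eps, 1 - eps] because the inverse
    normal CDF is undefined at exactly 0 and 1 (eps is an arbitrary choice).
    """
    clipped = min(max(p, eps), 1.0 - eps)
    return NormalDist().inv_cdf(clipped)


# Hypothetical toy scores, for illustration only.
targets = [2.0, 1.5, 0.5]
nontargets = [1.0, 0.0, -1.0]
for threshold, p_miss, p_fa in det_points(targets, nontargets):
    print(threshold, probit(p_miss), probit(p_fa))
```

Plotting `probit(p_fa)` against `probit(p_miss)` over all thresholds gives the DET curve. On these axes, normally distributed scores produce a straight line.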

Article Outline

1. Introduction
2. Evaluation paradigm
3. Statistics
3.1. Basic binomial quantities
3.2. Error propagation
3.3. DET confidence bandwidth
3.4. Comparisons between systems
3.5. Analysis of variance
4. Designs of the NFI-TNO and NIST 2004 evaluations
4.1. NFI-TNO evaluation
4.2. NIST 2004 evaluation
5. Results and analysis of the NFI-TNO and NIST evaluations
5.1. Overall results
5.2. Effect of training and test duration
5.2.1. NFI-TNO effect of duration
5.2.2. NIST 2004 effect of duration
5.3. Effect of number of training handsets
5.3.1. NFI-TNO effect of number of sessions
5.3.2. NIST effect of number of training handsets
5.4. Language effects
5.4.1. NFI-TNO language effects
5.4.2. NIST language effect
5.5. Effect of transmission type for NIST 2004
5.6. Summed channel data for NIST 2004
5.6.1. Gender mix
5.7. Comparison of evaluations
6. Summary and conclusions
Acknowledgements
References






















