Graph Embedding for Speaker Recognition

Karam, Z. N.; Campbell, W. M.

doi:10.1007/978-1-4614-4457-2_10

Chapter
First Online: 01 January 2012

Abstract

This chapter presents applications of graph embedding to the problem of text-independent speaker recognition. Speaker recognition is a general term encompassing multiple applications. At the core is the problem of speaker comparison—given two speech recordings (utterances), produce a score which measures speaker similarity. Using speaker comparison, other applications can be implemented—speaker clustering (grouping similar speakers in a corpus), speaker verification (verifying a claim of identity), speaker identification (identifying a speaker out of a list of potential candidates), and speaker retrieval (finding matches to a query set).
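The relationship described above, where one pairwise comparison score supports several applications, can be sketched as follows. This is only an illustration under stated assumptions, not the chapter's method: each utterance is assumed to be already mapped to a fixed-length vector, `compare` uses plain cosine similarity as a stand-in score, and the function names, thresholds, and the connected-components clustering rule are all hypothetical choices made for the example.

```python
import numpy as np


def compare(u, v):
    """Stand-in speaker comparison score (assumption: cosine similarity)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def verify(claimed, test, threshold=0.5):
    """Speaker verification: accept the identity claim if the score is high enough."""
    return compare(claimed, test) >= threshold


def identify(test, candidates):
    """Speaker identification: pick the best-scoring name from a dict of candidates."""
    return max(candidates, key=lambda name: compare(candidates[name], test))


def retrieve(query, corpus, top_k=2):
    """Speaker retrieval: indices of the top_k corpus utterances closest to the query."""
    scores = [compare(query, x) for x in corpus]
    order = np.argsort(scores)[::-1][:top_k]
    return [int(i) for i in order]


def cluster(corpus, threshold=0.5):
    """Speaker clustering (assumed rule): connected components of the graph
    whose edges join utterance pairs scoring at or above the threshold."""
    n = len(corpus)
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            a = stack.pop()
            for b in range(n):
                if labels[b] == -1 and compare(corpus[a], corpus[b]) >= threshold:
                    labels[b] = next_label
                    stack.append(b)
        next_label += 1
    return labels


# Toy data: two utterances per hypothetical speaker.
corpus = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(cluster(corpus))
print(retrieve([1.0, 0.0], corpus))
```

Every application here calls only `compare`, so swapping in a different score changes all four behaviors without touching their logic.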

Acknowledgements

This work was sponsored by the Department of Defense under Air Force Contract FA8721-05-C-0002. Opinions, interpretations, conclusions, and recommendations are those of the authors and are not necessarily endorsed by the United States Government.

Author information

Authors and Affiliations

University of Michigan, Ann Arbor, MI, USA
Z. N. Karam
MIT Lincoln Laboratory, Lexington, Massachusetts, 02420, USA
W. M. Campbell

Corresponding author

Correspondence to Z. N. Karam.

Editor information

Editors and Affiliations

Dept. of ECE, College of Engineering, Northeastern University, 403 Dana Research Center, 360 Huntington Ave, Boston, 02115, Massachusetts, USA
Yun Fu
Honeywell, Douglas Drive North 1985, Golden Valley, 55422, USA
Yunqian Ma

About this chapter

Cite this chapter

Karam, Z.N., Campbell, W.M. (2013). Graph Embedding for Speaker Recognition. In: Fu, Y., Ma, Y. (eds) Graph Embedding for Pattern Analysis. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-4457-2_10

DOI: https://doi.org/10.1007/978-1-4614-4457-2_10
Published: 21 September 2012
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-4456-5
Online ISBN: 978-1-4614-4457-2
eBook Packages: Engineering, Engineering (R0)
