Temporal Information in a Binary Framework for Speaker Recognition

Hernández-Sierra, Gabriel; Calvo, José R.; Bonastre, Jean-François

doi:10.1007/978-3-319-12568-8_26

Gabriel Hernández-Sierra¹⁷,¹⁸, José R. Calvo¹⁷ & Jean-François Bonastre¹⁸

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8827)

Included in the following conference series:

Iberoamerican Congress on Pattern Recognition

Abstract

In recent years, a simple representation of a speech excerpt as a binary matrix has been proposed, giving easy access to speaker-discriminant information. Beyond its time-related capabilities, this representation allows a system to work with temporal information derived from the sequential changes present in the binary representation. A new type of temporal information is proposed for inclusion in speaker recognition systems. A new specificity selection approach, which uses a mask in the cumulative vector space, is also proposed; it aims to increase the effectiveness of the speaker binary key paradigm. Experimental validation on the NIST-SRE framework demonstrates the efficiency of the proposed solutions, which yield an EER improvement of 7%. Combining the i-vector and binary approaches using the proposed methods shows that the discriminative information exploited by each is complementary.
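
The abstract does not define its terms precisely, so the following is only an illustrative sketch of the kinds of quantities it mentions, not the authors' method. It assumes a binary matrix with one row per frame and one column per specificity, a cumulative vector obtained by summing each column, a mask that keeps the k most active columns, and a simple temporal cue that counts bit changes between consecutive frames. All function names, the top-k selection rule, and the toy data are hypothetical.

```python
from typing import List

# Assumption: rows are time frames, columns are specificities (0 or 1).
BinaryMatrix = List[List[int]]


def cumulative_vector(matrix: BinaryMatrix) -> List[int]:
    """Count, per column, how many frames activate that specificity."""
    return [sum(column) for column in zip(*matrix)]


def top_k_mask(cumulative: List[int], k: int) -> List[int]:
    """Hypothetical selection rule: keep the k most active columns.

    Ties are broken by the lower column index so the result is deterministic.
    """
    ranked = sorted(range(len(cumulative)), key=lambda i: (-cumulative[i], i))
    keep = set(ranked[:k])
    return [1 if i in keep else 0 for i in range(len(cumulative))]


def transition_counts(matrix: BinaryMatrix) -> List[int]:
    """Count, per column, the bit changes between consecutive frames."""
    return [
        sum(a != b for a, b in zip(column, column[1:]))
        for column in zip(*matrix)
    ]


# Toy excerpt: 4 frames, 4 specificities.
frames = [
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 1, 0, 0],
]

cum = cumulative_vector(frames)      # [3, 3, 0, 2]
mask = top_k_mask(cum, k=2)          # [1, 1, 0, 0]
changes = transition_counts(frames)  # [1, 1, 0, 3]
```

In this toy example the last column is less active than the first two but changes state most often, which hints at how a change-based temporal cue could carry information that a purely cumulative count does not.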

Author information

Authors and Affiliations

Advanced Technologies Application Center, Havana, Cuba
Gabriel Hernández-Sierra & José R. Calvo
LIA, University of Avignon, France
Gabriel Hernández-Sierra & Jean-François Bonastre

Editor information

Editors and Affiliations

Department of Electrical Engineering and Computer Science, CINVESTAV, Guadalajara, Jalisco, México
Eduardo Bayro-Corrochano
Department of Computer Science, University of York, YO10 5GH, Deramore Lane, York, UK
Edwin Hancock

Cite this paper

Hernández-Sierra, G., Calvo, J.R., Bonastre, JF. (2014). Temporal Information in a Binary Framework for Speaker Recognition. In: Bayro-Corrochano, E., Hancock, E. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2014. Lecture Notes in Computer Science, vol 8827. Springer, Cham. https://doi.org/10.1007/978-3-319-12568-8_26

DOI: https://doi.org/10.1007/978-3-319-12568-8_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12567-1
Online ISBN: 978-3-319-12568-8
eBook Packages: Computer Science, Computer Science (R0)
