Abstract
In recent years, a simple representation of a speech excerpt has been proposed: a binary matrix that gives easy access to speaker-discriminant information. Beyond its time-related capabilities, this representation lets the system work with temporal information based on the sequential changes present in the binary representation. A new type of temporal information is proposed for addition to speaker recognition systems. A new specificity selection approach, which uses a mask in the cumulative vector space, is also proposed. The aim is to increase effectiveness in the speaker binary key paradigm. Experimental validation on the NIST-SRE framework demonstrates the efficiency of the proposed solutions, with an EER improvement of 7%. Combining the i-vector and binary approaches, using the proposed methods, shows that the discriminative information exploited by each is complementary.
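To make the terms in the abstract concrete, the following is a minimal sketch, not the authors' method. It assumes a binary matrix with one row per frame and one column per binary component, takes the cumulative vector to be the per-component sum over frames, builds a hypothetical top-k mask over that vector, and counts frame-to-frame flips as one possible reading of "sequential changes". All function names, the toy data, and the top-k selection rule are illustrative assumptions.

```python
from typing import List

# Assumed layout: rows are frames, columns are binary components.
BinaryMatrix = List[List[int]]


def cumulative_vector(matrix: BinaryMatrix) -> List[int]:
    """Count, per component, how many frames activate it."""
    return [sum(column) for column in zip(*matrix)]


def top_k_mask(cumulative: List[int], k: int) -> List[int]:
    """Hypothetical mask keeping the k most active components.

    Ties are broken by the lower component index.
    """
    ranked = sorted(range(len(cumulative)), key=lambda i: (-cumulative[i], i))
    keep = set(ranked[:k])
    return [1 if i in keep else 0 for i in range(len(cumulative))]


def change_counts(matrix: BinaryMatrix) -> List[int]:
    """Count, per component, how often its bit flips between consecutive frames."""
    return [
        sum(1 for a, b in zip(column, column[1:]) if a != b)
        for column in zip(*matrix)
    ]


# Toy excerpt: 4 frames, 4 components (made-up values).
frames = [
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
]
cum = cumulative_vector(frames)
mask = top_k_mask(cum, k=2)
changes = change_counts(frames)
```

In this toy setup the cumulative vector discards frame order, while the flip counts retain some of it, which is the kind of temporal cue the abstract refers to; the paper's actual definitions may differ.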
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Hernández-Sierra, G., Calvo, J.R., Bonastre, JF. (2014). Temporal Information in a Binary Framework for Speaker Recognition. In: Bayro-Corrochano, E., Hancock, E. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2014. Lecture Notes in Computer Science, vol 8827. Springer, Cham. https://doi.org/10.1007/978-3-319-12568-8_26
DOI: https://doi.org/10.1007/978-3-319-12568-8_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12567-1
Online ISBN: 978-3-319-12568-8
eBook Packages: Computer Science, Computer Science (R0)