Abstract
Prosody has been shown to improve speaker recognition systems based on the voice spectrum. Prosodic features can therefore also be used in multimodal person verification to achieve better results. In this paper, a multimodal recognition system based on facial and vocal tract spectral features is improved by adding prosodic information. The matcher weighting method and support vector machines (SVMs) are used as fusion techniques, and histogram equalization is applied before SVM fusion as a normalization technique. The results show that the performance of an SVM multimodal verification system can be improved by using histogram equalization, especially when the equalization is applied to the scores that give the highest equal error rate (EER) values.
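The two score-level steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state the target distribution used for equalization, so the sketch assumes the scores of one modality are mapped onto the empirical distribution of a reference score set by rank matching. It also assumes the common form of matcher weighting, in which each modality's weight is inversely proportional to its EER. The function names `equalize_scores`, `matcher_weights`, and `fuse` are hypothetical, and the SVM fusion stage is left out.

```python
from bisect import bisect_right


def equalize_scores(scores, reference):
    """Map scores onto the empirical distribution of `reference`.

    Assumption: equalization is done by rank matching, i.e. each score is
    replaced by the reference value at the same empirical CDF position.
    """
    sorted_scores = sorted(scores)
    sorted_ref = sorted(reference)
    n, m = len(sorted_scores), len(sorted_ref)
    equalized = []
    for s in scores:
        rank = bisect_right(sorted_scores, s)   # 1..n, empirical CDF is rank / n
        idx = (rank * m + n - 1) // n - 1       # ceil(rank * m / n) - 1, in 0..m-1
        equalized.append(sorted_ref[idx])
    return equalized


def matcher_weights(eers):
    """Weights inversely proportional to each modality's EER (assumed form)."""
    inverse = [1.0 / e for e in eers]
    total = sum(inverse)
    return [v / total for v in inverse]


def fuse(scores_per_modality, weights):
    """Weighted sum of one trial's scores, one score per modality."""
    return sum(w * s for w, s in zip(weights, scores_per_modality))


if __name__ == "__main__":
    # Hypothetical numbers for illustration only; not taken from the paper.
    prosody = [0.2, 0.9, 0.5]
    face_reference = [10.0, 20.0, 30.0]
    print(equalize_scores(prosody, face_reference))
    weights = matcher_weights([0.1, 0.2])
    print(weights, fuse([1.0, 0.0], weights))
```

Because the mapping is monotonic, it preserves the ranking of trials within a modality and only changes the shape of the score distribution, so one modality's scores become comparable in range to those of another before fusion.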
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Farrús, M., Ejarque, P., Temko, A., Hernando, J. (2007). Histogram Equalization in SVM Multimodal Person Verification. In: Lee, SW., Li, S.Z. (eds) Advances in Biometrics. ICB 2007. Lecture Notes in Computer Science, vol 4642. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74549-5_86
DOI: https://doi.org/10.1007/978-3-540-74549-5_86
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74548-8
Online ISBN: 978-3-540-74549-5
eBook Packages: Computer Science (R0)