Text and Language-Independent Speaker Recognition Using Suprasegmental Features and Support Vector Machines

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 40)

Abstract

In this paper, the presence of speaker-specific suprasegmental information in the Linear Prediction (LP) residual signal is demonstrated. The LP residual signal is obtained by removing the predictable part of the speech signal. This information, if added to existing speaker recognition systems based on segmental and subsegmental features, can result in a better-performing combined system. The speaker-specific suprasegmental information can not only be perceived by listening to the residual but can also be seen in the form of excitation peaks in the residual waveform. However, the challenge lies in capturing this information from the residual signal. Higher-order correlations among samples of the residual are not known to be captured by standard signal processing and statistical techniques. The Hilbert envelope of the residual is shown to further enhance the excitation peaks present in the residual signal. A speaker-specific pattern is also observed in the autocorrelation sequence of the Hilbert envelope, and further in the statistics of this autocorrelation sequence. This indicates the presence of speaker-specific suprasegmental information in the residual signal. In this work, no distinction is made between voiced and unvoiced sounds when extracting these features. A Support Vector Machine (SVM) is used to classify the patterns in the variance of the autocorrelation sequence for the speaker recognition task.
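
The processing chain named in the abstract (LP residual, Hilbert envelope, autocorrelation, variance) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the LP order, frame length, lag range, or how the variance is computed. The function names, the 8 kHz sampling rate implied by the frame sizes, the order of 10, and the choice to take the variance across frames at each lag are all assumptions made for illustration only.

```python
import numpy as np

def lp_residual(frame, order=10):
    """LP residual via the autocorrelation method (order=10 is an assumed value).

    Assumes a non-silent frame, so the normal equations are solvable.
    """
    x = np.asarray(frame, dtype=float)
    n = len(x)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], lightly regularised for stability
    idx = np.abs(np.arange(order)[:, None] - np.arange(order)[None, :])
    R = r[idx] + 1e-9 * r[0] * np.eye(order)
    a = np.linalg.solve(R, r[1:])
    # Predict each sample from the previous `order` samples, then subtract
    pred = np.zeros(n)
    for k in range(order):
        pred[k + 1:] += a[k] * x[:n - k - 1]
    return x - pred

def hilbert_envelope(x):
    """Magnitude of the analytic signal, computed with the FFT."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * h))

def normalized_autocorr(x, max_lag):
    """Autocorrelation of the mean-removed sequence, scaled so lag 0 equals 1."""
    x = np.asarray(x, dtype=float)
    x = x - np.mean(x)
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(max_lag + 1)])
    return r / r[0]

def acf_variance_feature(signal, frame_len=400, hop=200, order=10, max_lag=160):
    """Per-lag variance, across frames, of the envelope autocorrelation.

    Taking the variance across frames is an assumed reading of the abstract.
    """
    signal = np.asarray(signal, dtype=float)
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        res = lp_residual(signal[start:start + frame_len], order)
        rows.append(normalized_autocorr(hilbert_envelope(res), max_lag))
    return np.var(np.array(rows), axis=0)

def synthetic_voiced(n=4000, period=80, seed=0):
    """Hypothetical test signal: noisy impulse train through a stable 2-pole filter."""
    rng = np.random.default_rng(seed)
    e = 0.01 * rng.standard_normal(n)
    e[::period] += 1.0
    y = np.zeros(n)
    for i in range(n):
        y[i] = e[i]
        if i >= 1:
            y[i] += 1.3 * y[i - 1]
        if i >= 2:
            y[i] -= 0.8 * y[i - 2]
    return y

feature = acf_variance_feature(synthetic_voiced())
print(feature.shape)
```

In a recognition setting, one such feature vector per utterance (or per block of frames) could be passed to any SVM implementation for classification. No voiced/unvoiced decision is applied to the frames, in line with the abstract.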

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bajpai, A., Pathangay, V. (2009). Text and Language-Independent Speaker Recognition Using Suprasegmental Features and Support Vector Machines. In: Ranka, S., et al. Contemporary Computing. IC3 2009. Communications in Computer and Information Science, vol 40. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03547-0_29

  • DOI: https://doi.org/10.1007/978-3-642-03547-0_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03546-3

  • Online ISBN: 978-3-642-03547-0

  • eBook Packages: Computer Science, Computer Science (R0)
