Modelling Speech Intelligibility in Adverse Conditions

  • Conference paper

Part of the book series: Advances in Experimental Medicine and Biology (volume 787)

Abstract

Jørgensen and Dau (J Acoust Soc Am 130:1475–1487, 2011) proposed the speech-based envelope power spectrum model (sEPSM) in an attempt to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII) in conditions with nonlinearly processed speech. Instead of using the reduction of the temporal modulation energy as the intelligibility metric, as the STI does, the sEPSM applies the signal-to-noise ratio in the envelope domain (SNRenv). This metric was shown to be the key for predicting the intelligibility of reverberant speech as well as noisy speech processed by spectral subtraction. The key role of the SNRenv metric is further supported here by the ability of a short-term version of the sEPSM to predict speech masking release for different speech materials and modulated interferers. However, the sEPSM cannot account for speech subjected to phase jitter, a condition in which the spectral structure of the speech signal is strongly affected while the broadband temporal envelope is kept largely intact. In contrast, the effects of this distortion can be predicted successfully by the spectro-temporal modulation index (STMI) (Elhilali et al., Speech Commun 41:331–348, 2003), which assumes an explicit analysis of the spectral “ripple” structure of the speech signal. However, since the STMI applies the same decision metric as the STI, it fails to account for spectral subtraction. The results from this study suggest that the SNRenv might reflect a powerful decision metric, while some explicit across-frequency analysis seems crucial in some conditions. How such across-frequency analysis is “realized” in the auditory system remains unresolved.
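
To make the idea of a signal-to-noise ratio in the envelope domain more concrete, the following is a minimal broadband sketch and not the sEPSM itself. It assumes a crude envelope extractor (rectification plus a moving average), a DC-normalised envelope power, and a lower floor on the power estimates; the peripheral and modulation filterbanks of the actual model are omitted. All function names, the window length, the floor value, and the toy signals are illustrative assumptions.

```python
import math
import random


def envelope(signal, win):
    """Crude envelope: full-wave rectification followed by a causal moving
    average over `win` samples (an assumed stand-in for a proper envelope
    extraction stage)."""
    rectified = [abs(x) for x in signal]
    env, acc = [], 0.0
    for i, v in enumerate(rectified):
        acc += v
        if i >= win:
            acc -= rectified[i - win]
        env.append(acc / min(i + 1, win))
    return env


def envelope_power(signal, win):
    """AC power of the envelope, normalised by its squared mean (DC)."""
    env = envelope(signal, win)
    mean = sum(env) / len(env)
    ac_power = sum((e - mean) ** 2 for e in env) / len(env)
    return ac_power / (mean ** 2)


def snr_env(noisy_speech, noise, win=64, floor=1e-3):
    """Envelope-domain SNR: (P_mix - P_noise) / P_noise.
    The floor (an assumption) keeps both terms positive."""
    p_mix = envelope_power(noisy_speech, win)
    p_noise = max(envelope_power(noise, win), floor)
    p_speech = max(p_mix - p_noise, floor)
    return p_speech / p_noise


def demo(fs=8000, dur=1.0, seed=0):
    """Toy example: 4-Hz amplitude-modulated noise plays the role of speech,
    stationary Gaussian noise is the masker. Returns the envelope-domain SNR
    for a low and a high speech gain."""
    rng = random.Random(seed)
    n = int(fs * dur)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    speech = [
        (1.0 + 0.9 * math.sin(2.0 * math.pi * 4.0 * i / fs)) * rng.gauss(0.0, 1.0)
        for i in range(n)
    ]
    results = []
    for gain in (0.1, 2.0):
        mix = [gain * s + v for s, v in zip(speech, noise)]
        results.append(snr_env(mix, noise))
    return tuple(results)
```

Calling `demo()` returns the metric for a low and a high speech level. In this toy setting the value grows with the speech level because the mixture envelope then carries more modulation power than the noise envelope alone.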

References

  • Berouti M, Schwartz R, Makhoul J (1979) Enhancement of speech corrupted by acoustic noise. Proc IEEE Int Conf Acoust, Speech, Signal Proces (ICASSP-79), USA 4:208–211

  • Dubbelboer F, Houtgast T (2008) The concept of signal-to-noise ratio in the modulation domain and speech intelligibility. J Acoust Soc Am 124:3937–3946

  • Elhilali M, Chi T, Shamma SA (2003) A spectro-temporal modulation index (STMI) for assessment of speech intelligibility. Speech Commun 41:331–348

  • Elhilali M, Ma L, Micheyl C, Oxenham AJ, Shamma SA (2009) Temporal coherence in the perceptual organization and cortical representation of auditory scenes. Neuron 61:317–329

  • Ewert S, Dau T (2000) Characterizing frequency selectivity for envelope fluctuations. J Acoust Soc Am 108:1181–1196

  • French N, Steinberg J (1947) Factors governing intelligibility of speech sounds. J Acoust Soc Am 19:90–119

  • Green DM, Swets JA (1988) Signal detection theory and psychophysics. Peninsula Publishing, Los Altos, pp 238–239

  • Holube I, Fredelake S, Vlaming M, Kollmeier B (2010) Development and analysis of an International Speech Test Signal (ISTS). Int J Audiol 49:891–903

  • Jørgensen S, Dau T (2011) Predicting speech intelligibility based on the signal-to-noise envelope power ratio after modulation-frequency selective processing. J Acoust Soc Am 130:1475–1487

  • Kjems U, Boldt JB, Pedersen MS, Lunner T, Wang D (2009) Role of mask pattern in intelligibility of ideal binary-masked noisy speech. J Acoust Soc Am 126:1415–1426

  • Moore BCJ, Glasberg BR (1983) Suggested formulae for calculating auditory-filter bandwidths and excitation patterns. J Acoust Soc Am 74:750–753

  • Nielsen JB, Dau T (2009) Development of a Danish speech intelligibility test. Int J Audiol 48:729–741

  • Piechowiak T, Ewert SD, Dau T (2007) Modeling comodulation masking release using an equalization-cancellation mechanism. J Acoust Soc Am 121:2111–2126

  • Steeneken HJM, Houtgast T (1980) A physical method for measuring speech transmission quality. J Acoust Soc Am 67:318–326

  • Wagener K, Josvassen JL, Ardenkjaer R (2003) Design, optimization and evaluation of a Danish sentence test in noise. Int J Audiol 42:10–17

Acknowledgements

We thank Ewen MacDonald and Hedwig Gockel for helpful comments and suggestions.

Corresponding author

Correspondence to Torsten Dau.

Copyright information

© 2013 Springer Science+Business Media New York

About this paper

Cite this paper

Jørgensen, S., Dau, T. (2013). Modelling Speech Intelligibility in Adverse Conditions. In: Moore, B., Patterson, R., Winter, I., Carlyon, R., Gockel, H. (eds) Basic Aspects of Hearing. Advances in Experimental Medicine and Biology, vol 787. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-1590-9_38
