Acoustics and Psychoacoustics of Sound Scenes and Events

Abstract

Auditory scenes are made of several different sounds overlapping in time and frequency, propagating through space, and resulting in complex arrays of acoustic information reaching the listeners’ ears. Despite the complexity of the signal, human listeners effortlessly segregate these scenes into different meaningful sound events. This chapter provides an overview of the auditory mechanisms subserving this ability. First, we briefly introduce the major characteristics of sound production and propagation and basic notions of psychoacoustics. The next part describes one specific family of auditory scene analysis models (how listeners segregate the scene into auditory objects), based on multidimensional representations of the signal, temporal coherence analysis to form auditory objects, and the attentional processes that make the foreground pop out from the background. Then, the chapter reviews different approaches to studying the perception and identification of sound events (how listeners make sense of the auditory objects): the identification of different properties of sound events (size, material, velocity, etc.), and a more general approach that investigates the acoustic and auditory features subserving sound recognition. Overall, this review of the acoustics and psychoacoustics of sound scenes and events provides a backdrop for the development of computational methods reported in the other chapters of this volume.

Notes

  1. Note, however, that this definition applies only to musical instruments.

  2. https://www.head-acoustics.de

  3. https://www.bksv.com/

  4. Physical objects have some privileged vibration frequencies, determined by their geometry. These are the modes of vibration. The sounds produced by an object set in vibration result from a combination of these modes. The particular combination depends on how the object is set in vibration (e.g., where it is struck).

  5. The Doppler effect causes a dramatic change in the perceived pitch of the sound emitted by a moving object as it passes the observer.
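Notes 4 and 5 can be made concrete with a short sketch. It is an illustration under stated assumptions, not the chapter's own method: the modal sound is modelled as a sum of exponentially damped sinusoids, and the Doppler shift uses the textbook formula for a static observer and a source moving on a straight line, ignoring propagation delay. The function names, the mode frequencies and decay rates, and the speed of sound of 343 m/s are all hypothetical choices made for this example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def modal_impact(modes, strike_gains, duration=0.5, sample_rate=8000):
    """Sum exponentially damped sinusoids, one per mode of vibration.

    modes: list of (frequency_hz, decay_per_second) pairs; these stand for
        the object's privileged frequencies (illustrative values only).
    strike_gains: one amplitude per mode; stands for how the object is
        excited, e.g. where it is struck.
    """
    n_samples = int(duration * sample_rate)
    signal = []
    for n in range(n_samples):
        t = n / sample_rate
        sample = sum(
            gain * math.exp(-decay * t) * math.sin(2.0 * math.pi * freq * t)
            for (freq, decay), gain in zip(modes, strike_gains)
        )
        signal.append(sample)
    return signal


def doppler_frequency(source_hz, speed, x, miss_distance):
    """Frequency at a static observer for a source moving along a line.

    x: source position along its path in metres, 0 at the point closest
        to the observer (negative before the pass, positive after).
    miss_distance: distance in metres between the observer and the path.
    """
    distance = math.hypot(x, miss_distance)
    # Rate of change of the source-observer distance:
    # negative while approaching, positive while receding.
    radial_velocity = speed * x / distance if distance > 0 else 0.0
    return source_hz * SPEED_OF_SOUND / (SPEED_OF_SOUND + radial_velocity)


if __name__ == "__main__":
    # Same (hypothetical) object, two excitation points: the mode
    # frequencies are shared, only their relative weights differ.
    bar_modes = [(440.0, 6.0), (1210.0, 9.0), (2370.0, 14.0)]
    centre_strike = modal_impact(bar_modes, [1.0, 0.1, 0.4])
    edge_strike = modal_impact(bar_modes, [0.6, 0.8, 0.7])
    print(len(centre_strike), len(edge_strike))

    # A 440 Hz source passing 2 m from the observer at 30 m/s.
    for x in (-50.0, -5.0, 0.0, 5.0, 50.0):
        print(x, round(doppler_frequency(440.0, 30.0, x, 2.0), 1))
```

In this sketch the two strikes share the same mode frequencies and differ only in their gains, which mirrors the distinction drawn in note 4 between what the geometry fixes and what the excitation changes. The Doppler function returns a frequency above the source frequency before the pass and below it afterwards, and the change is steepest near the closest point.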

Author information

Corresponding author

Correspondence to Guillaume Lemaitre.

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Lemaitre, G., Grimault, N., Suied, C. (2018). Acoustics and Psychoacoustics of Sound Scenes and Events. In: Virtanen, T., Plumbley, M., Ellis, D. (eds) Computational Analysis of Sound Scenes and Events. Springer, Cham. https://doi.org/10.1007/978-3-319-63450-0_3

  • DOI: https://doi.org/10.1007/978-3-319-63450-0_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-63449-4

  • Online ISBN: 978-3-319-63450-0

  • eBook Packages: Engineering, Engineering (R0)
