
Fourier-Time-Transformation (FTT), Analysis of Sound and Auditory Perception


Part of the book series: Current Research in Systematic Musicology (CRSM, volume 1)

Abstract

The Fourier/time transformation (FTT) was proposed by Ernst Terhardt (1985, 1992, 1998) as a tool for the analysis and representation of audio signals such as speech and music. Terhardt (1985) introduced the FTT in the context of a revised interpretation of the Fourier transform (FT), with the aim of developing a transform suited to time/frequency analysis comparable to that performed by the mammalian auditory system. This chapter re-examines the FTT and briefly surveys, for comparison, some other methods relevant to musical acoustics and psychoacoustics, such as the short-time Fourier transform (STFT), autoregressive spectral modeling (AR) and the wavelet transform (WT), illustrated by some examples. The different approaches to time/frequency analysis are also assessed with respect to the so-called uncertainty product Δt Δf.
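
The uncertainty product Δt Δf mentioned here can be made concrete with a short derivation. The sketch below assumes one common convention that is not stated in this abstract: Δt and Δf are taken as root-mean-square widths of |s(t)|² and |S(f)|² for a unit-energy signal s(t) centred at t = 0 and f = 0, and s(t) is assumed to decay fast enough that t|s(t)|² vanishes at infinity. Other definitions of duration and bandwidth lead to a different constant on the right-hand side.

```latex
\begin{align}
(\Delta t)^2 &= \int_{-\infty}^{\infty} t^2\,|s(t)|^2\,dt, \qquad
(\Delta f)^2 = \int_{-\infty}^{\infty} f^2\,|S(f)|^2\,df
             = \frac{1}{4\pi^2}\int_{-\infty}^{\infty} |s'(t)|^2\,dt, \\
\left|\int_{-\infty}^{\infty} t\,s^*(t)\,s'(t)\,dt\right|^2
  &\le \int_{-\infty}^{\infty} t^2\,|s(t)|^2\,dt \,\int_{-\infty}^{\infty} |s'(t)|^2\,dt
   = 4\pi^2\,(\Delta t)^2\,(\Delta f)^2, \\
\operatorname{Re}\int_{-\infty}^{\infty} t\,s^*(t)\,s'(t)\,dt
  &= \frac{1}{2}\int_{-\infty}^{\infty} t\,\frac{d}{dt}\,|s(t)|^2\,dt
   = -\frac{1}{2}\int_{-\infty}^{\infty} |s(t)|^2\,dt = -\frac{1}{2}, \\
\Delta t\,\Delta f &\ge \frac{1}{4\pi}.
\end{align}
```

The second line is the Cauchy-Schwarz step, the third follows from integration by parts, and combining them with |z|² ≥ (Re z)² gives the bound. Under these definitions, equality holds for a Gaussian-shaped s(t).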


Notes

  1.

    Settings for the analysis performed with the Praat software (Boersma and Weenink 2011) were a time window of 30 ms with a Gaussian weighting, a time step of 2 ms from one frame to the next, an analysis bandwidth of 2 kHz and a frequency step of 2 Hz. The sound sample of 11.17 s was processed in 5253 (overlapping) frames.

  2.

    A formal proof can be given on the basis of the Cauchy-Bunjakowski-Schwarz inequality (cf. Meyer and Guicking 1974, 95, 108; Papoulis 1962, 63).

  3.

    Applying no specific windowing function means a rectangular window is chosen for which the so-called Equivalent Noise Bandwidth (ENBW [Bins], see DeFatta et al. 1988, 262ff.) is 1.0.

  4.

    There are several definitions of 'linear'. In electronics, linear refers to circuits (such as LRC filters) in which linear relations exist between physical magnitudes (inductance, capacitance, resistance, gain) and in which all voltages and currents are proportional to the electromotive force driving the system (cf. Küpfmüller 1968, 12f.). In signals and systems theory, Bachmann (1992, 9) defines linearity as follows: superposition at the input has the same effect as superposition at the output.

  5.

    Analysis for 0–1 kHz was performed with the Sonogram software (Hiroshi Momose 1991); settings were FFT+Wigner, time window 2048 pts, Hanning weighting, time increment 85 pts = 1.77 ms; LPC, sideband suppression 80 Hz, dynamic range of analysis and graph representation −20 to −1 dB.

  6.

    Historic organ of St. Bartholomäus, Mittelnkirchen, Altes Land, built by Arp Schnitger, Jacob Albrecht and Johann Matthias Schreiber 1688–1753. The Quintadena 16′ pipe rank is in the Hauptwerk of the organ.

  7.

    Built by Joris du Mery 1742–1748.

  8.

    FFT: 8192, Hanning, Hop ratio 0.25, zero pad factor 2.0. Analysis performed with Spectro 3.01 (Perry Cook, Gary Scavone).

  9.

    The code used for analysis was programmed in MATLAB by Can Karadogan and Florian Keiler while working in the Department of Signal Processing and Communications of the Helmut-Schmidt-Universität Hamburg. The AR tool was developed for use in a joint project on transients in the sound of musical instruments (cf. Keiler et al. 2003).

  10.

    Fourier transforms of the steady-state part of the sound show that partial frequencies for higher harmonic partials are not exactly at integer ratios. Moreover, frequencies for partials including the fundamental fluctuate over time as can be seen from increasing values for variance of frequencies in longer FFT transforms (e.g., 65536). However, ACF analysis clearly gives a single ‘pitch’ for this pipe tone corresponding to 143 Hz.

  11.

    The usual approach (cf. Cohen 1995, 30ff., Flandrin 1999, 26ff.) is to calculate the so-called analytic signal by means of a Hilbert transform (Flandrin rightly calls the analytic signal a “complexified” signal).

  12.

    The same consideration was made in “running” autocorrelation algorithms, which typically “slide” along a time signal and include a weighting function to successively discard past sample values so that ACF in fact is computed from an “effective time window” of N samples up to the sample point t moving with time. As to the equivalence of “running” ACF and FTT, see Terhardt 1998, 94f.

  13.

    A more detailed analytic formulation of the FTT is given by Mummert 1997.

  14.

    For example, one CB in the table given by Zwicker and Terhardt (1980, p. 152) ranges from 920 to 1080 Hz with f_c = 1000 Hz and is 160 Hz wide; divided by 25, the frequency step would be 160/25 = 6.4 Hz, compared with the jnd at 1000 Hz of ca. 3 Hz.

  15.

    The ENBW for the Blackman window is 1.73 bins in DFT and the 3.0 dB bandwidth is 1.68 bins.
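
The ENBW figures quoted in notes 3 and 15 can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the usual definition ENBW = N · Σw² / (Σw)² in DFT bins and the classic three-term Blackman window with coefficients 0.42, 0.5 and 0.08 in its periodic form. The function names and the window length N are chosen here for illustration only.

```python
import math


def enbw_bins(window):
    """Equivalent noise bandwidth in DFT bins: N * sum(w^2) / (sum(w))^2."""
    n = len(window)
    return n * sum(w * w for w in window) / sum(window) ** 2


def rectangular(n):
    """Rectangular window, i.e. no specific weighting applied."""
    return [1.0] * n


def blackman(n):
    """Three-term Blackman window (0.42, 0.5, 0.08), periodic form (assumed)."""
    return [
        0.42
        - 0.5 * math.cos(2.0 * math.pi * k / n)
        + 0.08 * math.cos(4.0 * math.pi * k / n)
        for k in range(n)
    ]


N = 4096  # window length in samples; an arbitrary illustrative choice

print(round(enbw_bins(rectangular(N)), 2))  # prints 1.0
print(round(enbw_bins(blackman(N)), 2))     # prints 1.73
```

The sketch covers only the ENBW; the 3.0 dB bandwidth of 1.68 bins would require evaluating the window's frequency response and is not computed here.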

References

  • Bachmann, W. (1992). Signalanalyse. Grundlagen und mathematische Verfahren. Braunschweig: Vieweg.


  • Beauchamp, J. (2007). Analysis and synthesis of musical instrument sounds. In J. Beauchamp (Ed.), Analysis, synthesis, and perception of musical sounds (pp. 1–89). New York: Springer.


  • Bilsen, F. & Kievits, I. (1989). The minimum integration time of the auditory system. Preprint 2746, AES Convention Hamburg March 1989.


  • Boersma, P. & Weenink, D. (2011). Praat. Doing phonetics by computer (version 5232). Amsterdam: University of Amsterdam, Institute of Phonetics.


  • Boersma, P. (1993). Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound. Proceedings of the Institute of Phonetics, University of Amsterdam (Vol. 17, pp. 97–110).


  • Bracewell, R. (1978). Fourier transform (2nd ed.). New York: McGraw-Hill.


  • Bregman, A. (1990). Auditory scene analysis. Cambridge: MIT Press.


  • Bürck, W., Kotowski, P. & Lichte, H. (1935). Der Aufbau des Tonhöhenbewußtseins. Elektrische Nachrichtentechnik, 12, 326–333.


  • Cohen, L. (1995). Time-frequency analysis. Upper Saddle River, N.J.: Prentice-Hall.


  • de Boer, E. (1976). On the “residue” and auditory pitch perception. In W. D. Keidel & W. D. Neff (Eds.), Handbook of sensory physiology (Vol. 3, pp. 479–583). New York: Springer.


  • de Cheveigné, A. (2005). Pitch perception models. In C. Plack, A. Oxenham, R. Fay, & A. Popper (Eds.), Pitch: Neural coding and perception (pp. 169–230). New York: Springer.


  • DeFatta, D., Lucas, J., & Hodgkiss, W. (1988). Digital signal processing. A system design approach. New York: Wiley.


  • Dellomo, M., & Jacyna, G. (1991). Wigner transforms, Gabor coefficients, and Weyl-Heisenberg wavelets. Journal of Acoustical Society of America, 89, 2355–2361.


  • Dutilleux, P., Grossmann A. & Kronland-Martinet, R. (1988). Application of the wavelet transform to the analysis, transformation and synthesis of musical sound. Preprint 2727, AES Convention 85, November 1988.


  • Eddins, D., & Green, D. (1995). Temporal integration and temporal resolution. In B. C. J. Moore (Ed.), Hearing (pp. 207–242). San Diego: Academic Press.


  • Evangelista, G. (1997). Wavelet representations of musical signals. In C. Roads, St. Pope, A. Piccialli, G. de Poli (Eds.), Musical signal processing (pp. 127–153). Lisse: Swets and Zeitlinger.


  • Flandrin, P. (1999). Time-Frequency/Time-Scale Analysis. San Diego: Academic Press.


  • Gabor, D. (1946). Theory of communication. Journal of the Institution of Electrical Engineers, 93, 429–457.


  • Gafori, F. (1496/1967/1968). Practica Musicae. Milan (Reprint Farnborough, Hants.: Gregg Pr. 1967; English translation and transcription of musical examples by Clement Miller, American Institute of Musicology 1968).


  • Greenwood, D. (1990). A cochlear frequency-position function for several species—29 years later. Journal of Acoustical Society of America, 87, 2592–2605.


  • Heldmann, K. (1993). Wahrnehmung, gehörgerechte Analyse und Merkmalsextraktion technischer Schalle. Ph.D. thesis, Technical University of Munich.


  • Hut, R., Boone, M., & Gisolf, A. (2006). Cochlear modeling as time-frequency analysis tool. Acustica, 92, 629–636.


  • Jurado, C., & Moore, B. (2010). Frequency selectivity for frequencies below 100 Hz: Comparison with mid-frequencies. Journal of Acoustical Society of America, 128, 3585–3596.


  • Keiler, F., Karadogan, C., Zölzer, U. & Schneider, A. (2003). Analysis of transient musical sounds by auto-regressive modeling. Proceedings of the 6th International Conference on Digital Audio Effects (DAFx-03) (pp. 301–304). London: St. Marys.


  • Kostek, B. (2005). Perception-based data processing in acoustics. Berlin: Springer.


  • Kral, A., & Majérnik, V. (1996). Neural networks simulating the frequency discrimination of hearing for non-stationary short tone stimuli. Biological Cybernetics, 74, 359–366.


  • Küpfmüller, K. (1968). Die Systemtheorie der elektrischen Nachrichtenübertragung (3rd ed.). Stuttgart: Hirzel.


  • Mammano, F., & Nobili, R. (1993). Biophysics of the cochlea: Linear approximation. Journal of Acoustical Society of America, 93, 3320–3332.


  • Markel, J., & Gray, A. (1976). Linear prediction of speech. Berlin: Springer.


  • Marple, S. L. (1987). Digital spectral analysis. Englewood Cliffs, N.J.: Prentice-Hall.


  • Meddis, R., & O’Mard, L. (1997). A unitary model of pitch perception. Journal of Acoustical Society of America, 102, 1811–1820.


  • Meddis, R., & O’Mard, L. (2006). Virtual pitch in a computational physiological model. Journal of Acoustical Society of America, 120, 3861–3869.


  • Meddis, R. & Lopez-Poveda, E. (2010). Auditory periphery: From pinna to auditory nerve. In R. Meddis et al. (Eds.), Computational models of the auditory system (pp. 7–38). New York: Springer.


  • Meddis, R., Lopez-Poveda, E., Fay, R., & Popper, A. (Eds.). (2010). Computational models of the auditory system. New York: Springer.


  • Messner, G. (2011). Du krächzt wie ein Rabe…, singst wie eine Nachtigall… In A. Schmidhofer & St. Jena (Eds.), Klangfarbe. Vergleichend-systematische und musikhistorische Perspektiven (pp. 205–217; plus sound examples on a CD in the book). Frankfurt/M.: P. Lang.


  • Mertins, A. (1996). Signaltheorie. Stuttgart: Teubner.


  • Mertins, A. (1999). Signal analysis. Chichester: Wiley.


  • Meyer, E., & Guicking, D. (1974). Schwingungslehre. Braunschweig: Vieweg.


  • Momose, H. (1991). Sonogram. Davis, CA: University of Cal.


  • Moore, B. (1995). Frequency analysis and masking. In B. Moore (Ed.), Hearing (pp. 161–205). San Diego: Academic Press.


  • Moore, B. (2008). An introduction to the psychology of hearing (5th ed.). Bingley: Emerald.


  • Mummert, M. (1997). Sprachcodierung durch Konturierung eines gehörangepaßten Spektrogramms und ihre Anwendung zur Datenreduktion. Ph.D. thesis, Technical University of Munich.


  • Netten, S., & Duifhuis, H. (1983). Modelling an active, nonlinear cochlea. In E. de Boer & M. Viergever (Eds.), Mechanics of hearing (pp. 143–151). Delft: Delft University Press.


  • Nobili, R., & Mammano, F. (1999). Biophysics of the cochlea II: Stationary nonlinear phenomenology. Journal of Acoustical Society of America, 99, 2244–2255.


  • Oertel, D., Fay, R., & Popper, A. (Eds.). (2002). Integrative functions in the mammalian auditory pathway. New York: Springer.


  • Papoulis, A. (1962). The Fourier Integral and its applications. New York: McGraw-Hill.


  • Patterson, R., Nimmo-Smith, I., Weber, D., & Milroy, R. (1982). The deterioration of hearing with age: Frequency selectivity, the critical ratio, the audiogram, and speech threshold. Journal of the Acoustical Society of America, 72, 1788–1803.


  • Patterson, R., Robinson, K., Holdsworth, J., McKeown, D., Zhang, C., & Allerhand, M. (1992). Complex sounds and auditory images. Advances in the Biosciences, 83, 429–443.


  • Pickles, J. (2008). An introduction to the physiology of hearing (3rd ed.). Bingley: Emerald.


  • Pressnitzer, D., Patterson, R., & Krumbholz, K. (2001). The lower limit of melodic pitch. Journal of the Acoustical Society of America, 109, 2074–2084.


  • Rodet, X., & Schwarz, D. (2007). Spectral envelopes and additive+residual analysis/synthesis. In J. Beauchamp (Ed.), Analysis, Synthesis, and Perception of Musical Sounds (pp. 174–227). New York: Springer.


  • Rossing, T. (1982). The science of sound. CA: Addison-Wesley.


  • Rücker, C. (1997). Berechnung von Erregungsverteilungen aus FTT-Spektren. Fortschritte der Akustik, DAGA 1997, pp. 484–485.


  • Russo, M., Rožić, N., & Stella, M. (2011). Biophysical cochlear model: Time-frequency analysis and signal reconstruction. Acustica, 97, 632–640.


  • Schlang, M. & Mummert, M. (1990). Die Bedeutung der Fensterfunktion für die Fourier-t-Transformation als gehörgerechte Spektralanalyse. Fortschritte der Akustik, DAGA 1990, Bad Honnef 1990, pp. 1043–1046.


  • Schneider, A. (1997). Tonhöhe, Skala, Klang. Akustische, tonometrische und psychoakustische Studien auf vergleichender Grundlage. Bonn: Orpheus-Verlag für Syst. Musikwiss.


  • Schneider, A. (2001). Complex inharmonic sounds, perceptual ambiguity, and musical imagery. In R. I. Godøy & H. Jørgensen (Eds.), Musical imagery (pp. 95–116). Lisse: Swets and Zeitlinger.


  • Schneider, A. & Frieler, K. (2009). Perception of harmonic and inharmonic sounds: Results from ear models. In S. Ystad, R. Kronland-Martinet & K. Jensen (Eds.), Computer music modeling and retrieval. Genesis of meaning in sound and music (pp. 18–44). Berlin: Springer.


  • Schneider, A., von Ruschkowski, A., & Bader, R. (2009). Klangliche Rauhigkeit, ihre Wahrnehmung und Messung. In R. Bader (Ed.), Musical acoustics, neurocognition and psychology of music (pp. 103–148). Frankfurt: P. Lang.


  • Schneider, A., & Tsatsishvili, V. (2011). Perception of musical intervals at very low frequencies: Some experimental findings. In A. Schneider & A. von Ruschkowski (Eds.), Systematic musicology: Empirical and theoretical studies (pp. 99–125). Frankfurt: P. Lang.


  • Solbach, L., Wöhrmann, R., & Kliewer, J. (1998). The complex-valued continuous wavelet transform as a preprocessor for auditory scene analysis. In D. F. Rosenthal & H. G. Okuno (Eds.), Computational auditory scene analysis (pp. 273–292). Mahwah, N.J.: Erlbaum.


  • Snyder, B. (2000). Music and memory. Cambridge, MA: MIT Press.


  • Terhardt, E. (1985). Fourier transformation of time signals: Conceptual revision. Acustica, 57, 242–256.


  • Terhardt, E. (1992). From speech to language: On auditory information processing. In M. E. H. Schouten (Ed.), The auditory processing of speech: From sounds to words (pp. 363–380). New York: Mouton de Gruyter.


  • Terhardt, E. (1998). Akustische Kommunikation. Berlin: Springer.


  • Vormann, M. (1995). Psychoakustische Modellierung der virtuellen Tonhöhe. Diploma thesis (Physics), Carl von Ossietzky University, Oldenburg.


  • Vormann, M. & Weber, R. (1995). Gehörgerechte Darstellung von instationären Umweltgeräuschen mittels Fourier-Time-Transformation (FTT). Fortschritte der Akustik, DAGA 1995, pp. 1191–1194.


  • Winer, J., & Schreiner, C. (Eds.). (2011). The Auditory Cortex. New York: Springer.


  • Yen, N. (1987). Time and frequency representation of acoustic signals by means of the Wigner distribution function: Implementation and interpretation. Journal of the Acoustical Society of America, 81, 1841–1850.


  • Zhu, X., & Kim, J. (2006). Application of analytic wavelet transform to analysis of highly impulsive noises. Journal of Sound and Vibration, 294, 841–855.


  • Zwicker, E., & Terhardt, E. (1980). Analytical expressions for critical-band rate and critical bandwidth. Journal of Acoustical Society of America, 68, 1523–1525.


  • Zwicker, E., & Fastl, H. (1999). Psychoacoustics. Facts and models (2nd ed.). Berlin: Springer.



Corresponding author

Correspondence to Albrecht Schneider.


Copyright information

© 2013 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Schneider, A., Mores, R. (2013). Fourier-Time-Transformation (FTT), Analysis of Sound and Auditory Perception. In: Bader, R. (eds) Sound - Perception - Performance. Current Research in Systematic Musicology, vol 1. Springer, Heidelberg. https://doi.org/10.1007/978-3-319-00107-4_13


  • DOI: https://doi.org/10.1007/978-3-319-00107-4_13


  • Publisher Name: Springer, Heidelberg

  • Print ISBN: 978-3-319-00106-7

  • Online ISBN: 978-3-319-00107-4

  • eBook Packages: Engineering, Engineering (R0)
