Fourier-Time-Transformation (FTT), Analysis of Sound and Auditory Perception

Schneider, Albrecht; Mores, Robert

doi:10.1007/978-3-319-00107-4_13

Chapter
First Online: 01 January 2013

Part of the book series: Current Research in Systematic Musicology (CRSM, volume 1)

Abstract

The Fourier/time transformation (FTT) was proposed by Ernst Terhardt (1985, 1992, 1998) as a tool for the analysis and representation of audio signals such as speech and music. Terhardt (1985) introduced the FTT in the context of an updated interpretation of the Fourier transform (FT), with the aim of developing a transform suited to time/frequency analysis comparable to that performed by the mammalian auditory system. This chapter re-examines the FTT and, for comparison, briefly surveys some other methods relevant to musical acoustics and psychoacoustics, such as the short-time Fourier transform (STFT), autoregressive (AR) spectral modeling and the wavelet transform (WT), illustrating them with some examples. The different approaches to time/frequency analysis are also assessed with respect to the so-called uncertainty product Δt Δf.
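
For orientation, the uncertainty product can be written out under one common convention. The sketch below assumes that Δt and Δf are root-mean-square widths of |x(t)|² and |X(f)|² (second-moment definitions, with the signal centered at zero mean time and zero mean frequency). These definitions are an assumption made here for illustration; the chapter may use a different convention, in which case the constant on the right-hand side changes.

```latex
\Delta t^{2} = \frac{\int_{-\infty}^{\infty} t^{2}\,|x(t)|^{2}\,dt}{\int_{-\infty}^{\infty} |x(t)|^{2}\,dt},
\qquad
\Delta f^{2} = \frac{\int_{-\infty}^{\infty} f^{2}\,|X(f)|^{2}\,df}{\int_{-\infty}^{\infty} |X(f)|^{2}\,df},
\qquad
\Delta t\,\Delta f \;\ge\; \frac{1}{4\pi}
```

Under these definitions, equality holds only for Gaussian-shaped signals, which is one reason Gaussian windows are often used as a benchmark when time/frequency methods are compared.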

Notes

1.
Settings for the analysis performed with the Praat software (Boersma and Weenink 2011) were a time window of 30 ms with a Gaussian weighting, a time step of 2 ms from one frame to the next, an analysis bandwidth of 2 kHz and a frequency step of 2 Hz. The sound sample of 11.17 s was processed in 5253 (overlapping) frames.
2.
A formal proof can be given on the basis of the Cauchy-Bunjakowski-Schwarz inequality (cf. Meyer and Guicking 1974, 95, 108; Papoulis 1962, 63).
3.
Applying no specific window function amounts to choosing a rectangular window, for which the so-called equivalent noise bandwidth (ENBW, in bins; see DeFatta et al. 1988, 262ff.) is 1.0.
4.
There are several definitions of ‘linear’. In electronics, linear refers to circuits (like LRC filters) in which linear relations exist between physical magnitudes (inductance, capacitance, resistance, gain) and where all voltages and currents are proportional to the electromotive force driving the system (cf. Küpfmüller 1968, 12f.). In signals and systems theory, linearity is defined by Bachmann (1992, 9) as follows: superposition at the input has the same effect as superposition at the output.
5.
Analysis for 0–1 kHz was performed with the Sonogram software (Hiroshi Momose 1991); settings were FFT+Wigner, time window 2048 pts, Hanning weighting, time increment 85 pts = 1.77 ms; LPC, sideband suppression 80 Hz, dynamic range of analysis and graph representation −20 to −1 dB.
6.
Historic organ of St. Bartholomäus, Mittelnkirchen, Altes Land, built by Arp Schnitger, Jacob Albrecht and Johann Matthias Schreiber 1688–1753. The Quintadena 16′ pipe rank is in the Hauptwerk of the organ.
7.
Built by Joris du Mery 1742–1748.
8.
FFT: 8192, Hanning, Hop ratio 0.25, zero pad factor 2.0. Analysis performed with Spectro 3.01 (Perry Cook, Gary Scavone).
9.
The code used for analysis was programmed in MATLAB by Can Karadogan and Florian Keiler while working in the Department of Signal Processing and Communications of the Helmut-Schmidt-Universität Hamburg. The AR tool was developed for use in a joint project directed at the study of transients in the sound of musical instruments (cf. Keiler et al. 2003).
10.
Fourier transforms of the steady-state part of the sound show that the frequencies of higher harmonic partials are not exactly at integer ratios. Moreover, the frequencies of the partials, including the fundamental, fluctuate over time, as can be seen from the increasing variance of frequencies in longer FFTs (e.g., 65536 points). However, ACF analysis clearly gives a single ‘pitch’ for this pipe tone, corresponding to 143 Hz.
11.
The usual approach (cf. Cohen 1995, 30ff., Flandrin 1999, 26ff.) is to calculate the so-called analytic signal by means of a Hilbert transform (Flandrin rightly calls the analytic signal a “complexified” signal).
12.
The same consideration was made in “running” autocorrelation algorithms, which typically “slide” along a time signal and include a weighting function to successively discard past sample values so that ACF in fact is computed from an “effective time window” of N samples up to the sample point t moving with time. As to the equivalence of “running” ACF and FTT, see Terhardt 1998, 94f.
13.
A more detailed analytic formulation of the FTT is given by Mummert 1997.
14.
For example, one critical band (CB) included in the table given by Zwicker and Terhardt (1980, p. 152) ranges from 920 to 1080 Hz with f_c = 1000 Hz and is 160 Hz wide; divided by 25, the frequency step would be 160/25 = 6.4 Hz, as compared to the jnd at 1000 Hz, which is ca. 3 Hz.
15.
The ENBW for the Blackman window is 1.73 bins in DFT and the 3.0 dB bandwidth is 1.68 bins.
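
Notes 3 and 15 quote equivalent noise bandwidths of 1.0 bins for the rectangular window and 1.73 bins for the Blackman window. The following is a minimal sketch of how such figures can be checked. It assumes the standard definition ENBW = N · Σw² / (Σw)² and the classic Blackman coefficients 0.42, 0.5 and 0.08 in periodic (DFT-even) form. The function names are illustrative and are not taken from the chapter or from any of the software packages mentioned in the notes.

```python
import math


def rectangular(n_points):
    """Rectangular window: every sample has weight 1 (no weighting)."""
    return [1.0] * n_points


def blackman(n_points):
    """Periodic (DFT-even) Blackman window.

    Assumption: classic coefficients a0 = 0.42, a1 = 0.5, a2 = 0.08.
    """
    return [
        0.42
        - 0.5 * math.cos(2.0 * math.pi * n / n_points)
        + 0.08 * math.cos(4.0 * math.pi * n / n_points)
        for n in range(n_points)
    ]


def enbw_bins(window):
    """Equivalent noise bandwidth in DFT bins: N * sum(w^2) / (sum(w))^2."""
    n_points = len(window)
    power_sum = sum(w * w for w in window)
    amplitude_sum = sum(window)
    return n_points * power_sum / amplitude_sum ** 2


N = 8192  # same transform length as the FFT setting quoted in note 8
print(round(enbw_bins(rectangular(N)), 2))  # 1.0
print(round(enbw_bins(blackman(N)), 2))     # 1.73
```

The result does not depend on N for the periodic form of the window; a symmetric window (denominator N − 1 in the cosine arguments) gives slightly different values for short windows.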

References

Bachmann, W. (1992). Signalanalyse. Grundlagen und mathematische Verfahren. Braunschweig: Vieweg.
Beauchamp, J. (2007). Analysis and synthesis of musical instrument sounds. In J. Beauchamp (Ed.), Analysis, synthesis, and perception of musical sounds (pp. 1–89). New York: Springer.
Bilsen, F. & Kievits, I. (1989). The minimum integration time of the auditory system. Preprint 2746, AES Convention Hamburg March 1989.
Boersma, P. & Weenink, D. (2011). Praat. Doing phonetics by computer (version 5232). Amsterdam: University of Amsterdam, Institute of Phonetics.
Boersma, P. (1993). Accurate short-term analysis of the fundamental frequency and the harmonic-to-noise ratio of a sampled sound. Proceedings of the Institute of Phonetics, University of Amsterdam, 17, 97–110.
Bracewell, R. (1978). Fourier transform (2nd ed.). New York: McGraw-Hill.
Bregman, A. (1990). Auditory scene analysis. Cambridge: MIT Press.
Bürck, W., Kotowski, P. & Lichte, H. (1935). Der Aufbau des Tonhöhenbewußtseins. Elektrische Nachrichtentechnik, 12, 326–333.
Cohen, L. (1995). Time-frequency analysis. Upper Saddle River, N.J.: Prentice-Hall.
de Boer, E. (1976). On the “residue” and auditory pitch perception. In W. D. Keidel & W. D. Neff (Eds.), Handbook of sensory physiology (Vol. 3, pp. 479–583). New York: Springer.
de Cheveigné, A. (2005). Pitch perception models. In C. Plack, A. Oxenham, R. Fay, & A. Popper (Eds.), Pitch. Neural coding and perception (pp. 169–230). New York: Springer.
DeFatta, D., Lucas, J., & Hodgkiss, W. (1988). Digital signal processing. A system design approach. New York: Wiley.
Dellomo, M., & Jacyna, G. (1991). Wigner transforms, Gabor coefficients, and Weyl-Heisenberg wavelets. Journal of the Acoustical Society of America, 89, 2355–2361.
Dutilleux, P., Grossmann, A., & Kronland-Martinet, R. (1988). Application of the wavelet transform to the analysis, transformation and synthesis of musical sound. Preprint 2727, AES Convention 85, November 1988.
Eddins, D., & Green, D. (1995). Temporal integration and temporal resolution. In B. C. J. Moore (Ed.), Hearing (pp. 207–242). San Diego: Academic Press.
Evangelista, G. (1997). Wavelet representations of musical signals. In C. Roads, S. Pope, A. Piccialli, & G. de Poli (Eds.), Musical signal processing (pp. 127–153). Lisse: Swets and Zeitlinger.
Flandrin, P. (1999). Time-Frequency/Time-Scale Analysis. San Diego: Academic Press.
Gabor, D. (1946). Theory of communication. Journal of the Institution of Electrical Engineers, 93, 429–457.
Gafori, F. (1496/1967/1968). Practica Musicae. Milan (Reprint Farnborough, Hants.: Gregg Press, 1967; English translation and transcription of musical examples by Clement Miller, American Institute of Musicology, 1968).
Greenwood, D. (1990). A cochlear frequency-position function for several species – 29 years later. Journal of the Acoustical Society of America, 87, 2592–2605.
Heldmann, K. (1993). Wahrnehmung, gehörgerechte Analyse und Merkmalsextraktion technischer Schalle. Ph.D. thesis, Technical University of Munich.
Hut, R., Boone, M., & Gisolf, A. (2006). Cochlear modeling as time-frequency analysis tool. Acustica, 92, 629–636.
Jurado, C., & Moore, B. (2010). Frequency selectivity for frequencies below 100 Hz: Comparison with mid-frequencies. Journal of the Acoustical Society of America, 128, 3585–3596.
Keiler, F., Karadogan, C., Zölzer, U., & Schneider, A. (2003). Analysis of transient musical sounds by auto-regressive modeling. In Proceedings of the 6th International Conference on Digital Audio Effects (DAFx-03) (pp. 301–304). London: St. Marys.
Kostek, B. (2005). Perception-based data processing in acoustics. Berlin: Springer.
Kral, A., & Majérnik, V. (1996). Neural networks simulating the frequency discrimination of hearing for non-stationary short tone stimuli. Biological Cybernetics, 74, 359–366.
Küpfmüller, K. (1968). Die Systemtheorie der elektrischen Nachrichtenübertragung (3rd ed.). Stuttgart: Hirzel.
Mammano, F., & Nobili, R. (1993). Biophysics of the cochlea: Linear approximation. Journal of the Acoustical Society of America, 93, 3320–3332.
Markel, J., & Gray, A. (1976). Linear prediction of speech. Berlin: Springer.
Marple, S. L. (1987). Digital spectral analysis. Englewood Cliffs, N.J.: Prentice-Hall.
Meddis, R., & O’Mard, L. (1997). A unitary model of pitch perception. Journal of the Acoustical Society of America, 102, 1811–1820.
Meddis, R., & O’Mard, L. (2006). Virtual pitch in a computational physiological model. Journal of the Acoustical Society of America, 120, 3861–3869.
Meddis, R. & Lopez-Poveda, E. (2010). Auditory periphery: From pinna to auditory nerve. In R. Meddis et al. (Eds.), Computational models of the auditory system (pp. 7–38). New York: Springer.
Meddis, R., Lopez-Poveda, E., Fay, R., & Popper, A. (Eds.). (2010). Computational models of the auditory system. New York: Springer.
Messner, G. (2011). Du krächzt wie ein Rabe…, singst wie eine Nachtigall… In A. Schmidhofer & S. Jena (Eds.), Klangfarbe. Vergleichend-systematische und musikhistorische Perspektiven (pp. 205–217; plus sound examples on a CD in the book). Frankfurt/M.: P. Lang.
Mertins, A. (1996). Signaltheorie. Stuttgart: Teubner.
Mertins, A. (1999). Signal analysis. Chichester: Wiley.
Meyer, E., & Guicking, D. (1974). Schwingungslehre. Braunschweig: Vieweg.
Momose, H. (1991). Sonogram. Davis, CA: University of California.
Moore, B. (1995). Frequency analysis and masking. In B. Moore (Ed.), Hearing (pp. 161–205). San Diego: Academic Press.
Moore, B. (2008). An introduction to the psychology of hearing (5th ed.). Bingley: Emerald.
Mummert, M. (1997). Sprachcodierung durch Konturierung eines gehörangepaßten Spektrogramms und ihre Anwendung zur Datenreduktion. Ph.D. thesis, Technical University of Munich.
Netten, S., & Duifhuis, H. (1983). Modelling an active, nonlinear cochlea. In E. de Boer & M. Viergever (Eds.), Mechanics of hearing (pp. 143–151). Delft: Delft University Press.
Nobili, R., & Mammano, F. (1999). Biophysics of the cochlea II: Stationary nonlinear phenomenology. Journal of the Acoustical Society of America, 99, 2244–2255.
Oertel, D., Fay, R., & Popper, A. (Eds.). (2002). Integrative functions in the mammalian auditory pathway. New York: Springer.
Papoulis, A. (1962). The Fourier Integral and its applications. New York: McGraw-Hill.
Patterson, R., Nimmo-Smith, I., Weber, D., & Milroy, R. (1982). The deterioration of hearing with age: Frequency selectivity, the critical ratio, the audiogram, and speech threshold. Journal of the Acoustical Society of America, 72, 1788–1803.
Patterson, R., Robinson, K., Holdsworth, J., McKeown, D., Zhang, C., & Allerhand, M. (1992). Complex sounds and auditory images. Advances in the Biosciences, 83, 429–443.
Pickles, J. (2008). An introduction to the physiology of hearing (3rd ed.). Bingley: Emerald.
Pressnitzer, D., Patterson, R., & Krumbholz, K. (2001). The lower limit of melodic pitch. Journal of the Acoustical Society of America, 109, 2074–2084.
Rodet, X., & Schwarz, D. (2007). Spectral envelopes and additive+residual analysis/synthesis. In J. Beauchamp (Ed.), Analysis, synthesis, and perception of musical sounds (pp. 174–227). New York: Springer.
Rossing, T. (1982). The science of sound. CA: Addison-Wesley.
Rücker, C. (1997). Berechnung von Erregungsverteilungen aus FTT-Spektren. Fortschritte der Akustik, DAGA 1997, pp. 484–485.
Russo, M., Rožić, N., & Stella, M. (2011). Biophysical cochlear model: Time-frequency analysis and signal reconstruction. Acustica, 97, 632–640.
Schlang, M. & Mummert, M. (1990). Die Bedeutung der Fensterfunktion für die Fourier-t-Transformation als gehörgerechte Spektralanalyse. Fortschritte der Akustik, DAGA 1990, Bad Honnef 1990, pp. 1043–1046.
Schneider, A. (1997). Tonhöhe, Skala, Klang. Akustische, tonometrische und psychoakustische Studien auf vergleichender Grundlage. Bonn: Orpheus-Verlag für Syst. Musikwiss.
Schneider, A. (2001). Complex inharmonic sounds, perceptual ambiguity, and musical imagery. In R. I. Godøy & H. Jørgensen (Eds.), Musical imagery (pp. 95–116). Lisse: Swets and Zeitlinger.
Schneider, A. & Frieler, K. (2009). Perception of harmonic and inharmonic sounds: Results from ear models. In S. Ystad, R. Kronland-Martinet & K. Jensen (Eds.), Computer music modeling and retrieval. Genesis of meaning in sound and music (pp. 18–44). Berlin: Springer.
Schneider, A., von Ruschkowski, A., & Bader, R. (2009). Klangliche Rauhigkeit, ihre Wahrnehmung und Messung. In R. Bader (Ed.), Musical acoustics, neurocognition and psychology of music (pp. 103–148). Frankfurt: P. Lang.
Schneider, A., & Tsatsishvili, V. (2011). Perception of musical intervals at very low frequencies: Some experimental findings. In A. Schneider & A. von Ruschkowski (Eds.), Systematic musicology: Empirical and theoretical studies (pp. 99–125). Frankfurt: P. Lang.
Solbach, L., Wöhrmann, R., & Kliewer, J. (1998). The complex-valued continuous wavelet transform as a preprocessor for auditory scene analysis. In D. F. Rosenthal & H. G. Okuno (Eds.), Computational auditory scene analysis (pp. 273–292). Mahwah, N.J.: Erlbaum.
Snyder, B. (2000). Music and memory. Cambridge, MA: MIT Press.
Terhardt, E. (1985). Fourier transformation of time signals: Conceptual revision. Acustica, 57, 242–256.
Terhardt, E. (1992). From speech to language: On auditory information processing. In M. E. H. Schouten (Ed.), The auditory processing of speech. From sounds to words (pp. 363–380). New York: Mouton de Gruyter.
Terhardt, E. (1998). Akustische Kommunikation. Berlin: Springer.
Vormann, M. (1995). Psychoakustische Modellierung der virtuellen Tonhöhe. Diploma thesis (Physics), Carl von Ossietzky University, Oldenburg.
Vormann, M., & Weber, R. (1995). Gehörgerechte Darstellung von instationären Umweltgeräuschen mittels Fourier-Time-Transformation (FTT). Fortschritte der Akustik, DAGA 1995, pp. 1191–1194.
Winer, J., & Schreiner, C. (Eds.). (2011). The Auditory Cortex. New York: Springer.
Yen, N. (1987). Time and frequency representation of acoustic signals by means of the Wigner distribution function: Implementation and interpretation. Journal of the Acoustical Society of America, 81, 1841–1850.
Zhu, X., & Kim, J. (2006). Application of analytic wavelet transform to analysis of highly impulsive noises. Journal of Sound and Vibration, 294, 841–855.
Zwicker, E., & Terhardt, E. (1980). Analytical expressions for critical-band rate and critical bandwidth. Journal of the Acoustical Society of America, 68, 1523–1525.
Zwicker, E., & Fastl, H. (1999). Psychoacoustics. Facts and models (2nd ed.). Berlin: Springer.

Author information

Authors and Affiliations

Institute of Musicology, University of Hamburg, Neue Rabenstr. 13, D-20354, Hamburg, Germany
Albrecht Schneider
University of Applied Sciences, DMI Faculty, Finkenau 35, D-22081, Hamburg, Germany
Robert Mores

Corresponding author

Correspondence to Albrecht Schneider.

Editor information

Editors and Affiliations

Institute of Musicology, University of Hamburg, Neue Rabenstr. 13, Hamburg, 20354, Germany
Rolf Bader

About this chapter

Cite this chapter

Schneider, A., Mores, R. (2013). Fourier-Time-Transformation (FTT), Analysis of Sound and Auditory Perception. In: Bader, R. (ed.) Sound - Perception - Performance. Current Research in Systematic Musicology, vol 1. Springer, Heidelberg. https://doi.org/10.1007/978-3-319-00107-4_13

DOI: https://doi.org/10.1007/978-3-319-00107-4_13
Published: 24 May 2013
Publisher Name: Springer, Heidelberg
Print ISBN: 978-3-319-00106-7
Online ISBN: 978-3-319-00107-4
eBook Packages: Engineering, Engineering (R0)
