Abstract
Computing an "in-place and in-order" FFT is a difficult problem on hierarchical memory architectures, where data movement can seriously degrade performance. In this paper we present a recursive formulation of a self-sorting in-place FFT algorithm that adapts to the target architecture. For transform sizes where in-place, in-order execution is not possible, we show how schedules can be constructed that use minimum workspace to perform the computation efficiently. To express and construct FFT schedules, we present a context-free grammar that generates the FFT Schedule Specification Language. We conclude by comparing the performance of our in-place, in-order FFT implementation with that of other well-known FFT libraries. We also present a performance comparison between out-of-place and in-place execution for various FFT sizes.
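To illustrate why "in-place" and "in-order" pull against each other, the following is a minimal textbook radix-2 sketch. It is not the self-sorting algorithm of this paper. The function names `bit_reverse_permute` and `fft_in_place` are hypothetical, and the power-of-two length restriction is an assumption made only to keep the example short. The butterflies overwrite their inputs, so no workspace is needed, but the output is in natural order only because of the separate reordering pass.

```python
import cmath


def bit_reverse_permute(x):
    """Swap each element with the one at its bit-reversed index, in place."""
    n = len(x)
    j = 0  # j tracks the bit-reversed value of i
    for i in range(1, n):
        bit = n >> 1
        # Increment j as a bit-reversed counter.
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            x[i], x[j] = x[j], x[i]


def fft_in_place(x):
    """Textbook radix-2 decimation-in-time FFT that overwrites the list x.

    Assumption for this sketch: len(x) is a power of two.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")

    # Extra data-movement pass: without it the result is bit-reversed.
    bit_reverse_permute(x)

    size = 2
    while size <= n:
        half = size // 2
        w_step = cmath.exp(-2j * cmath.pi / size)
        for start in range(0, n, size):
            w = 1 + 0j
            for k in range(half):
                a = x[start + k]
                b = x[start + k + half] * w
                # Butterfly writes back to the locations it read from.
                x[start + k] = a + b
                x[start + k + half] = a - b
                w *= w_step
        size *= 2
```

The call to `bit_reverse_permute` touches the whole array with a scattered access pattern before any arithmetic is done. Self-sorting formulations aim to avoid such a separate pass by folding the reordering into the butterfly stages.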
Keywords
- Fast Fourier Transform
- Fast Fourier Transform Algorithm
- Adaptive Computation
- Installation Time
- Middle Rank
These keywords were added by machine and not by the authors.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ali, A., Johnsson, L., Subhlok, J. (2007). Adaptive Computation of Self Sorting In-Place FFTs on Hierarchical Memory Architectures. In: Perrott, R., Chapman, B.M., Subhlok, J., de Mello, R.F., Yang, L.T. (eds) High Performance Computing and Communications. HPCC 2007. Lecture Notes in Computer Science, vol 4782. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75444-2_38
DOI: https://doi.org/10.1007/978-3-540-75444-2_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75443-5
Online ISBN: 978-3-540-75444-2
eBook Packages: Computer Science, Computer Science (R0)