Elsevier

Signal Processing

Volume 82, Issue 3, March 2002, Pages 433-459
Signal Processing

A new fast algorithm for the unified forward and inverse MDCT/MDST computation

https://doi.org/10.1016/S0165-1684(01)00195-5Get rights and content

Abstract

The modified discrete cosine transform (MDCT) and modified discrete sine transform (MDST) are employed in subband/transform coding schemes as the analysis/synthesis filter banks based on the concept of time domain aliasing cancellation (TDAC). Princen, Bradley and Johnson defined two types of the MDCT, specifically, for an evenly stacked and oddly stacked analysis/synthesis systems. The MDCT is the basic processing component in the international audio coding standards and commercial products for high-quality audio compression. Almost all existing audio coding systems have used the complex-valued or real-valued FFT algorithms, and the DCT/DST of type IV (DCT-IV/DST-IV) for the fast MDCT computation. New fast and efficient algorithm for a unified forward and inverse MDCT/MDST computation in the oddly stacked system is proposed. It is based on the DCT/DST of types II and III (DCT-II/DST-II, DCT-III/DST-III), and the real arithmetic is used only. Corresponding generalized signal flow graph is regular, structurally simple and enables to compute MDCT/MDST and their inverses in general for any N divisible by 4 (N being length of a data sequence). Consequently, the new fast algorithm can be adopted for the MDCT computation in the current audio coding standards such as MPEG family (MPEG-1, MPEG-2, MPEG-2 Advanced Audio Coding and MPEG-4 audio), and in commercial products (proprietary audio coding algorithms) such as Sony MiniDisc/ATRAC/ATRAC2/SDDS digital audio coding systems, the AT&T Perceptual Audio Coder (PAC) or Lucent Technologies PAC/Enhanced PAC/Multichannel PAC, and Dolby Labs AC-3 digital audio compression algorithm. Besides the new fast algorithm has some interesting properties, it provides an efficient implementation of the forward and inverse MDCT computation for layer III in MPEG audio coding, where the length of data blocks N≠2n. Especially, for the AC-3 algorithm, it is shown how both the proposed new MDCT/MDST algorithm and existing fast algorithms/computational architectures for the discrete sinusoidal transforms computation of real data sequences such as the DCT-IV/DST-IV, generalized discrete Fourier transform of type IV (DFT-IV) and generalized discrete Hartley transform of type IV (DHT-IV) can be used for the fast alternate or simultaneous (on-line) MDCT/MDST computation by simple pre- and post-processing of data sequences.

Introduction

The modified discrete cosine transform (MDCT) and modified discrete sine transform (MDST) are employed in subband/transform coding schemes as the analysis/synthesis filter banks based on the concept of time domain aliasing cancellation (TDAC) [50], [51], [52]. Therefore, they are frequently called the “TDAC transforms”. The MDCT is a perfect reconstruction cosine modulated filter bank that is widely used in current perceptual audio coding algorithms. Specifically, the MDCT is the basic processing component for high-quality audio coding in the international audio coding standards such as MPEG-1 [20], [53], MPEG-2 [21], [53], recently established the most efficient MPEG-2 AAC (advanced audio coding) [2], [5], [32], [47], MPEG-4 [22], [32], and in commercial products (proprietary audio coding algorithms) such as Sony MiniDisc/ATRAC (adaptive transform acoustics coding)/ATRAC2/SDDS (sony dynamic digital sound) digital audio coding systems [32], [48], [60], [64], the AT&T perceptual audio coder (PAC) or Lucent Technologies PAC/Enhanced PAC/Multichannel PAC [27], [32], [48], [58], Dolby Labs AC-3 digital audio compression algorithm [3], [16], [32], [48], and in various published audio/speech coding schemes [23], [24], [26], [30], [33], [34], [49], [51], [57]. The excellent review of audio coding algorithms, the history of their development including research, standardization activities and applications is presented in [48].

Princen, Bradley and Johnson defined two types of the MDCT, specifically for an evenly stacked [50] and for oddly stacked analysis/synthesis system [51], [52]. Both systems are very similar. The main difference lies in the transform operations. The MDCT for the evenly stacked system is defined by the transform kernel cos[(π/2)m−(2πk/N)(n+n0)] and the MDCT for the oddly stacked system is defined by the transform kernel cos[(2π/N)(k+12)(n+n0)], where N is the length of a data block, n is the time index, k is the frequency index, and m is the index of the data block. For both systems, the phase factor n0 has a value of N/4+12 which is necessary for perfect reconstruction. The oddly stacked system can also be seen as a modulated lapped transform (MLT) [35], [39], extended lapped transform (ELT) [36], [38], [39], [40], or modulated biorthogonal lapped transform (MBLT) [41], [42] which belong to the class of lapped transforms. Further, it can be regarded as a particular case of perfect reconstruction modulated filter banks (PRMF) [45], [46], or uniform filter banks introduced in [44]. All these filter banks can also be viewed as block transforms, in which the basis functions overlap the adjacent blocks by 50%. The oddly stacked system has some advantages compared with the evenly stacked system. It is uniform, i.e., it has equally spaced bands of equal bandwidth, and its computational structure though more complex, requires the application of the MDCT only, while in the evenly stacked system, alternate applications of the MDCT/MDST are needed. For these reasons, the oddly stacked system is preferred in the audio coding applications.

In existing audio coding systems, the transform operations take a considerable portion of the available processing time. Therefore, the efficient algorithms for the MDCT real-time implementation are of great importance. During the last decade many fast algorithms for the efficient MDCT computation have been developed. Generally, almost in all systems the fast MDCT computation is realized by complex-valued or real-valued FFT algorithms with pre- and post-rotations [15], [17], [19], [23], [25], [30], [56] or it is based on the DCT/DST of type IV (DCT-IV/DST-IV) with pre- and post-permutations [13], [31], [35], [37], [38], [39], [40], [41], [42], [45], [46]. Since there exists a relation between the DCT-IV and DCT of type II (DCT-II) [11], [12], [28], the DCT-IV-based algorithms can be converted to the DCT-II of the same size [13], [42]. Computationally efficient algorithm for real-valued analysis/synthesis filter banks based on a modified DFT, the odd-time odd-frequency DFT (O2 DFT) [1], has been proposed by Cramer and Gluth [15]. First, the original data sequence is split into its even and odd parts and exploiting the symmetry relations between even/odd sequences and their transforms, the MDCT coefficients are obtained by fast O2 DFT algorithm consisting of one N4-point FFT (N is even) and pre- and post-twiddle operations. The similar fast algorithm, however based on O2 DFT for DCT-IV and DST-IV computation [19] has been presented in [30]. The MLT introduced by Malvar [35] is equivalent to the MDCT except for some sign changes. By defining a new sequence from the original one, it was shown that the MLT can be implemented by the fast orthogonal DST-IV of half size. Further, whereas the ELT is the MLT with longer basis functions and the MBLT is a biorthogonal version of the MLT, Malvar [37], [38], [39], [40], [41], [42] showed that they can be realized by the fast orthogonal DCT-IV of half size. Generally, fast algorithms for the computation of the direct and inverse MLT, ELT and MBLT consist of a cascade of window operators and DCT-IV or DST-IV. The use of DCT-IV or DST-IV depends on a form of permutation applied to the original data sequence. A fast MLT algorithm which is the combination of the fast algorithm to compute DCT-IV via DFT [19] with Malvar's one to compute the MLT using DCT-IV has been proposed in [25]. The most efficient algorithm for the fast MDCT computation in terms of arithmetic complexity has been developed by Duhamel et al. [17]. Using a symmetry property of the MDCT and after performing a permutation of the input data sequence, the MDCT computation is mapped into a complex modified DFT of reduced size, which is implemented by the split-radix (N/4)-point FFT with pre- and post-rotations. In addition, the windowing operation can be incorporated into the algorithm resulting in reduced computational complexity of the complete TDAC filter bank implementation [56]. Iwadare et al. [23] presented a fast MDCT algorithm used in the adaptive transform audio codec with adaptive MDCT block size. By decomposing the MDCT transform kernel and using the symmetry of cosine function, the fast algorithm is composed of pre-processing (data shifts, differential calculation and complex pre-multiplications), an (N/2)-point FFT followed by complex post-multiplications. However, if the authors used the more proper differential calculation in step 3 of their algorithm (replacing term 2n by n) they would get DCT-IV-based algorithm and consequently, the (N/2)-point FFT with complex pre- and post-multiplications could be completely eliminated. A highly optimized algorithm for the fast MDCT computation has been described in [59]. The algorithm consists of 8 steps, where in most steps include multiplications by cosine and sine twiddle factors. Although the authors have claimed that their algorithm has low number of arithmetic operations, requires minimum size of storage and is robust against rounding errors, it seems to have a rather complex indexing scheme. A regressive structure derived from sinusoidal recursive formulae for the MDCT and inverse MDCT kernels has been proposed in [14]. This regular structure provides an efficient scheme for the on-line computation of variable length MDCT and its inverse. Recently, two new algorithms for the inverse MDCT computation only have been presented in [18]. They are derived by direct recursive factorization of inverse MDCT kernel. The first algorithm is a variation of the fast DCT-II algorithm [29], and the second one is based on the fast Hartley transform [4]. For both algorithms the authors showed corresponding flowgraphs for N=16. However, generally it can be difficult to derive generalized flowgraphs for increasing values of N. A highly regular algorithm for TDAC technique has been presented in [13]. By permuting the input data sequence, the MDCT and its inverse are converted to the DCT-IV and subsequently to the DCT-II of the same size. A unified fast algorithm for cosine modulated filter banks (MDCT and its two variants, polyphase filter bank) used in the current audio coding standards has been presented in [31]. The forward and inverse MDCT is computed by the DCT-IV of half size with pre- and post-permutations of data sequences. Although, the form of permutation used in [13], [31] seems to be different comparing with Malvar's fast MLT algorithm [39], all algorithms generate the same permuted data sequence for the DCT-IV of half size. It is interesting to note that recently Malvar introduced the modulated complex lapped transform (MCLT) [43] as a complex generalization of the MLT. The MCLT is a particular kind of 2× oversampled generalized DFT filter bank, whose real part corresponds to the MLT and imaginary part is a sine modulated filter bank, which coincides with the MDST defined in this paper. After permutations of the original data sequence, the real part of the MCLT is efficiently computed via the DCT-IV and imaginary part via the DST-IV.

In this paper, new fast algorithm for a unified forward and inverse MDCT/MDST computation in the oddly stacked system is proposed. It is based on the DCT/DST of types II and III (DCT-II/DST-II, DCT-III/DST-III), and the real arithmetic is used only. Corresponding generalized signal flow graph is regular, structurally simple and enables the computation of MDCT/MDST and their inverses in general for any N divisible by 4. The proposed new fast algorithm has some interesting properties. Its intrinsic structure results in a sparse matrix factorization of the MDCT matrix, and the algorithm can be formulated exclusively in terms of Given's plane rotations so simplifying its structure at the cost of increased arithmetic complexity. In particular, the new fast algorithm provides the efficient implementation of the forward and inverse MDCT for layer III in MPEG audio coding standards [9], [10] and the efficient implementation of two variants of cosine modulated filter banks defined by AC-3 digital audio compression algorithm. Consequently, it can be adopted in the current audio coding standards such as MPEG family, and in proprietary audio coding algorithms. Finally, for the AC-3 algorithm, it is shown how both the proposed new fast algorithm and the existing fast algorithms/computational architectures for the discrete sinusoidal transform computation of real data sequences such as the DCT-IV/DST-IV, generalized discrete Fourier transform of type IV (DFT-IV) and generalized discrete Hartley transform of type IV (DHT-IV) can be used for the fast alternate or simultaneous (on-line) MDCT/MDST computation by simple pre- and post-processing of data sequences. Moreover, the relationships among the MDCT/MDST, DCT-II/DST-II, DCT-III/DST-III, DCT-IV/DST-IV, DFT-IV and DHT-IV are revealed.

Section snippets

Definitions and properties of the MDCT/MDST

Let {xn},n=0,1,…,N−1 be an input data sequence. The MDCT and its inverse, IMDCT, in the oddly stacked system are respectively defined as [51], [52]ck=2Nn=0N−1xncosπ2N2n+1+N2(2k+1),k=0,1,…,N2−1,x̂n=k=0N2−1ckcosπ2N2n+1+N2(2k+1),n=0,1,…,N−1.We note that the original definition of the MDCT includes the factor cos(πmk), where m is the index of the data block. However, the modifications by factor cos(πmk) are ignored since they simply invert odd channels in frequency.

The corresponding MDST and

The new fast MDCT/MDST algorithm

In the TDAC analysis/synthesis filter banks, a windowing operation is first applied to the input data sequence. For simplicity and without loss of generality, we shall assume that {xn},n=0,1,…,N−1 represents the windowed input data sequence. Similarly, for simplicity the normalization factor 2/N is omitted. Let us assume that N being the length of the input data sequence is a power of 2, i.e., N=2n and n⩾2.

According to [17] the MDCT given by (1) can be written in the formzk=n=0N−1xncosπ2N

The MDCT computation in digital audio compression algorithm AC-3

AC-3 digital audio compression algorithm [16] developed originally by Dolby Labs has been selected for use in the United States HDTV system, and it has been also utilized in consumer electronics equipment such as cable television and direct broadcast satellite. For the MDCT/IMDCT computation, AC-3 audio coder has adopted the (N/4)-point complex FFT algorithm [17].

AC-3 defines the forward and inverse MDCT, respectively, as [16]cαk=2Nn=0N−1xncosπ2N(2n+1)(2k+1)+π4(2k+1)(1+α),k=0,1,…,N2−1,x̂n=k=0

Conclusions

New fast algorithm for the unified efficient forward and inverse MDCT/MDST computation in the oddly stacked system is proposed. It is based on the DCT-II/DST-II and uses the real arithmetic only. Corresponding generalized signal flow graph is regular, structurally simple and enables to compute MDCT/MDST and their inverses for any N divisible by 4. The new fast algorithm has some interesting properties. Specifically, its intrinsic structure results in a sparse matrix factorization of the MDCT

References (64)

  • V. Britanak et al.

    The fast generalized discrete Fourier transforms: a unified approach to the discrete sinusoidal transforms computation

    Signal Process.

    (December 1999)
  • A.W. Johnson et al.

    Adaptive transform coding incorporating time domain aliasing cancellation

    Speech Comm.

    (December 1987)
  • Z. Wang et al.

    The discrete W transform

    Appl. Math. Comput.

    (1985)
  • G. Bonnerot et al.

    Odd-time odd-frequency discrete Fourier transform for symmetric real-valued series

    Proc. IEEE

    (March 1976)
  • M. Bosi, et al., ISO/IEC MPEG-2 Advanced Audio Coding, 101th AES Convention, Audio Engineering Society preprint #4382,...
  • M. Bosi, S.E. Forshay, High quality audio coding for HDTV: an overview of AC-3, The Seventh International Workshop on...
  • R.N. Bracewell

    The Hartley Transform

    (1986)
  • K. Brandenburg, M. Bosi, MPEG-2 advanced audio coding: overview and applications, 103th AES Convention, Audio...
  • V. Britanak

    A unified approach to the fast computation of discrete sinusoidal transforms I: DCT and DST transforms

    Comput. Artif. Intell.

    (December 1998)
  • V. Britanak

    A unified approach to the fast computation of discrete sinusoidal transforms I: DFT and DWT transforms

    Comput. Arti. Intell.

    (January 1999)
  • V. Britanak et al.

    An efficient implementation of the forward and inverse MDCT in MPEG audio coding

    IEEE Signal Process. Lett.

    (February 2001)
  • V. Britanak et al.

    Corrections to an efficient implementation of the forward and inverse MDCT in MPEG audio coding

    IEEE Signal Process. Lett.

    (October 2001)
  • S.C. Chan et al.

    Direct methods for computing discrete sinusoidal transforms

    IEE Proc. Radar Signal Process.

    (December 1990)
  • S.C. Chan et al.

    Fast algorithms for computing the discrete cosine transform

    IEEE Trans. Circuits Systems II: Analog Digital Signal Process.

    (March 1992)
  • D.Y. Chan et al.

    Regular implementation algorithms of time domain aliasing cancellation

    IEE Proc. Vision Image Signal Process.

    (December 1996)
  • H.C. Chiang et al.

    Regressive implementation for the forward and inverse MDCT in MPEG audio coding

    IEEE Signal Process. Lett.

    (April 1996)
  • S. Cramer, R. Gluth, Computationally efficient real-valued filter banks based on a modified O2 DFT, Signal Processing...
  • Digital Audio Compression (AC-3) Standard, Document of Advanced Television Systems Committee (ATSC), Audio Specialist...
  • P. Duhamel, Y. Mahieux, J.P. Petit, A fast algorithm for the implementation of filter banks based on time domain...
  • Y.H. Fan et al.

    On fast algorithms for computing the inverse modified discrete cosine transform

    IEEE Signal Process. Lett.

    (March 1999)
  • R. Gluth, Regular FFT-related transform kernels for DCT/DST-based polyphase filter banks, Proceedings of the IEEE...
  • ISO/IEC JTC1/SC29/WG11 MPEG, IS11172-3, Information Technology—Coding of Moving Pictures and Associated Audio for...
  • ISO/IEC JTC1/SC29/WG11 MPEG, IS13818-3, Information Technology—Generic Coding of Moving Pictures and Associated Audio,...
  • ISO/IEC JTC1/SC29/WGII, MPEG99/N2916, Study on MPEG-4 Version.2 Audio FPDAM, October...
  • M. Iwadare et al.

    A 128kb/s Hi-Fi audio CODEC based on adaptive transform coding with adaptive block size MDCT

    IEEE J. Select. Areas Comm.

    (January 1992)
  • N. Iwakami, T. Moriya, S. Miki, High-quality audio-coding at less than 64kbit/s by using transform-domain weighted...
  • Ch. Jing et al.

    Fast algorithm for computing modulated lapped transform

    Electron. Lett.

    (June 2001)
  • A. Jin, et al., Scalable audio coder based on quantizer units of MDCT coefficients, Proceedings of the IEEE ICASSP’99,...
  • J.D. Johnston, A.J. Ferreira, Sum–difference stereo transform coding, Proceedings of the IEEE ICASSP’92, San Francisco,...
  • C.W. Kok

    Fast algorithm for computing discrete cosine transform

    IEEE Trans. Signal Process.

    (March 1997)
  • B.G. Lee

    A new fast algorithm to compute the discrete cosine transform

    IEEE Trans. Acoust. Speech Signal Process.

    (December 1984)
  • B. Lincoln, An experimental high fidelity perceptual audio coder, Project in MUS420 Win 97, Center for Computer...
  • Cited by (55)

    • A survey on quality-assurance approximate stream processing and applications

      2019, Future Generation Computer Systems
      Citation Excerpt :

      The sketch structure can build approximate synopses with small space and update time and meanwhile guarantee the quality. In the earlier stage, wavelets, discrete Fourier transform [99] and discrete cosine transform [100] were developed to compress signal data through maintaining significant transform values (e.g., wavelet coefficients when using wavelets) while omitting insignificant ones. These methods are mainly used for specific applications, including signal analysis, sensor-based measurements, and location-based services, etc.

    • Generalized fast mixed-radix algorithm for the computation of forward and inverse MDCTs

      2012, Signal Processing
      Citation Excerpt :

      Since the layer III specifies two different block sizes: a long block (N=36) and a short block (N=12), where the lengths of data blocks are not a power of two, the MDCT computation via the complex-valued FFT is not so efficient because it requires the complex number operations and also much memory to store the complex signal. To solve this problem, Britanak [7–12], Nikolajevic and Fettweis [13], Lee [14], Cheng and Hsu [15], and Truong et al. [16] proposed various fast algorithms based on the computation of lower order type-IV discrete cosine transform (DCT-IV) or type-II discrete cosine transform (DCT-II). Shao and Johnson [17] derived a new fast algorithm to compute the DCT-IV and MDCT for N=2n based on a modified split-radix FFT algorithm [18].

    • A survey of efficient MDCT implementations in MP3 audio coding standard: Retrospective and state-of-the-art

      2011, Signal Processing
      Citation Excerpt :

      The proposed efficient implementations of polyphase analysis/synthesis filter banks with additional possible optimizations are almost completely presented in [7–10,23]. On the other hand, many fast MDCT algorithms have been developed in the time period 1990–2010 which could/can be adopted for the efficient MDCT implementation in MP3 [12–59]. Almost all existing fast MDCT algorithms employ a discrete sinusoidal unitary transform of reduced size such as the discrete Fourier transform (DFT) and its fast implementation, FFT [13–15,17,20,26,28], discrete cosine/sine transform of type II (DCT-II/DST-II) [29], discrete cosine/sine transform of type IV (DCT-IV/DST-IV) [12,16,23,35,47,53,55,57,58], or they are obtained in DCT-IV/DCT-II combination [19,21,32,36,41,44].

    View all citing articles on Scopus
    View full text