Abstract
Stream processors can achieve high performance in applications that share the stream characteristics of large parallelism, intensive computation and little data reuse. Transform coding, a core component of video compression, is widely used in video storage and transmission. This paper summarizes the stream execution mechanism and explores the design approaches of programmable stream processors, including the Imagine stream processor and the graphics processing unit (GPU). Based on the stream processing model, stream algorithms for block-based and frame-based (non-block-based) transform coding are presented and mapped onto stream processors. In particular, an Interleaved Streaming Transform (IST) algorithm on Imagine and a Row-wise Zonal Transform (RZT) algorithm on GPU are proposed for the 4×4 integer transform in H.264, to exploit the potential of stream processing for block-based transforms. Our experiments with a transform coding suite on Imagine and GPU show that the coding efficiency of stream processors far exceeds the real-time requirements of current video applications across video resolutions ranging from QCIF to high definition (HD). The performance evaluation of the stream implementations discusses the architectural support for transform coding and presents significant improvements over other programmable platforms. With their combination of flexibility and high performance, programmable stream processors may play an important role in transform coding in the future.
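For readers unfamiliar with the 4×4 integer transform in H.264 that the IST and RZT algorithms target, the following is a minimal sketch of the standard forward core transform, Y = Cf · X · Cfᵀ, together with an equivalent add-and-shift butterfly form. It is not the IST or RZT algorithm from the paper, and the function names are hypothetical, chosen only for this illustration. The scaling that H.264 folds into quantization is omitted.

```python
# Forward core matrix Cf of the H.264 4x4 integer transform (standard definition).
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]


def matmul4(a, b):
    """Multiply two 4x4 integer matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose4(m):
    """Transpose a 4x4 matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def forward_transform_4x4(block):
    """Reference form: Y = Cf * X * Cf^T (no quantization scaling)."""
    return matmul4(matmul4(CF, block), transpose4(CF))


def butterfly_1d(x0, x1, x2, x3):
    """1-D transform of four samples using only additions and shifts."""
    s0, s1 = x0 + x3, x1 + x2   # sums
    d0, d1 = x0 - x3, x1 - x2   # differences
    return [s0 + s1, (d0 << 1) + d1, s0 - s1, d0 - (d1 << 1)]


def forward_transform_4x4_butterfly(block):
    """Same result as forward_transform_4x4, as a row pass then a column pass."""
    rows = [butterfly_1d(*row) for row in block]        # X * Cf^T
    cols = [butterfly_1d(*col) for col in zip(*rows)]   # Cf * (X * Cf^T), stored by column
    return transpose4(cols)


if __name__ == "__main__":
    sample = [[5, 11, 8, 10], [9, 8, 4, 12], [1, 10, 11, 4], [19, 6, 15, 7]]
    print(forward_transform_4x4(sample))
    print(forward_transform_4x4_butterfly(sample))
```

Each 4×4 block is transformed independently of the others, which is the kind of data parallelism with little data reuse that the abstract attributes to stream applications.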
Cite this article
Li, H., Zhang, C., Li, L. et al. Transform coding on programmable stream processors. J Supercomput 45, 66–87 (2008). https://doi.org/10.1007/s11227-008-0192-2