Elsevier

Pattern Recognition

Volume 40, Issue 7, July 2007, Pages 2049-2062

Unified feature analysis in JPEG and JPEG 2000-compressed domains

https://doi.org/10.1016/j.patcog.2006.11.009

Abstract

Retrieving images compressed by different algorithms typically involves a pre-processing step that decompresses them into the spatial domain, from which features are extracted for further analysis. Our objective is to investigate common features of JPEG-compressed and JPEG 2000-compressed images so that image indexing can be done directly in their respective compressed domains. A fundamental difference between JPEG and JPEG 2000 is their transforms; the former uses a block-based discrete cosine transform (BDCT) while the latter uses a wavelet transform (WT). Direct comparison of BDCT blocks and WT subbands cannot reveal their relationship. With our proposed subband-filtering model, the BDCT coefficients can be concatenated to form structures similar to WT subbands. Our theoretical studies show that the concatenated BDCT and WT filters share common characteristics in terms of passband regions, magnitude spectra and energy spectra. In particular, their low-pass filters are identical for Haar wavelets and highly similar for other wavelet kernels. Although compression can affect the features that can be extracted, our experimental results confirm that common features can always be extracted from JPEG- and JPEG 2000-compressed domains irrespective of the compression ratio and the type of WT kernel used. As a result, similar JPEG-compressed and JPEG 2000-compressed images can be retrieved from one another without requiring a full decompression.

Introduction

With the rapid growth of the Internet and multimedia systems, the use of visual information has increased enormously, and image indexing and retrieval techniques have become correspondingly important. These techniques extract meaningful information (features) from an image so that images can be classified and retrieved efficiently based on their contents. Examples of application areas include video-on-demand applications and multimedia information systems.

Over recent decades, retrieval systems such as QBIC, Virage, Photobook and VisualSEEK have been proposed [1], [2], [3], [4], [5], [6], [7], [8]. All of these systems use features extracted from the spatial domain for indexing. In practice, however, images are often compressed using JPEG or JPEG 2000 to reduce their size for storage and transmission [9], [10], [11]. Retrieving such compressed images then requires conversion back to the uncompressed spatial domain for feature extraction and analysis. This approach requires many decompression operations, especially for large image archives. To avoid some of these operations, it was proposed that feature extraction be done directly in the transformed domains. For example, DCT-based indexing techniques extract DCT coefficients in JPEG for indexing [12], [13], [14], [15], [16], [17], [18], [19], while wavelet-based indexing techniques use wavelet coefficients as features [20], [21], [22], [23], [24], [25], [26].

Different compression techniques result in different transformed coefficients, so the features that can be extracted in the compressed domain depend greatly on the compression scheme used. It would be highly desirable to identify features that are common to the uncompressed spatial domain and the DCT and wavelet transform (WT) domains, so that indexing can be done in the original domain of each image without incurring a full decompression. Direct comparison between the DCT and WT domains cannot easily reveal their relationship. We propose a subband-filtering model that can be used to study the transformed outputs of the JPEG and JPEG 2000 schemes theoretically. This model helps quantify how similar the two transformed outputs are and thus helps identify common features in the two transformed domains. Because compression can seriously affect image quality, it also affects the features that can be extracted in the compressed domains. To test the effect of compression on the common features found in the two transformed domains, a retrieval system was built that consisted of JPEG-compressed and JPEG 2000-compressed images at different compression ratios.

This paper is organized as follows. Section 2 briefly discusses the DCT in JPEG and the WT in JPEG 2000 compression schemes. Our proposed subband-filtering model, which relates DCT and WT, is described in Section 3. Using this model, the similarity between DCT and WT subbands is examined in Section 4. Section 5 then studies the effect of compression on the subband similarity, while Section 6 describes the features that are extracted in the different compressed domains for indexing. Our experimental results on retrieving JPEG-compressed and JPEG 2000-compressed images are given in Section 7. Finally, Section 8 concludes the paper.


Discrete cosine transform and wavelet transform

Both JPEG and JPEG 2000 perform compression by first transforming images into the spatial-spectral domain. However, a fundamental difference between JPEG and JPEG 2000 is their transforms; the former uses a block-based discrete cosine transform (BDCT) while the latter uses a WT. In the JPEG scheme, an image is sub-divided into 8×8 blocks and the DCT is then applied independently to each block. Let $x_{m,n}(i,j)$ be the input pixel at $(i,j)$ and $(u,v)$ be the frequency index, then the DCT coefficient in
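As a minimal sketch of the block-based transform described above, the following Python code applies an orthonormal 8×8 DCT-II independently to each block of an image. The function names (`dct_matrix`, `dct2_block`, `block_dct`) and the list-of-lists image layout are illustrative assumptions rather than anything taken from the paper, and the level shift and quantization steps of JPEG are deliberately omitted.

```python
import math

BLOCK = 8  # JPEG block size


def dct_matrix(n=BLOCK):
    """Return the n x n orthonormal DCT-II matrix c; c[u][i] is sample i of basis u."""
    matrix = []
    for u in range(n):
        scale = math.sqrt((1.0 if u == 0 else 2.0) / n)
        matrix.append([scale * math.cos((2 * i + 1) * u * math.pi / (2 * n))
                       for i in range(n)])
    return matrix


def dct2_block(block, c):
    """Separable 2D DCT of one square block: transform one axis, then the other."""
    n = len(block)
    # tmp[u][j] = sum_i c[u][i] * block[i][j]
    tmp = [[sum(c[u][i] * block[i][j] for i in range(n)) for j in range(n)]
           for u in range(n)]
    # out[u][v] = sum_j tmp[u][j] * c[v][j]
    return [[sum(tmp[u][j] * c[v][j] for j in range(n)) for v in range(n)]
            for u in range(n)]


def block_dct(image, n=BLOCK):
    """Apply the DCT independently to each n x n block of the image.

    Returns a dict mapping the block position (m, k) to its coefficient block.
    Assumes (for this sketch) that the height and width are multiples of n.
    """
    c = dct_matrix(n)
    coeffs = {}
    for m in range(len(image) // n):
        for k in range(len(image[0]) // n):
            block = [row[k * n:(k + 1) * n] for row in image[m * n:(m + 1) * n]]
            coeffs[(m, k)] = dct2_block(block, c)
    return coeffs
```

For a constant block of ones, only the DC coefficient is non-zero and it equals 8, because the orthonormal scaling divides the sum of the 64 samples by 8.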

Subband-filtering model

A direct comparison of the outputs from BDCT and WT cannot reveal their relationship. In this section, we will develop a subband-filtering model that relates their outputs using low-pass/bandpass filters and downsampling operations. Since the 2D transform is obtained from 1D row followed by 1D column transforms, our formulation is derived in 1D for simplicity.
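The 1D formulation can be illustrated with a small sketch. Assuming an orthonormal 8-point DCT-II, the u-th coefficient of every block, concatenated over the blocks, equals the input filtered by the time-reversed u-th DCT basis vector and then downsampled by 8. The sketch also checks the special case noted in the abstract: the concatenated DC coefficients coincide with a three-level Haar low-pass output. All function names are hypothetical, and the paper's own filter definitions may differ in normalization or delay.

```python
import math


def dct_basis(u, n=8):
    """u-th orthonormal DCT-II basis vector of length n."""
    scale = math.sqrt((1.0 if u == 0 else 2.0) / n)
    return [scale * math.cos((2 * i + 1) * u * math.pi / (2 * n)) for i in range(n)]


def concatenated_subband(x, u, n=8):
    """Collect the u-th DCT coefficient of each non-overlapping block of x."""
    basis = dct_basis(u, n)
    return [sum(b * s for b, s in zip(basis, x[m:m + n]))
            for m in range(0, len(x) - n + 1, n)]


def filter_and_downsample(x, u, n=8):
    """Filter x with the time-reversed basis, then keep every n-th output."""
    f = dct_basis(u, n)[::-1]
    # Causal convolution y[t] = sum_k f[k] * x[t - k], zero outside the signal.
    y = [sum(f[k] * x[t - k] for k in range(n) if 0 <= t - k < len(x))
         for t in range(len(x))]
    # Sample at t = n-1, 2n-1, ... so each output sees exactly one block.
    return y[n - 1::n]


def haar_lowpass(x, levels=3):
    """Approximation branch of an orthonormal Haar transform, repeated `levels` times."""
    for _ in range(levels):
        x = [(x[2 * i] + x[2 * i + 1]) / math.sqrt(2.0) for i in range(len(x) // 2)]
    return x
```

Three Haar levels divide the sum of 8 samples by (√2)³ = √8, which is exactly the scaling of the 8-point DCT's DC coefficient, so `concatenated_subband(x, 0)` and `haar_lowpass(x, 3)` agree up to rounding.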

Spectral characteristics of BDCT and WT filters

Using the subband-filtering model, filters $\tilde{F}_{u,\mathrm{DCT}}(z)$ and $\tilde{F}_{i,\mathrm{WT}}(z)$ can be used to study the similarity in outputs from BDCT and WT. A comparative study is performed to discover their similarities in terms of the passband regions, magnitude spectrum and energy spectrum. To quantify similarity between filters $A(k)$ and $B(k)$, the similarity measure in Ref. [28] is used, which is defined as

$$S=\max_{j}\frac{\sum_{k=0}^{N-1}|A(k)|\,|B(k-j)|}{\sqrt{\sum_{k=0}^{N-1}|A(k)|^{2}\,\sum_{k=0}^{N-1}|B(k)|^{2}}},$$

where $S$ lies between 0 and 1. A large S means A(k) and B
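A minimal sketch of this similarity measure follows. It reads the definition as a normalized cross-correlation of the magnitudes |A(k)| and |B(k)|, maximized over the shift j. Two points are assumptions of this sketch rather than statements from the paper: the denominator is taken as the square root of the product of the two energies (which is what keeps S between 0 and 1), and the shifted index k − j is taken modulo N. The function name `similarity` is hypothetical.

```python
import math


def similarity(a, b):
    """Shift-maximized normalized correlation of |a| and |b| (equal lengths)."""
    n = len(a)
    if len(b) != n:
        raise ValueError("a and b must have the same length")
    mag_a = [abs(v) for v in a]  # abs() also handles complex spectra
    mag_b = [abs(v) for v in b]
    norm = math.sqrt(sum(v * v for v in mag_a) * sum(v * v for v in mag_b))
    if norm == 0.0:
        return 0.0  # convention chosen here for all-zero inputs
    # Assumption: the shift is circular, i.e. the index k - j is taken modulo n.
    best = max(sum(mag_a[k] * mag_b[(k - j) % n] for k in range(n))
               for j in range(n))
    return best / norm
```

For example, `similarity([1, 0, 0, 0], [1, 1, 1, 1])` evaluates to 0.5, while any sequence compared with a circularly shifted copy of itself gives 1, since the maximization over j removes the offset.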

Common feature analysis in JPEG- and JPEG 2000-compressed images

Section 4 shows that the four pairs of BDCT and WT filters share similar spectral characteristics. In this section, we examine the similarity of outputs from these four filter pairs. The effect of compression on the outputs will also be studied. In particular, 67 images from four image classes have been compressed by JPEG and JPEG 2000 at compression ratios ranging from 1.6 to 44. If a high degree of similarity is still found under various compression ratios, then common features can be

Common feature extraction

Assume that the image size is N×N and the level of decomposition in WT is three. There are altogether 10 subbands in WT. The coefficients from all the BDCT blocks $X_{m,n}(u,v)$ with the same u and v are concatenated to form 64 blocks. These 64 blocks must then be rearranged into 10 blocks so that they can be compared directly with the 10 subbands in WT. Using our results in Section 4.2, the rearrangement can be written as

$$B_j=\{\tilde{X}_{m,n}(\varepsilon_j)\mid \varepsilon_j\in(u,v),\ u\in n_{u_j},\ v\in n_{v_j}\},$$

where $\varepsilon_j$ defines the BDCT coefficient
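The rearrangement of 64 coefficient positions into 10 groups can be sketched as follows. The index sets used in the paper are not visible in this excerpt, so the sketch assumes the common dyadic grouping, in which the 1D indices 0, 1, 2..3 and 4..7 form four frequency bands. The function names and the numbering of the groups from 0 to 9 are hypothetical, and the paper's actual sets may differ.

```python
def band(k):
    """Dyadic band of a 1D frequency index: 0 -> 0, 1 -> 1, 2..3 -> 2, 4..7 -> 3."""
    return k.bit_length()


def subband_id(u, v):
    """Map a DCT frequency index (u, v) in 0..7 to a group id in 0..9 (assumed grouping)."""
    top = max(band(u), band(v))
    if top == 0:
        return 0          # DC coefficient: coarsest low-pass group
    if band(u) < top:
        orientation = 0   # only v lies in the highest band reached
    elif band(v) < top:
        orientation = 1   # only u lies in the highest band reached
    else:
        orientation = 2   # both u and v lie in the highest band reached
    return 1 + 3 * (top - 1) + orientation


def group_indices():
    """Return a dict: group id -> list of (u, v) positions belonging to it."""
    groups = {j: [] for j in range(10)}
    for u in range(8):
        for v in range(8):
            groups[subband_id(u, v)].append((u, v))
    return groups


def rearrange(blocks):
    """Collect coefficients of all blocks into the 10 groups.

    blocks: dict mapping a block position (m, n) to its 8x8 coefficient block.
    The spatial layout inside each group is flattened in this sketch.
    """
    groups = group_indices()
    return {j: [blocks[pos][u][v] for pos in sorted(blocks) for (u, v) in idx]
            for j, idx in groups.items()}
```

Under this grouping the groups contain 1, 1, 1, 1, 4, 4, 4, 16, 16 and 16 positions per block. For an N×N image with (N/8)² blocks, that gives group sizes of (N/8)², (N/4)² and (N/2)² coefficients, which match the subband sizes of a three-level WT.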

Simulation results

To test the performance of our proposed methods for common feature extraction in different compressed domains, a retrieval system was built. The system consists of 1800 images from nine image classes as shown in Fig. 5. These images are compressed by both JPEG and JPEG 2000 schemes at seven compression ratios ranging from 1.6 to 72. In our simulations, JPEG 2000-compressed images are used as query images to find the most similar JPEG-compressed images, and vice versa. As

Conclusions

A fundamental difference between JPEG and JPEG 2000 is their transforms; the former uses a block-based discrete cosine transform (BDCT) while the latter uses a wavelet transform (WT). Direct comparison of their filter outputs cannot reveal their relationship. We have developed a subband-filtering model that relates outputs from BDCT and WT using low-pass/bandpass filters and downsampling operations. From this model, it is shown that the low-pass filtering operations in WT and BDCT are very

Acknowledgments

The authors thank the reviewers for their constructive comments. This work is supported by the RGC Grant PolyU 5210/04E, and the Centre for Multimedia Signal Processing (A452), Department of Electronic and Information Engineering, the Hong Kong Polytechnic University. K.M. Au acknowledges the research studentship provided by the University.

About the Author—KA-MAN AU received a B.Eng. (Hons) degree and a M.Phil. degree in Electronic and Information Engineering from the Hong Kong Polytechnic University, in the years 2002 and 2005, respectively. Her research interests include data compression, image processing, pattern recognition, image and video retrieval in compressed domains.

References (29)

  • C.W. Ngo et al., Exploiting image indexing techniques in DCT domain, Pattern Recognition (2001)
  • C.C. Chang et al., Retrieving digital images from a JPEG compressed image database, Image Vision Comput. (2004)
  • S. Climer et al., Image database indexing using JPEG coefficients, Pattern Recognition (2002)
  • M. Flickner et al., Query by image and video content, IEEE Trans. Comput. (1995)
  • A. Gupta et al., Visual information retrieval, Commun. ACM (1997)
  • A. Pentland et al., Photobook: content-based manipulation of image databases, Int. J. Comput. Vision (1996)
  • J. Wei, Color object indexing and retrieval in digital libraries, IEEE Trans. Image Process. (2002)
  • J.L. Shih et al., Color image retrieval based on primitives of colour moments, IEEE Proc. Vision Image Signal Process. (2002)
  • J.R. Smith et al., VisualSEEK: a fully automated content-based image query system, Proc. ACM Multimedia (1996)
  • S. Lim, G. Lu, Spatial statistics for content based image retrieval, International Conference on Information...
  • T. Gevers et al., PicToSeek: combining color and shape invariant features for image retrieval, IEEE Trans. Image Process. (2000)
  • G.K. Wallace, The JPEG still picture compression standard, Commun. ACM (1991)
  • B.E. Usevitch, A tutorial on modern lossy wavelet image compression: foundations of JPEG 2000, IEEE Signal Process. Mag. (2001)
  • D.S. Taubman et al., JPEG2000: Image Compression Fundamentals, Standards and Practice (2002)

About the Author—NGAI-FONG LAW received a B.Eng. degree with first-class honors from the University of Auckland, New Zealand, in 1993 and a Ph.D. degree from the University of Tasmania, Australia, in 1997, both in electrical and electronic engineering. She is currently with the Electronic and Information Engineering Department, Hong Kong Polytechnic University. Her research interests include signal and image processing, wavelet transforms, image enhancement and compression. Recently, she has also been working on video searching for Internet applications and Bioinformatics.

About the Author—WAN-CHI SIU received an Associateship from The Hong Kong Polytechnic University (formerly called the Hong Kong Polytechnic), a M.Phil. degree from The Chinese University of Hong Kong, and a Ph.D. degree from the Imperial College of Science, Technology, and Medicine, London, U.K. in 1975, 1977, and 1984, respectively. He was with The Chinese University of Hong Kong between 1975 and 1980. He then joined The Hong Kong Polytechnic University as a Lecturer in 1980 and became Chair Professor in 1992. He was the Head of Department of Electronic and Information Engineering and subsequently became Dean of the Engineering Faculty between 1994 and 2002. He is now the Director of the Centre for Multimedia Signal Processing at the same university. He has published over 200 research papers. His research interests include DSP, fast algorithms, transforms, wavelets, image and video coding, and computational aspects of pattern recognition and neural networks. He is a Member of the Editorial Board of the Journal of VLSI Signal Processing Systems for signal, image and video technology and the EURASIP Journal on Applied Signal Processing.

Dr. Siu was a Guest Editor of a Special Issue of the IEEE Transactions on Circuits and Systems II, published in May 1998, and was an Associate Editor of the same journal from 1995 to 1997. He has been the general chair or the technical program chair of a number of international conferences. In particular, he was the Technical Program Chair of the IEEE International Symposium on Circuits and Systems (ISCAS’97) and the General Chair of the International Symposium on Intelligent Multimedia, Video, and Speech Processing (ISIMP’2001), which were held in Hong Kong in June 1997 and May 2001, respectively. He was the General Chair of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP’2003), which was held in Hong Kong. Between 1991 and 1995, he was a member of the Physical Sciences and Engineering Panel of the Research Grants Council (RGC), Hong Kong Government, and in 1994, he chaired the first Engineering and Information Technology Panel to assess the research quality of 19 Cost Centers (departments) from all universities in Hong Kong. He is a Chartered Engineer and a Fellow of both the IEE and the HKIE.
