doi:10.1016/j.image.2007.03.006
Copyright © 2007 Elsevier B.V. All rights reserved.
The discrete modal transform and its application to lossy image compression
Department of Informatics, Aristotle University of Thessaloniki, Box 451, 54124 Thessaloniki, Greece
Received 30 June 2006;
revised 16 March 2007;
accepted 23 March 2007.
Available online 19 April 2007.
Abstract
This paper introduces the discrete modal transform (DMT), a discrete, non-separable transform for 1D and 2D signal processing, which, in the mathematical sense, is a generalization of the well-known discrete cosine transform (DCT). A 3D deformable surface model is used to represent the image intensity, and the introduced discrete transform is a by-product of the explicit equations governing the surface deformation. The properties of the proposed transform are similar to those of the DCT. To illustrate these properties, the proposed transform is applied to lossy image compression and the results are compared with those of a DCT-based compression scheme. Experimental results show that the DMT, which includes an embedded compression ratio selection mechanism, has excellent energy compaction properties, achieves compression results comparable to those of the DCT at low compression ratios, and is in general better than the DCT at high compression ratios.
Keywords: Signal transforms; Discrete cosine transform; 3D deformable models; Intensity surface; Lossy compression; Image decomposition; Signal analysis
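The DCT baseline and the evaluation quantities named in the abstract (energy compaction, percentage of non-zero coefficients, PSNR) can be illustrated with a minimal sketch. It uses only the standard orthonormal 1D DCT-II, not the DMT, whose definition is not given in this excerpt. The sample block, the keep-the-largest-coefficients rule and all function names are assumptions made for illustration and are not taken from the paper.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis: row k is the k-th basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def forward(basis, x):
    """Project the signal onto every basis vector."""
    return [sum(b * v for b, v in zip(row, x)) for row in basis]

def inverse(basis, coeffs):
    """Rebuild the signal; valid because the basis is orthonormal."""
    n = len(coeffs)
    return [sum(basis[k][i] * coeffs[k] for k in range(n)) for i in range(n)]

def keep_largest(coeffs, keep):
    """Zero all but the `keep` largest-magnitude coefficients (illustrative rule)."""
    order = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    kept = set(order[:keep])
    return [c if k in kept else 0.0 for k, c in enumerate(coeffs)]

def psnr(original, approx, peak=255.0):
    """Peak signal-to-noise ratio in dB for 8-bit intensities."""
    mse = sum((a - b) ** 2 for a, b in zip(original, approx)) / len(original)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

def energy_fraction(coeffs, count):
    """Fraction of total energy held by the first `count` coefficients."""
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:count]) / total

# Hypothetical row of 8 pixel intensities (not data from the paper).
block = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]
basis = dct_matrix(len(block))
coeffs = forward(basis, block)
sparse = keep_largest(coeffs, keep=3)
restored = inverse(basis, sparse)
nonzero_pct = 100.0 * sum(1 for c in sparse if c != 0.0) / len(sparse)

print(f"non-zero coefficients: {nonzero_pct:.1f}%")
print(f"PSNR: {psnr(block, restored):.2f} dB")
print(f"energy in 2 lowest-frequency coefficients: {energy_fraction(coeffs, 2):.4f}")
```

Sweeping `keep` from 1 to 8 traces a small rate-distortion curve of the kind plotted for full images in the figures listed below.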
Fig. 1. (a) Facial image, (b) intensity surface representation of the image.
Fig. 2. (a) Quadrilateral surface (mesh) model, (b) example of a 3D surface model comprised of 8 nodes of mass m connected with identical springs of stiffness and damping coefficient c. Three forces act on three model nodes and result in model deformation.
Fig. 3. (a) 2D open curve model, (b) example of a 2D curve model comprised of 5 nodes of mass m connected with identical springs of stiffness and damping coefficient c. Two forces act on two model nodes and result in model deformation.
Fig. 4. The denominator of equation (30) for Nh=Nw=8 and λ=1.
Fig. 5. The basis vectors of block size 8 of the 1D DMT (λ=1) and the DCT.
Fig. 6. (a) An outdoor image, (b) DMT coefficients for λ=1.
Fig. 7. (a) A human face, (b) DMT coefficients for λ=1.
Fig. 8. The percentage of the energy compaction is calculated for the white area shown above. This area corresponds to 3% of the frequency domain.
Fig. 9. Decorrelation efficiency (DE) versus ρ. Curves are provided for (a) DMT, λ=1 (DMT1), (b) DMT, λ=10 (DMT10), (c) DMT, λ=30 (DMT30), (d) DCT (DCT), (e) DCT with quantization table (DCTQT).
Fig. 10. Energy packing ability EPA(η) versus ρ for η=2. Curves are provided for (a) DMT, λ=1 (DMT1), (b) DMT, λ=10 (DMT10), (c) DMT, λ=30 (DMT30), (d) DCT (DCT), (e) DCT with quantization table (DCTQT).
Fig. 11. Energy packing ability EPA(η) versus ρ for η=3. Curves are provided for (a) DMT, λ=1 (DMT1), (b) DMT, λ=10 (DMT10), (c) DMT, λ=30 (DMT30), (d) DCT (DCT), (e) DCT with quantization table (DCTQT).
Fig. 12. PSNR and WPSNR between the original and the compressed image versus the percentage of non-zero coefficients for different values of λ and Q for DMT and DCT, respectively, for different test images: (a) Lenna, (b) Mandrill, (c) an outdoor image, (d) an indoor image, (e) a facial image and (f) a studio image.
Fig. 13. The total perceptual error of the Watson metric between the original and the compressed image versus the percentage of non-zero coefficients for different values of λ and Q for DMT and DCT, respectively, for different test images: (a) Lenna, (b) Mandrill, (c) an outdoor image, (d) an indoor image, (e) a facial image and (f) a studio image.
Fig. 14. Application of DMT and DCT to lossy image compression. Factors Q, λ have been selected so that the two algorithms achieve approximately the same compression for each image. (a), (b), (c): Original images. Compressed images using DMT: (d) λ=250, percentage of non-zero coefficients=6% and PSNR=42.80, (e) λ=250, percentage of non-zero coefficients=10% and PSNR=36.59, (f) λ=25, percentage of non-zero coefficients=14.5% and PSNR=40.27. Compressed images using DCT: (g) Q=2, percentage of non-zero coefficients=6% and PSNR=42.77, (h) Q=2, percentage of non-zero coefficients=10% and PSNR=36.45, (i) Q=2, percentage of non-zero coefficients=14.5% and PSNR=40.11.
Fig. 15. Compression ratio–distortion curves for both DMT and DCT, for various test images: (a) a lake image, (b) a house image, (c) an animal image, (d) a child image, (e) a flower image and (f) a portrait image. The distortion is measured in terms of PSNR and WPSNR.
Fig. 16. Compression ratio–distortion curves for both DMT and DCT for various test images: (a) a lake image, (b) a house image, (c) an animal image, (d) a child image, (e) a flower image and (f) a portrait image. The distortion is measured in terms of the total perceptual error of the Watson metric.
Fig. 17. Compression ratio–distortion curves for both DMT and DCT, for various test images depicting: (a) a garden, (b) a basket, (c) the sea, (d) a woman, (e) a human face and (f) a forest. The distortion is measured in terms of PSNR and WPSNR.
Fig. 18. Compression ratio–distortion curves for both DMT and DCT for various test images depicting: (a) a garden, (b) a basket, (c) the sea, (d) a woman, (e) a human face and (f) a forest. The distortion is measured in terms of the total perceptual error of the Watson metric.
Table 1.
The basis images of the 2D DMT for a block size of dimensions Nh=3, Nw=3 and λ=1

Table 2.
The basis images of 2D DCT for a block size of dimensions Nh=3 and Nw=3

Table 3.
Percentage of total energy residing in the 3% low-frequency region (white area of Fig. 8) for DMT and different values of λ

Table 4.
DCT quantization table: luminance
