Volume 16, Issue 2 (9-2019) | JSDP 2019, 16(2): 41-60


Kiani V, Harati A, Vahedian A. Planelet Transform: A New Geometrical Wavelet for Compression of Kinect-like Depth Images. JSDP 2019; 16(2): 41-60
URL: http://jsdp.rcisp.ac.ir/article-1-564-en.html
Ferdowsi University of Mashhad
Abstract:

With the advent of cheap indoor RGB-D sensors, a proper representation of piecewise planar depth images is crucial for an effective compression method. Although geometrical wavelets exist for the optimal representation of piecewise constant and piecewise linear images (i.e., wedgelets and platelets), an adaptation to the piecewise linear fractional functions that describe depth variation over planar regions is still missing. Such planar regions constitute major portions of indoor depth images and need to be represented well to allow a desirable rate-distortion trade-off.
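As a brief aside on why planar surfaces give rise to linear fractional depth profiles, the standard pinhole-camera argument (not spelled out in the abstract; the symbols $f$, $(n_x, n_y, n_z)$, and $\rho$ are introduced here only for illustration) goes as follows. A 3D plane $n_x X + n_y Y + n_z Z = \rho$ viewed through a camera with focal length $f$ satisfies

\[
X = \frac{xZ}{f}, \qquad Y = \frac{yZ}{f}
\quad\Longrightarrow\quad
Z(x,y) = \frac{\rho f}{n_x x + n_y y + n_z f} = \frac{1}{\alpha x + \beta y + \gamma},
\]

with $\alpha = n_x/(\rho f)$, $\beta = n_y/(\rho f)$, and $\gamma = n_z/\rho$. The depth over a planar region is therefore the reciprocal of an affine function of the pixel coordinates $(x,y)$, a linear fractional form that constant (wedgelet) and linear (platelet) models can only approximate.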
In this paper, second-order planelet transform is introduced as an optimal representation for piecewise planar depth images with sharp edges along smooth curves. Also, to speed up the computation of planelet approximation of depth images, an iterative estimation procedure is described based on non-linear least squares and discontinuity relaxation. The computed approximation is fed to a rate-distortion optimized quad-tree based encoder; and the pruned quadtree is encoded into the bit-stream. Spatial horizontal and vertical plane prediction modes are also introduced to further exploit geometric redundancy of depth images and increase the compression ratio.
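As a minimal sketch of the two computational ingredients named above, a per-block non-linear least-squares fit of a linear fractional depth model and bottom-up rate-distortion pruning of a quadtree, the following Python fragment may help. The block model z ≈ 1/(a·x + b·y + c), the function names, and the assumed leaf rate are illustrative assumptions, not the authors' implementation (which additionally handles edge discontinuities and relaxation).

import numpy as np
from scipy.optimize import least_squares

def fit_planelet_block(z, eps=1e-6):
    # Fit z(x, y) ~ 1 / (a*x + b*y + c) to one depth block (illustrative model).
    h, w = z.shape
    y, x = np.mgrid[0:h, 0:w].astype(float)
    inv_z = 1.0 / np.maximum(z.astype(float), eps)       # inverse depth is roughly affine on a plane
    A = np.column_stack([x.ravel(), y.ravel(), np.ones(x.size)])
    p0, *_ = np.linalg.lstsq(A, inv_z.ravel(), rcond=None)   # linear initialization in 1/z

    def residuals(p):
        denom = np.maximum(p[0] * x + p[1] * y + p[2], eps)
        return (1.0 / denom - z).ravel()

    p = least_squares(residuals, p0).x                    # non-linear refinement in the depth domain
    z_hat = 1.0 / np.maximum(p[0] * x + p[1] * y + p[2], eps)
    return p, z_hat

def prune_quadtree(z, lam, rate_leaf=48, min_size=4):
    # Bottom-up Lagrangian pruning: keep a split only if it lowers J = D + lam * R.
    p, z_hat = fit_planelet_block(z)
    cost_leaf = float(np.sum((z - z_hat) ** 2)) + lam * rate_leaf   # rate_leaf: assumed bits per leaf
    h, w = z.shape
    if min(h, w) <= min_size:
        return cost_leaf, ("leaf", p)
    quads = [z[:h//2, :w//2], z[:h//2, w//2:], z[h//2:, :w//2], z[h//2:, w//2:]]
    children = [prune_quadtree(q, lam, rate_leaf, min_size) for q in quads]
    cost_split = sum(c for c, _ in children) + lam        # + lam for the split flag
    if cost_split < cost_leaf:
        return cost_split, ("split", [t for _, t in children])
    return cost_leaf, ("leaf", p)

A top-level call such as prune_quadtree(depth, lam=0.1) returns the Lagrangian cost together with the pruned tree; larger values of the multiplier lam trade quality for a lower bit-rate.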
The performance of the proposed planelet-based coder is compared with wedgelets, platelets, and general image encoders on synthetic and real-world Kinect-like depth images. The synthetic dataset consists of 30 depth images of different scenes, manually selected from eight video sequences of the ICL-NUIM RGB-D benchmark dataset. The real-world dataset likewise includes 30 depth images of indoor scenes captured by Kinect-like cameras, selected from the Washington RGB-D Scenes V2 dataset.
In contrast to former geometrical wavelets, which approximate the smooth regions of an image with constant or linear functions, the planelet transform exploits a non-linear model based on linear fractional functions to approximate every smooth region. Visual comparisons by 3D surface reconstruction and visualization of the decoded depth images as surface plots revealed that, at a given bit-rate, the planelet-based coder preserves the geometric structure of the scene better than the former geometrical wavelets and the general image coders.
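For readers who want to reproduce this kind of qualitative check, a decoded depth image can be rendered as a surface plot with standard matplotlib calls; the file name below is a placeholder, and the snippet is only a viewing aid, not part of the proposed coder.

import numpy as np
import matplotlib.pyplot as plt

depth = np.load("decoded_depth.npy")          # placeholder: any decoded H x W depth array
h, w = depth.shape
y, x = np.mgrid[0:h, 0:w]

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.plot_surface(x, y, depth, cmap="viridis", linewidth=0)   # render the depth map as a surface
ax.set_xlabel("x"); ax.set_ylabel("y"); ax.set_zlabel("depth")
plt.show()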
Numerical evaluations showed that compression of synthetic depth images by planelets yields considerable PSNR improvements of 0.83 dB and 6.92 dB over platelets and wedgelets, respectively. Due to the absence of noise, the plane prediction modes were very successful on synthetic images and widened the PSNR gap over platelets and wedgelets to 5.73 dB and 11.82 dB, respectively. The proposed compression scheme also performed well on real-world depth images: compared with wedgelets, the planelet-based coder with spatial prediction achieved a noticeable quality improvement of 2.7 dB at a bit-rate of 0.03 bpp, and it improved on platelets by 1.46 dB at the same bit-rate. In this experiment, the planelet-based coder also gained 2.59 dB and 1.56 dB in PSNR over the JPEG2000 and H.264 general image coders, respectively. Similar results were achieved in terms of the SSIM metric.
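The PSNR and SSIM figures above are the usual full-reference quality metrics; a minimal way to compute them, assuming 8-bit depth images and scikit-image for SSIM, is sketched below.

import numpy as np
from skimage.metrics import structural_similarity as ssim

def psnr(ref, rec, peak=255.0):
    # Peak signal-to-noise ratio in dB between two equally sized images.
    mse = np.mean((ref.astype(float) - rec.astype(float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# Example use: psnr(original_depth, decoded_depth) and ssim(original_depth, decoded_depth, data_range=255)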
 

Full-Text [PDF 5386 kb]
Type of Study: Research | Subject: Paper
Received: 2017/05/05 | Accepted: 2019/01/09 | Published: 2019/09/17 | ePublished: 2019/09/17


Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.