Low bit-rate SNR scalable video coding based on overcomplete dictionary learning and sparse representation

Irannejad, Maziar; Mahdavi-Nasab, Homayoun

doi:10.1007/s11045-019-00671-6

Published: 25 July 2019

Volume 31, pages 465–489, (2020)

Multidimensional Systems and Signal Processing


Abstract

In the current decade, scalability has been developed in video coding (VC) schemes to respond to end-user demands and the heterogeneity of networks. In this paper, a low bit-rate signal-to-noise ratio (SNR) scalable VC scheme based on dictionary learning (DL) and sparse representation is proposed. A notable feature of SNR scalability, compared with the spatial and temporal versions, is that the number of enhancement layers is not limited, which makes it easier to adapt to different conditions. Unlike traditional VC, in which the discrete cosine transform (DCT) coefficients of video signals are quantized to obtain different SNR qualities, the proposed method uses sparse codes. Sparse coding is performed over trained overcomplete dictionaries, for which three DL algorithms, namely MOD, K-SVD, and RLS-DLA, are used and compared. The dictionaries are trained in the DCT domain of general natural images to achieve higher compression and prevent blocking artifacts. The results of the proposed method are compared with non-scalable coding based on DL, and with scalable and non-scalable coding schemes based on the complete DCT dictionary employed in traditional VC standards such as MPEG.X and H.26X. The results show that, although scalability naturally reduces quality relative to non-scalable coding, the proposed scheme delivers better subjective and rate–distortion performance than both non-scalable and scalable VC based on traditional DCT quantization. Moreover, among the three DL methods applied, RLS-DLA achieves the best results for both non-scalable and scalable VC.
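The idea of obtaining several SNR qualities from sparse codes, rather than from requantized DCT coefficients, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the layering rule, so the sketch assumes that each enhancement layer sparse-codes the residual left by the previous layers. It also substitutes a fixed overcomplete DCT dictionary for a dictionary trained with MOD, K-SVD, or RLS-DLA, and uses a plain orthogonal matching pursuit (OMP) as the sparse coder. All function names and the atom counts per layer are hypothetical.

```python
import numpy as np

def overcomplete_dct_dictionary(n=8, k=11):
    """Separable overcomplete 2-D DCT dictionary with n*n-dim atoms, k*k atoms.
    Stand-in (assumption) for a learned dictionary."""
    d1 = np.zeros((n, k))
    for j in range(k):
        atom = np.cos(np.arange(n) * j * np.pi / k)
        if j > 0:
            atom -= atom.mean()          # remove DC from non-constant atoms
        d1[:, j] = atom / np.linalg.norm(atom)
    return np.kron(d1, d1)               # columns stay unit-norm

def omp(D, x, n_atoms):
    """Greedy orthogonal matching pursuit using at most n_atoms atoms."""
    residual = x.copy()
    support, coeffs = [], np.zeros(0)
    for _ in range(n_atoms):
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx in support:               # nothing new to gain
            break
        support.append(idx)
        # Re-fit all selected coefficients by least squares.
        coeffs, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coeffs
    return support, coeffs

def layered_sparse_code(D, x, atoms_per_layer):
    """Base layer codes x; each enhancement layer codes the remaining residual
    (assumed layering rule, for illustration only)."""
    layers = []
    residual = np.asarray(x, dtype=float).copy()
    for s in atoms_per_layer:
        support, coeffs = omp(D, residual, s)
        layers.append((support, coeffs))
        residual = residual - D[:, support] @ coeffs
    return layers

def reconstruct(D, layers, n_layers):
    """Decode using only the first n_layers layers."""
    x_hat = np.zeros(D.shape[0])
    for support, coeffs in layers[:n_layers]:
        x_hat += D[:, support] @ coeffs
    return x_hat

# Demo on one vectorized 8x8 block of synthetic data.
rng = np.random.default_rng(0)
D = overcomplete_dct_dictionary()
block = rng.standard_normal(64)
layers = layered_sparse_code(D, block, atoms_per_layer=[2, 3, 5])
errors = [np.linalg.norm(block - reconstruct(D, layers, l)) for l in (1, 2, 3)]
print(errors)
```

Because every layer is a least-squares fit to the current residual, decoding more layers can never increase the reconstruction error, which is the behaviour expected of SNR enhancement layers. A real codec would additionally quantize and entropy-code the atom indices and coefficients, which this sketch omits.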

Notes

IPTV: Internet Protocol Television.
HTTP: Hypertext Transfer Protocol.
GOP: Group of Pictures.
Joint Scalable Video Model (JSVM) reference software for SVC. Available online: CVS server garcon.ient.rwth-aachen.de.

Author information

Authors and Affiliations

Department of Electrical Engineering, Najafabad Branch, Islamic Azad University, Najafabad, Iran
Maziar Irannejad & Homayoun Mahdavi-Nasab
Digital Processing and Machine Vision Research Center, Najafabad Branch, Islamic Azad University, Najafabad, Iran
Maziar Irannejad & Homayoun Mahdavi-Nasab

Corresponding author

Correspondence to Homayoun Mahdavi-Nasab.

About this article

Cite this article

Irannejad, M., Mahdavi-Nasab, H. Low bit-rate SNR scalable video coding based on overcomplete dictionary learning and sparse representation. Multidim Syst Sign Process 31, 465–489 (2020). https://doi.org/10.1007/s11045-019-00671-6

Received: 04 June 2018
Revised: 20 May 2019
Accepted: 22 July 2019
Published: 25 July 2019
Issue Date: April 2020
DOI: https://doi.org/10.1007/s11045-019-00671-6
