Abstract
In the current decade, scalability has been introduced into video coding (VC) schemes to respond to end-user demands and the heterogeneity of networks. In this paper, a low bit-rate signal-to-noise ratio (SNR) scalable VC scheme based on dictionary learning (DL) and sparse representation is proposed. A notable feature of SNR scalability, compared to spatial and temporal scalability, is that it places no limit on the number of enhancement layers, which makes it easier to adapt to different conditions. In this research, unlike traditional VC, in which the discrete cosine transform (DCT) coefficients of video signals are quantized to obtain different SNR qualities, sparse codes are applied. Sparse coding is done over trained overcomplete dictionaries, for which three different DL algorithms, namely MOD, K-SVD, and RLS-DLA, are utilized and compared. The dictionaries are trained in the DCT domain of general natural images, to achieve higher compression and prevent blocking artifacts. The results of the proposed method are compared with non-scalable coding based on DL, and with scalable and non-scalable coding schemes based on the complete DCT dictionary employed in traditional VC standards such as MPEG.X and H.26X. The results show that, although video scalability naturally decreases quality compared to non-scalable coding, the proposed scheme delivers superior subjective and rate–distortion performance compared to both non-scalable and scalable VC based on traditional DCT quantization. Moreover, among the three DL methods applied, RLS-DLA achieves the best results for both non-scalable and scalable VC.
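The layered idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a fixed overcomplete DCT dictionary instead of one trained with MOD, K-SVD, or RLS-DLA, a plain orthogonal matching pursuit for sparse coding, and no quantization or entropy coding. The function names (`overcomplete_dct`, `omp`, `encode_layers`, `decode`) and the atoms-per-layer settings are hypothetical and chosen only for illustration.

```python
import numpy as np

def overcomplete_dct(n, k):
    """Return an n x k overcomplete DCT dictionary with unit-norm columns (k > n)."""
    t = np.arange(n)[:, None]
    f = np.arange(k)[None, :]
    D = np.cos(np.pi * (t + 0.5) * f / k)
    return D / np.linalg.norm(D, axis=0)

def omp(D, y, n_atoms):
    """Orthogonal matching pursuit: approximate y with at most n_atoms columns of D."""
    residual = y.copy()
    support = []
    for _ in range(n_atoms):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # Refit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

def encode_layers(D, y, atoms_per_layer):
    """Assumed layering: each layer sparse-codes the residual left by the layers below."""
    layers = []
    residual = y.copy()
    for n_atoms in atoms_per_layer:
        x = omp(D, residual, n_atoms)
        layers.append(x)
        residual = residual - D @ x
    return layers

def decode(D, layers, n_layers):
    """Reconstruct from the base layer plus the first n_layers - 1 enhancement layers."""
    return sum(D @ x for x in layers[:n_layers])
```

In this sketch, a call such as `encode_layers(D, y, [2, 2, 2])` yields a base layer and two enhancement layers, and decoding with more layers gives a reconstruction error that does not increase. That mirrors the property that SNR scalability does not restrict the number of enhancement layers.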
Notes
Internet Protocol Television.
Hyper Text Transfer Protocol.
Group of Pictures.
Joint Scalable Video Model (JSVM) reference software for SVC. Available online: CVS server garcon.ient.rwth-aachen.de.
About this article
Cite this article
Irannejad, M., Mahdavi-Nasab, H. Low bit-rate SNR scalable video coding based on overcomplete dictionary learning and sparse representation. Multidim Syst Sign Process 31, 465–489 (2020). https://doi.org/10.1007/s11045-019-00671-6
DOI: https://doi.org/10.1007/s11045-019-00671-6