DOI: 10.1145/3605573.3605588
research-article

Recoil: Parallel rANS Decoding with Decoder-Adaptive Scalability

Published: 13 September 2023

ABSTRACT

Entropy coding is essential to data compression and to image and video coding. The Range variant of Asymmetric Numeral Systems (rANS) is a modern entropy coder, featuring superior speed and compression rate. Because rANS is not designed for parallel execution, the conventional approach to parallel rANS partitions the input symbol sequence and encodes each partition with an independent codec, so every additional partition adds overhead. This approach is found in state-of-the-art implementations such as DietGPU. It is unsuitable for content-delivery applications: if the decoder cannot decode all the partitions in parallel, the extra parallelism is wasted, yet all of the overhead is still transferred.

To solve this, we propose Recoil, a parallel rANS decoding approach with decoder-adaptive scalability. We discover that a single rANS-encoded bitstream can be decoded from an arbitrary position if the intermediate states are known. After renormalization, these states also have a smaller upper bound, so they can be stored efficiently. We then split the encoded bitstream using a heuristic to evenly distribute the workload, and store the intermediate states and corresponding symbol indices as metadata. The splits can then be combined simply by eliminating extra metadata entries.
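
The idea of restarting a decoder from stored intermediate states can be illustrated with a minimal Python sketch. It is not the Recoil implementation: the byte-wise renormalization, the 8-bit probability resolution, the fixed-interval checkpoint placement (the `every` parameter), and all function names are assumptions made for illustration. The sketch also stores full states rather than exploiting the smaller post-renormalization bound. The encoder processes symbols in forward order and records (symbol index, state, byte offset) tuples; each segment is then decoded backward from its checkpoint.

```python
PROB_BITS = 8                # assumed: frequencies sum to 2**PROB_BITS
M = 1 << PROB_BITS
RANS_L = 1 << 23             # assumed lower bound of the normalized state


def build_tables(freqs):
    """Build cumulative frequencies and a slot -> symbol lookup table."""
    assert sum(freqs.values()) == M
    cum, slot_to_sym, acc = {}, [], 0
    for sym, f in freqs.items():
        cum[sym] = acc
        slot_to_sym.extend([sym] * f)
        acc += f
    return cum, slot_to_sym


def encode(symbols, freqs, cum, every):
    """Encode forward; record a checkpoint after every `every` symbols."""
    x, out, checkpoints = RANS_L, bytearray(), []
    for i, sym in enumerate(symbols):
        f = freqs[sym]
        x_max = ((RANS_L >> PROB_BITS) << 8) * f
        while x >= x_max:            # renormalize: emit low bytes
            out.append(x & 0xFF)
            x >>= 8
        x = (x // f) * M + (x % f) + cum[sym]
        if (i + 1) % every == 0 or i == len(symbols) - 1:
            # Metadata: symbol index, state, bytes written so far.
            checkpoints.append((i, x, len(out)))
    return bytes(out), checkpoints


def decode_segment(data, freqs, cum, slot_to_sym, checkpoint, stop_index):
    """Decode symbols checkpoint index .. stop_index (inclusive), backward."""
    i, x, pos = checkpoint
    result = []
    while i >= stop_index:
        slot = x & (M - 1)
        sym = slot_to_sym[slot]
        x = freqs[sym] * (x >> PROB_BITS) + slot - cum[sym]
        while x < RANS_L:            # renormalize: read bytes in reverse
            pos -= 1
            x = (x << 8) | data[pos]
        result.append(sym)
        i -= 1
    result.reverse()                 # restore forward symbol order
    return result


def decode(data, freqs, cum, slot_to_sym, checkpoints):
    """Decode every segment; the segments do not depend on each other."""
    out, lo = [], 0
    for cp in checkpoints:
        out.extend(decode_segment(data, freqs, cum, slot_to_sym, cp, lo))
        lo = cp[0] + 1
    return out


if __name__ == "__main__":
    freqs = {"a": 128, "b": 64, "c": 48, "d": 16}
    cum, table = build_tables(freqs)
    message = list("abacabadabacabaccbd" * 50)
    data, cps = encode(message, freqs, cum, every=100)
    assert decode(data, freqs, cum, table, cps) == message
    # Fewer metadata entries: same bitstream, fewer and longer segments.
    fewer = cps[:-1][::3] + cps[-1:]
    assert decode(data, freqs, cum, table, fewer) == message
```

Segments are decoded sequentially here for brevity. Since each call to `decode_segment` reads only its own checkpoint and a disjoint byte range, the calls could be dispatched to separate threads. Dropping entries from `checkpoints` while keeping the last one yields fewer, longer segments that decode to the same output, which mirrors how splits are combined by eliminating metadata entries.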

The main contribution of Recoil is reducing unnecessary data transfer by adaptively scaling parallelism overhead to match the decoder capability. The experiments show that Recoil decoding throughput is comparable to the conventional approach, scaling massively on CPUs and GPUs and greatly outperforming various other ANS-based codecs.

        • Published in

          ICPP '23: Proceedings of the 52nd International Conference on Parallel Processing
          August 2023
          858 pages
          ISBN: 9798400708435
          DOI: 10.1145/3605573

          Copyright © 2023 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Qualifiers

          • research-article
          • Research
          • Refereed limited

          Acceptance Rates

          Overall Acceptance Rate: 91 of 313 submissions, 29%