Skip to main content

Alphabet Partitioning for Compressed Rank/Select and Applications

  • Conference paper
Algorithms and Computation (ISAAC 2010)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 6507))

Included in the following conference series:

Abstract

We present a data structure that stores a string s[1..n] over the alphabet [1..σ] in nH 0(s) + o(n)(H 0(s) + 1) bits, where H 0(s) is the zero-order entropy of s. This data structure supports the queries access and rank in time \(({\mathcal O}{{\rm lg lg}\sigma})\), and the select query in constant time. This result improves on previously known data structures using \(nH_0(s)+o(n\lg\sigma)\) bits, where on highly compressible instances the redundancy \(o(n\lg\sigma)\) cease to be negligible compared to the nH 0(s) bits that encode the data. The technique is based on combining previous results through an ingenious partitioning of the alphabet, and practical enough to be implementable. It applies not only to strings, but also to several other compact data structures. For example, we achieve (i) faster search times and lower redundancy for the smallest existing full-text self-index; (ii) compressed permutations π with times for π() and π − 1() improved to log-logarithmic; and (iii) the first compressed representation of dynamic collections of disjoint sets.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 84.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 109.00
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Andersson, A.: Sorting and searching revisited. In: Karlsson, R., Lingas, A. (eds.) SWAT 1996. LNCS, vol. 1097, pp. 185–197. Springer, Heidelberg (1996)

    Chapter  Google Scholar 

  2. Barbay, J., Golynski, A., Munro, J.I., Rao, S.S.: Adaptive searching in succinctly encoded binary relations and tree-structured documents. Theoretical Computer Science 387(3), 284–297 (2007)

    Article  MathSciNet  MATH  Google Scholar 

  3. Barbay, J., He, M., Munro, J.I., Rao, S.S.: Succinct indexes for strings, binary relations and multi-labeled trees. In: Proc. 18th SODA, pp. 680–689 (2007)

    Google Scholar 

  4. Barbay, J., Navarro, G.: Compressed representations of permutations, and applications. In: Proc. 26th STACS, pp. 111–122 (2009)

    Google Scholar 

  5. Blandford, D., Blelloch, G.: Index compression through document reordering. In: Proc. DCC, pp. 342–351 (2002)

    Google Scholar 

  6. Burrows, M., Wheeler, D.: A block sorting lossless data compression algorithm. Technical Report 124, Digital Equipment Corporation (1994)

    Google Scholar 

  7. Claude, F., Navarro, G.: Practical rank/select queries over arbitrary sequences. In: Amir, A., Turpin, A., Moffat, A. (eds.) SPIRE 2008. LNCS, vol. 5280, pp. 176–187. Springer, Heidelberg (2008)

    Chapter  Google Scholar 

  8. Ferragina, P., Manzini, G., Mäkinen, V., Navarro, G.: Compressed representations of sequences and full-text indexes. ACM Transactions on Algorithms 3(2) (2007)

    Google Scholar 

  9. Gagie, T., Navarro, G., Nekrich, Y.: Fast and compact prefix codes. In: van Leeuwen, J., Muscholl, A., Peleg, D., Pokorný, J., Rumpe, B. (eds.) SOFSEM 2010. LNCS, vol. 5901, pp. 419–427. Springer, Heidelberg (2010)

    Google Scholar 

  10. Golynski, A., Munro, J.I., Rao, S.S.: Rank/select operations on large alphabets: a tool for text indexing. In: Proc. 17th SODA, pp. 368–373 (2006)

    Google Scholar 

  11. Grossi, R., Gupta, A., Vitter, J.: High-order entropy-compressed text indexes. In: Proc. 14th SODA, pp. 841–850 (2003)

    Google Scholar 

  12. Hreinsson, J.B., Krøyer, M., Pagh, R.: Storing a compressed function with constant time access. In: Fiat, A., Sanders, P. (eds.) ESA 2009. LNCS, vol. 5757, pp. 730–741. Springer, Heidelberg (2009)

    Chapter  Google Scholar 

  13. Levcopoulos, C., Petersson, O.: Sorting shuffled monotone sequences. Information and Computation 112(1), 37–50 (1994)

    Article  MathSciNet  MATH  Google Scholar 

  14. Mäkinen, V., Navarro, G.: Rank and select revisited and extended. Theoretical Computer Science 387(3), 332–347 (2007)

    Article  MathSciNet  MATH  Google Scholar 

  15. Manzini, G.: An analysis of the Burrows-Wheeler transform. Journal of the ACM 48(3), 407–430 (2001)

    Article  MathSciNet  MATH  Google Scholar 

  16. Mehlhorn, K.: Sorting presorted files. In: Weihrauch, K. (ed.) GI-TCS 1979. LNCS, vol. 67, pp. 199–212. Springer, Heidelberg (1979)

    Google Scholar 

  17. Munro, J.I., Raman, R., Raman, V., Rao, S.S.: Succinct representations of permutations. In: Baeten, J.C.M., Lenstra, J.K., Parrow, J., Woeginger, G.J. (eds.) ICALP 2003. LNCS, vol. 2719, pp. 345–356. Springer, Heidelberg (2003)

    Chapter  Google Scholar 

  18. Navarro, G., Mäkinen, V.: Compressed full-text indexes. ACM Computing Surveys 39(1):article 2 (2007)

    Google Scholar 

  19. Rahman, N., Raman, R.: Rank and select operations on binary strings. In: Kao, M.-Y. (ed.) Encyclopedia of Algorithms. Springer, Heidelberg (2008)

    Google Scholar 

  20. Raman, R., Raman, V., Rao, S.: Succinct indexable dictionaries with applications to encoding k-ary trees and multisets. In: Proc. 13th SODA, pp. 233–242 (2002)

    Google Scholar 

  21. Sadakane, K.: New text indexing functionalities of the compressed suffix arrays. Journal of Algorithms 48(2), 294–313 (2003)

    Article  MathSciNet  MATH  Google Scholar 

  22. Tarjan, R.E., van Leeuwen, J.: Worst-case analysis of set union algorithms. Journal of the ACM 31(2), 245–281 (1984)

    Article  MathSciNet  MATH  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Barbay, J., Gagie, T., Navarro, G., Nekrich, Y. (2010). Alphabet Partitioning for Compressed Rank/Select and Applications. In: Cheong, O., Chwa, KY., Park, K. (eds) Algorithms and Computation. ISAAC 2010. Lecture Notes in Computer Science, vol 6507. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17514-5_27

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-17514-5_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-17513-8

  • Online ISBN: 978-3-642-17514-5

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics