DOI: 10.1145/1993636.1993638
research-article

The power of simple tabulation hashing

Published: 06 June 2011

ABSTRACT

Randomized algorithms are often enjoyed for their simplicity, but the hash functions used to yield the desired theoretical guarantees are often neither simple nor practical. Here we show that the simplest possible tabulation hashing provides unexpectedly strong guarantees. The scheme itself dates back to Carter and Wegman (STOC'77). Keys are viewed as consisting of c characters. We initialize c tables T_1, ..., T_c mapping characters to random hash codes. A key x=(x_1, ..., x_c) is hashed to T_1[x_1] xor ... xor T_c[x_c].
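
A minimal sketch of the scheme described above, in Python. The 32-bit key width, the 8-bit character width (so c = 4), the 32-bit hash codes, and the names `make_tables` and `simple_tabulation_hash` are illustrative assumptions, not taken from the paper.

```python
import random

C = 4            # number of characters per key (assumed)
CHAR_BITS = 8    # bits per character (assumed), so keys are 32 bits wide
HASH_BITS = 32   # width of each random hash code (assumed)


def make_tables(rng):
    """Build c tables, each mapping every possible character to a random hash code."""
    return [[rng.getrandbits(HASH_BITS) for _ in range(1 << CHAR_BITS)]
            for _ in range(C)]


def simple_tabulation_hash(tables, x):
    """Hash key x = (x_1, ..., x_c) to T_1[x_1] xor ... xor T_c[x_c]."""
    h = 0
    for i in range(C):
        # Extract the i-th character of the key (least significant first).
        char = (x >> (i * CHAR_BITS)) & ((1 << CHAR_BITS) - 1)
        h ^= tables[i][char]
    return h


# The seed is arbitrary; it is fixed here only so that runs are repeatable.
tables = make_tables(random.Random(2011))
print(hex(simple_tabulation_hash(tables, 0xDEADBEEF)))
```

With these assumed parameters, each evaluation performs c table lookups, and the tables hold c · 2^8 hash codes in total.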

While this scheme is not even 4-independent, we show that it provides many of the guarantees that are normally obtained via higher independence, e.g., Chernoff-type concentration, min-wise hashing for estimating set intersection, and cuckoo hashing.
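
One way to see why the scheme cannot be 4-independent is sketched below. For any characters a, b, c, d, the hash codes of the four two-character keys (a,c), (a,d), (b,c), (b,d) XOR to zero, because every table entry involved appears exactly twice. The fourth hash code is therefore fully determined by the other three. The two-character key layout, the 16-bit code width, and the helper names are assumptions made for illustration.

```python
import random

HASH_BITS = 16  # width of each random hash code (assumed)


def make_tables(rng, num_chars=2, alphabet_size=256):
    """One table of random hash codes per character position."""
    return [[rng.getrandbits(HASH_BITS) for _ in range(alphabet_size)]
            for _ in range(num_chars)]


def tab_hash(tables, key):
    """Simple tabulation: XOR the table entry for each character of the key."""
    h = 0
    for table, char in zip(tables, key):
        h ^= table[char]
    return h


def xor_of_rectangle(tables, a, b, c, d):
    """XOR of the hash codes of the keys (a,c), (a,d), (b,c), (b,d)."""
    return (tab_hash(tables, (a, c)) ^ tab_hash(tables, (a, d))
            ^ tab_hash(tables, (b, c)) ^ tab_hash(tables, (b, d)))


tables = make_tables(random.Random())
print(xor_of_rectangle(tables, 3, 7, 10, 200))  # → 0 for every choice of random tables
```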

Published in

STOC '11: Proceedings of the forty-third annual ACM symposium on Theory of computing
June 2011
840 pages
ISBN: 9781450306911
DOI: 10.1145/1993636

Copyright © 2011 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

STOC '11 paper acceptance rate: 84 of 304 submissions, 28%
Overall acceptance rate: 1,469 of 4,586 submissions, 32%
