Article

Uniform hashing in constant time and linear space

Authors:
Anna Ostlin, IT University of Copenhagen, Kobenhavn NV, Denmark
Rasmus Pagh, IT University of Copenhagen, Kobenhavn NV, Denmark

STOC '03: Proceedings of the thirty-fifth annual ACM symposium on Theory of computing, June 2003, Pages 622–628
https://doi.org/10.1145/780542.780633

Published: 09 June 2003

ABSTRACT

Many algorithms and data structures employing hashing have been analyzed under the uniform hashing assumption, i.e., the assumption that hash functions behave like truly random functions. Starting with the discovery of universal hash functions, many researchers have studied to what extent this theoretical ideal can be realized by hash functions that do not take up too much space and can be evaluated quickly. In this paper we present an almost ideal solution to this problem: A hash function that, on any set of n inputs, behaves like a truly random function with high probability, can be evaluated in constant time on a RAM, and can be stored in O(n) words, which is optimal. For many hashing schemes this is the first hash function that makes their uniform hashing analysis come true, with high probability, without incurring overhead in time or space.
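
For orientation, the universal hash functions that the abstract takes as its starting point can be sketched in a few lines. The block below is a minimal illustration of the classic Carter-Wegman family h(x) = ((a*x + b) mod p) mod m, and it is not the construction presented in this paper. The names `make_universal_hash` and `estimate_collision_rate`, the choice of prime, and the trial count are all assumptions made for the example. A function drawn from this family takes O(1) words and O(1) time, but it only bounds pairwise collision probability by roughly 1/m, which is far weaker than behaving like a truly random function on a whole set of n inputs.

```python
import random

# Assumption for this sketch: all keys are non-negative integers below this prime.
MERSENNE_PRIME = (1 << 61) - 1


def make_universal_hash(m, rng=random):
    """Draw h(x) = ((a*x + b) mod p) mod m at random from the Carter-Wegman family."""
    a = rng.randrange(1, MERSENNE_PRIME)  # a must be non-zero
    b = rng.randrange(0, MERSENNE_PRIME)

    def h(x):
        return ((a * x + b) % MERSENNE_PRIME) % m

    return h


def estimate_collision_rate(x, y, m, trials=20000, seed=0):
    """Empirically estimate Pr[h(x) == h(y)] over random draws of h."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        h = make_universal_hash(m, rng)
        if h(x) == h(y):
            hits += 1
    return hits / trials
```

Calling `estimate_collision_rate(1, 2, 16)` should give a value close to 1/16, matching the pairwise guarantee. The paper's contribution concerns the much stronger requirement of full randomness on any set of n inputs, which this family does not provide.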

Index Terms

Uniform hashing in constant time and linear space
1. Information systems
  1. Information retrieval
  2. Information storage systems
    1. Record storage systems
      1. Record storage alternatives
        Hashed file organization
2. Mathematics of computing
  1. Probability and statistics
    1. Probabilistic reasoning algorithms
      1. Random number generation

Published in
STOC '03: Proceedings of the thirty-fifth annual ACM symposium on Theory of computing
June 2003
740 pages
ISBN:1581136749
DOI:10.1145/780542
Conference Chair: Lawrence L. Larmore, University of Nevada Las Vegas, Las Vegas, NV
Program Chair: Michel X. Goemans, Massachusetts Institute of Technology, Cambridge, MA
Copyright © 2003 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
data structures
hash function
uniform hashing
Qualifiers
- Article
Conference

Acceptance Rates
STOC '03 Paper Acceptance Rate: 80 of 270 submissions, 30%
Overall Acceptance Rate: 1,469 of 4,586 submissions, 32%
