On the k-Independence Required by Linear Probing and Minwise Independence

  • Conference paper
Automata, Languages and Programming (ICALP 2010)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6198)

Abstract

We show that linear probing requires 5-independent hash functions for expected constant-time performance, matching an upper bound of [Pagh et al. STOC’07]. For (1 + ε)-approximate minwise independence, we show that \(\Omega(\lg \frac{1}{\varepsilon})\)-independent hash functions are required, matching an upper bound of [Indyk, SODA’99]. We also show that the multiply-shift scheme of Dietzfelbinger, most commonly used in practice, fails badly in both applications.
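
For readers unfamiliar with the two ingredients named in the abstract, the following is a minimal sketch, not the paper's construction. It assumes one common 2-independent multiply-shift variant, h(x) = ((a·x + b) mod 2^(2w)) >> (2w − l) with random 2w-bit a and b, and plugs it into a plain linear probing table. The names `make_multiply_shift` and `LinearProbingTable`, the key width `W`, and the table size `L_BITS` are all hypothetical choices made for illustration.

```python
import random

W = 32       # assumed key width in bits
L_BITS = 4   # assumed table size exponent: 2**L_BITS slots


def make_multiply_shift(w=W, l=L_BITS, rng=random):
    """Return h(x) = ((a*x + b) mod 2^(2w)) >> (2w - l) for random a, b.

    This is one common multiply-shift variant (an assumption of this sketch).
    """
    a = rng.getrandbits(2 * w)
    b = rng.getrandbits(2 * w)
    mask = (1 << (2 * w)) - 1

    def h(x):
        # Keep the low 2w bits of a*x + b, then take the top l of those bits.
        return ((a * x + b) & mask) >> (2 * w - l)

    return h


class LinearProbingTable:
    """Open-addressing table: on a collision, scan to the next slot."""

    def __init__(self, l=L_BITS, h=None):
        self.size = 1 << l
        self.slots = [None] * self.size
        self.count = 0
        self.h = h or make_multiply_shift(l=l)

    def insert(self, key, value):
        """Insert or update a key; return the number of slots probed."""
        i = self.h(key)
        probes = 1
        while self.slots[i] is not None and self.slots[i][0] != key:
            i = (i + 1) % self.size
            probes += 1
        if self.slots[i] is None:
            # Always leave one slot empty so that probe loops terminate.
            if self.count == self.size - 1:
                raise OverflowError("table full")
            self.count += 1
        self.slots[i] = (key, value)
        return probes

    def lookup(self, key):
        """Return the stored value, or None if the key is absent."""
        i = self.h(key)
        while self.slots[i] is not None:
            if self.slots[i][0] == key:
                return self.slots[i][1]
            i = (i + 1) % self.size
        return None
```

The probe count returned by `insert` is the quantity whose expectation the abstract's results concern. This sketch only shows the mechanics and does not reproduce the bad inputs the paper constructs.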

References

  1. Alon, N., Nussboim, A.: k-wise independent random graphs. In: Proc. 49th IEEE Symposium on Foundations of Computer Science (FOCS), pp. 813–822 (2008)

  2. Black, J.R., Martel, C.U., Qi, H.: Graph and hashing algorithms for modern architectures: Design and performance. In: Proc. 2nd International Workshop on Algorithm Engineering (WAE), pp. 37–48 (1998)

  3. Broder, A.Z., Charikar, M., Frieze, A.M., Mitzenmacher, M.: Min-wise independent permutations. Journal of Computer and System Sciences 60(3), 630–659 (2000); See also STOC 1998

  4. Broder, A.Z., Glassman, S.C., Manasse, M.S., Zweig, G.: Syntactic clustering of the web. Computer Networks 29, 1157–1166 (1997)

  5. Cohen, E.: Size-estimation framework with applications to transitive closure and reachability. Journal of Computer and System Sciences 55(3), 441–453 (1997); See also STOC 1994

  6. Dietzfelbinger, M.: Universal hashing and k-wise independent random variables via integer arithmetic without primes. In: Proc. 13th Symposium on Theoretical Aspects of Computer Science (STACS), pp. 569–580 (1996)

  7. Heileman, G.L., Luo, W.: How caching affects hashing. In: Proc. 7th Workshop on Algorithm Engineering and Experiments (ALENEX), pp. 141–154 (2005)

  8. Indyk, P.: A small approximately min-wise independent family of hash functions. Journal of Algorithms 38(1), 84–90 (2001); See also SODA 1999

  9. Knuth, D.E.: Notes on open addressing (1963) (Unpublished memorandum), http://citeseer.ist.psu.edu/knuth63notes.html

  10. Pagh, A., Pagh, R., Ružić, M.: Linear probing with constant independence. SIAM Journal on Computing 39(3), 1107–1120 (2009); See also STOC 2007

  11. Pagh, R., Rodler, F.F.: Cuckoo hashing. Journal of Algorithms 51(2), 122–144 (2004); See also ESA 2001

  12. Pǎtraşcu, M., Thorup, M.: The power of simple tabulation-based hashing (2010) (manuscript)

  13. Schmidt, J.P., Siegel, A.: The analysis of closed hashing under limited randomness. In: Proc. 22nd ACM Symposium on Theory of Computing (STOC), pp. 224–234 (1990)

  14. Schmidt, J.P., Siegel, A., Srinivasan, A.: Chernoff-Hoeffding bounds for applications with limited independence. SIAM Journal on Discrete Mathematics 8(2), 223–250 (1995); See also SODA 1993

  15. Siegel, A., Schmidt, J.P.: Closed hashing is computable and optimally randomizable with universal hash functions. Technical Report TR1995-687, Courant Institute (1995)

  16. Thorup, M.: Even strongly universal hashing is pretty fast. In: Proc. 11th ACM/SIAM Symposium on Discrete Algorithms (SODA), pp. 496–497 (2000)

  17. Thorup, M., Zhang, Y.: Tabulation based 4-universal hashing with applications to second moment estimation. In: Proc. 15th ACM/SIAM Symposium on Discrete Algorithms (SODA), pp. 615–624 (2004)

  18. Thorup, M., Zhang, Y.: Tabulation based 5-universal hashing and linear probing. In: Proc. 12th Workshop on Algorithm Engineering and Experiments, ALENEX (2009)

  19. Wegman, M.N., Carter, L.: New classes and applications of hash functions. Journal of Computer and System Sciences 22(3), 265–279 (1981); See also FOCS 1979

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pǎtraşcu, M., Thorup, M. (2010). On the k-Independence Required by Linear Probing and Minwise Independence. In: Abramsky, S., Gavoille, C., Kirchner, C., Meyer auf der Heide, F., Spirakis, P.G. (eds) Automata, Languages and Programming. ICALP 2010. Lecture Notes in Computer Science, vol 6198. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14165-2_60

  • DOI: https://doi.org/10.1007/978-3-642-14165-2_60

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14164-5

  • Online ISBN: 978-3-642-14165-2

  • eBook Packages: Computer Science, Computer Science (R0)