DOI: 10.1145/1807085.1807108
tutorial

Information complexity: a tutorial

Published: 06 June 2010

ABSTRACT

Recent years have witnessed the overwhelming success of algorithms that operate on massive data. Several computing paradigms have been proposed for massive data set algorithms, such as data streams, sketching, and sampling, and understanding their limitations is a fundamental theoretical challenge. In this survey, we describe the information complexity paradigm, which has proved successful in obtaining tight lower bounds for several well-known problems. Information complexity quantifies the amount of information about the inputs that any algorithm must necessarily propagate in solving a problem. We describe the key ideas of this paradigm and highlight the beautiful interplay of techniques arising from diverse areas such as information theory, statistics, and geometry.


Published in

PODS '10: Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
June 2010
350 pages
ISBN: 9781450300339
DOI: 10.1145/1807085

                Copyright © 2010 ACM

                Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

PODS '10 paper acceptance rate: 27 of 113 submissions, 24%. Overall acceptance rate: 642 of 2,707 submissions, 24%.
