ABSTRACT
Recent years have witnessed the overwhelming success of algorithms that operate on massive data. Several computing paradigms have been proposed for massive data sets, such as data streams, sketching, and sampling, and understanding their limitations is a fundamental theoretical challenge. In this survey, we describe the information complexity paradigm, which has proved successful in obtaining tight lower bounds for several well-known problems. Information complexity quantifies the amount of information about the inputs that must necessarily be revealed by any algorithm solving a problem. We describe the key ideas of this paradigm and highlight the beautiful interplay of techniques arising from diverse areas such as information theory, statistics, and geometry.
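To make the notion concrete, here is a minimal sketch (not from the survey; the helper `H` and the toy protocol are illustrative assumptions) of computing the information cost of a trivial two-party protocol for AND(x, y): Alice sends x, Bob sends y, so the transcript is (x, y), and the information cost is the mutual information between the inputs and the transcript.

```python
import math
from itertools import product

def H(probs):
    """Shannon entropy (in bits) of a distribution given as probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Toy two-party problem: AND(x, y) with x, y uniform bits.
# Trivial protocol: Alice sends x, then Bob sends y, so the
# transcript is simply the pair (x, y).
inputs = list(product([0, 1], [0, 1]))          # uniform over {0,1}^2
transcript = {(x, y): (x, y) for x, y in inputs}

# Information cost I((X,Y); Pi) = H(Pi) - H(Pi | X, Y).  Here Pi is a
# deterministic function of the inputs, so H(Pi | X, Y) = 0 and the
# information cost reduces to H(Pi).
pi_dist = {}
for xy in inputs:
    pi = transcript[xy]
    pi_dist[pi] = pi_dist.get(pi, 0) + 1 / len(inputs)

info_cost = H(pi_dist.values())
print(info_cost)   # 2.0 bits: this protocol reveals both inputs
```

The lower-bound program surveyed here shows that for problems such as set disjointness, *every* correct protocol, not just this naive one, must have large information cost, which in turn bounds its communication from below.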