Boosting and Hard-Core Set Construction

Klivans, Adam R.; Servedio, Rocco A.

doi:10.1023/A:1022949332276

Published: June 2003

Machine Learning, Volume 51, pages 217–238 (2003)

Abstract

This paper connects hard-core set construction, a type of hardness amplification from computational complexity, and boosting, a technique from computational learning theory. Using this connection we give fruitful applications of complexity-theoretic techniques to learning theory and vice versa. We show that the hard-core set construction of Impagliazzo (1995), which establishes the existence of distributions under which boolean functions are highly inapproximable, may be viewed as a boosting algorithm. Using alternate boosting methods we give an improved bound for hard-core set construction which matches known lower bounds from boosting and thus is optimal within this class of techniques. We then show how to apply techniques from Impagliazzo (1995) to give a new version of Jackson's celebrated Harmonic Sieve algorithm for learning DNF formulae under the uniform distribution using membership queries. Our new version has a significant asymptotic improvement in running time. Critical to our arguments is a careful analysis of the distributions which are employed in both boosting and hard-core set constructions.

References

Babai, L., Fortnow, L., Nisan, N., & Wigderson, A. (1993). BPP has subexponential time simulations unless EXPTIME has publishable proofs. Computational Complexity, 3, 307–318.
Blum, A., Furst, M., Jackson, J., Kearns, M., Mansour, Y., & Rudich, S. (1994). Weakly learning DNF and characterizing statistical query learning using Fourier analysis. In Proceedings of the Twenty-Sixth Annual Symposium on Theory of Computing (pp. 253–262). ACM.
Blumer, A., Ehrenfeucht, A., Haussler, D., & Warmuth, M. (1989). Learnability and the Vapnik-Chervonenkis dimension. Journal of the ACM, 36:4, 929–965.
Boneh, D., & Lipton, R. (1993). Amplification of weak learning over the uniform distribution. In Proceedings of the Sixth Annual Workshop on Computational Learning Theory (pp. 347–351). ACM.
Bshouty, N., Jackson, J., & Tamon, C. (1999). More efficient PAC learning of DNF with membership queries under the uniform distribution. In Proceedings of the Twelfth Annual Conference on Computational Learning Theory (pp. 286–295).
Drucker, H., & Cortes, C. (1996). Boosting decision trees. In Advances in Neural Information Processing Systems 8 (pp. 479–485).
Drucker, H., Cortes, C., Jackel, L. D., Lecun, Y., & Vapnik, V. (1994). Boosting and other ensemble methods. Neural Computation, 6:6, 1289–1301.
Drucker, H., Schapire, R., & Simard, P. (1993a). Boosting performance in neural networks. International Journal of Pattern Recognition and Artificial Intelligence, 7:4, 705–719.
Drucker, H., Schapire, R., & Simard, P. (1993b). Improving performance in neural networks using a boosting algorithm. In Advances in Neural Information Processing Systems 5 (pp. 42–49).
Freund, Y. (1990). Boosting a weak learning algorithm by majority. In Proceedings of the Third Annual Workshop on Computational Learning Theory (pp. 202–216).
Freund, Y. (1992). An improved boosting algorithm and its implications on learning complexity. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory (pp. 391–398).
Freund, Y. (1995). Boosting a weak learning algorithm by majority. Information and Computation, 121:2, 256–285.
Freund, Y., & Schapire, R. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55:1, 119–139.
Goldreich, O., Nisan, N., & Wigderson, A. (1995). On Yao's xor-lemma. Electronic Colloquium on Computational Complexity, TR95-050.
Impagliazzo, R. (1995). Hard-core distributions for somewhat hard problems. In Proceedings of the Thirty-Sixth Annual Symposium on Foundations of Computer Science (pp. 538–545). IEEE.
Impagliazzo, R., & Wigderson, A. (1997). P = BPP unless E has subexponential circuits: Derandomizing the xor lemma. In Proceedings of the Twenty-Ninth Annual Symposium on Theory of Computing (pp. 220–229).
Jackson, J. (1995). The Harmonic sieve: A novel application of Fourier analysis to machine learning theory and practice. Ph.D. Thesis, Carnegie Mellon University.
Jackson, J. (1997). An efficient membership-query algorithm for learning DNF with respect to the uniform distribution. Journal of Computer and System Sciences, 55, 414–440.
Jackson, J. (2002). Personal communication.
Jackson, J., & Craven, M. (1996). Learning sparse perceptrons. In Advances in Neural Information Processing Systems 8 (pp. 654–660).
Kearns, M., & Valiant, L. (1994). Cryptographic limitations on learning boolean formulae and finite automata. Journal of the ACM, 41:1, 67–95.
Klivans, A., & Servedio, R. (2001). Learning DNF in time \(2^{\tilde{O}(n^{1/3})}\). In Proceedings of the Thirty-Third Annual Symposium on Theory of Computing (pp. 258–265).
Levin, L. (1986). Average case complete problems. SIAM Journal on Computing, 15:1, 285–286.
Muller, D., & Preparata, F. (1975). Bounds to complexities of networks for sorting and for switching. Journal of the ACM, 22:2, 195–201.
Nisan, N., & Wigderson, A. (1994). Hardness versus randomness. Journal of Computer and System Sciences, 49, 149–167.
Schapire, R. (1990). The strength of weak learnability. Machine Learning, 5:2, 197–227.
Schapire, R., & Singer, Y. (1998). Improved boosting algorithms using confidence-rated predictions. In Proceedings of the Eleventh Annual Conference on Computational Learning Theory (pp. 80–91).
Servedio, R. (2001). Smooth boosting and learning with malicious noise. In Proceedings of the Fourteenth Annual Conference on Computational Learning Theory (pp. 473–489).
Shaltiel, R. (2001). Towards proving strong direct product theorems. In Proceedings of the Sixteenth Conference on Computational Complexity (pp. 107–117).
Sudan, M., Trevisan, L., & Vadhan, S. (2001). Pseudorandom generators without the xor lemma. Journal of Computer and System Sciences, 62:2, 236–266.
Valiant, L. (1984). A theory of the learnable. Communications of the ACM, 27:11, 1134–1142.
Wigderson, A. (1999). Personal communication.

Author information

Authors and Affiliations

Laboratory for Computer Science, MIT, Cambridge, MA, 02139, USA
Adam R. Klivans
Division of Engineering and Applied Sciences, Harvard University, Cambridge, MA, 02138, USA
Rocco A. Servedio

About this article

Cite this article

Klivans, A.R., Servedio, R.A. Boosting and Hard-Core Set Construction. Machine Learning 51, 217–238 (2003). https://doi.org/10.1023/A:1022949332276

Issue Date: June 2003
DOI: https://doi.org/10.1023/A:1022949332276
