Sublinear optimization for machine learning

Authors:
Kenneth L. Clarkson, IBM Almaden Research Center, San Jose, CA
Elad Hazan, Technion - Israel Institute of Technology, Israel
David P. Woodruff, IBM Almaden Research Center, San Jose, CA

Journal of the ACM, Volume 59, Issue 5, Article No. 23, pp. 1–49. https://doi.org/10.1145/2371656.2371658

Published: 05 November 2012

Abstract

In this article we describe and analyze sublinear-time approximation algorithms for some optimization problems arising in machine learning, such as training linear classifiers and finding minimum enclosing balls. Our algorithms can be extended to some kernelized versions of these problems, such as SVDD, hard margin SVM, and L₂-SVM, for which sublinear-time algorithms were not previously known. These new algorithms combine novel sampling techniques with a new multiplicative update algorithm. We give lower bounds showing that the running times of many of our algorithms are nearly the best possible in the unit-cost RAM model.
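
The abstract names two ingredients, sampling and a multiplicative update, without giving details. The Python sketch below is only a generic illustration of those two ideas and is not taken from the article: the function names `l2_sample_estimate` and `multiplicative_update`, the step size `eta`, and the specific choice of sampling an index with probability proportional to the squared entries of `w` are all assumptions made here for illustration.

```python
import math
import random


def l2_sample_estimate(x, w, rng):
    """One-sample unbiased estimate of dot(x, w) (illustrative assumption).

    Picks index j with probability w[j]^2 / ||w||^2 and returns
    x[j] * ||w||^2 / w[j], so only a single entry of x is read.
    Note: this sketch recomputes ||w||^2 in O(len(w)) time on every call;
    a real sublinear method would have to maintain it incrementally.
    """
    norm_sq = sum(v * v for v in w)
    if norm_sq == 0.0:
        return 0.0  # w is the zero vector, so dot(x, w) is 0
    j = rng.choices(range(len(w)), weights=[v * v for v in w], k=1)[0]
    return x[j] * norm_sq / w[j]


def multiplicative_update(p, losses, eta):
    """Scale each p[i] by exp(-eta * losses[i]), then renormalize to sum to 1."""
    scaled = [pi * math.exp(-eta * li) for pi, li in zip(p, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]
```

Averaging many calls to `l2_sample_estimate` approaches the exact inner product, because the expectation over j is the sum of x[j] * w[j]. A multiplicative update of this general kind shifts probability mass away from entries with large loss while keeping `p` a distribution. How the article combines such steps, and with what parameters, is not specified in the abstract.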

Published in

Journal of the ACM Volume 59, Issue 5
October 2012
204 pages
ISSN:0004-5411
EISSN:1557-735X
DOI:10.1145/2371656

Copyright © 2012 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 5 November 2012
- Accepted: 1 May 2012
- Revised: 1 December 2011
- Received: 1 April 2011
Qualifiers
- research-article
- Research
- Refereed