skip to main content
research-article

Sublinear optimization for machine learning

Published:05 November 2012Publication History
Skip Abstract Section

Abstract

In this article we describe and analyze sublinear-time approximation algorithms for some optimization problems arising in machine learning, such as training linear classifiers and finding minimum enclosing balls. Our algorithms can be extended to some kernelized versions of these problems, such as SVDD, hard margin SVM, and L2-SVM, for which sublinear-time algorithms were not known before. These new algorithms use a combination of a novel sampling techniques and a new multiplicative update algorithm. We give lower bounds which show the running times of many of our algorithms to be nearly best possible in the unit-cost RAM model.

References

  1. Blum, A., Frieze, A. M., Kannan, R., and Vempala, S. 1998. A polynomial-time algorithm for learning noisy linear threshold functions. Algorithmica 22, 1/2, 35--52.Google ScholarGoogle Scholar
  2. Bylander, T. 1994. Learning linear threshold functions in the presence of classification noise. In Proceedings of the 7th Annual Conference on Computational Learning Theory (COLT'94). ACM, New York, 340--347. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Cesa-Bianchi, N., Conconi, A., and Gentile, C. 2004. On the generalization ability of on-line learning algorithms. IEEE Trans. Inf. Theory 50, 9, 2050--2057. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Cesa-Bianchi, N., Shalev-Shwartz, S., and Shamir, O. 2010. Online learning of noisy data with kernels. In Proceedings of the Annual Conference on Learning Theory (COLT), A. T. Kalai and M. Mohri, Eds. Omnipress, 218--230.Google ScholarGoogle Scholar
  5. Clarkson, K. L. 2008. Coresets, sparse greedy approximation, and the Frank-Wolfe algorithm. In Proceedings of the Proc. 19th ACM-SIAM Symposium on Discrete Algorithms. SIAM, Philadelphia, PA, 922--931. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Cover, T. and Thomas, J. A. 1991. Elements of Information Theory. Wiley Series in Telecommunications. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Diaconis, P. and Freedman, D. 1980. Finite exchangeable sequences. Ann. Probab. 8, 745--764.Google ScholarGoogle ScholarCross RefCross Ref
  8. Dunagan, J. and Vempala, S. 2004. A simple polynomial-time rescaling algorithm for solving linear programs. In Proceedings of the 36th Annual ACM Symposium on the Theory of Computing. ACM, New York, 315--320. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Frank, M. and Wolfe, P. 1956. An algorithm for quadratic programming. Naval Res. Logis. Quart. 3, 95--110.Google ScholarGoogle ScholarCross RefCross Ref
  10. Grigoriadis, M. D. and Khachiyan, L. G. 1995. A sublinear-time randomized approximation algorithm for matrix games. Oper. Res. Lett. 18, 53--58. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Hazan, E. 2011. The convex optimization approach to regret minimization. Optimiz. Mach. Learn. 1.Google ScholarGoogle Scholar
  12. Hazan, E., Agarwal, A., and Kale, S. 2007. Logarithmic regret algorithms for online convex optimization. Mach. Lear. 69, 2-3, 169--192. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Koufogiannakis, C. and Young, N. E. 2007. Beating simplex for fractional packing and covering linear programs. In Proceedings of the Conference on Foundations of Computer Science (FOCS). IEEE Computer Society, 494--504. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Minsky, M. and Papert, S. 1988. Perceptrons: An Introduction to Computational Geometry. MIT Press Cambridge, MA.Google ScholarGoogle Scholar
  15. Monemizadeh, M. and Woodruff, D. 2010. 1-pass relative error lp-sampling with applications. In Proceedings of the 21st ACM-SIAM Symposium on Discrete Algorithms. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Motwani, R. and Raghavan, P. 1995. Randomized Algorithms. Cambridge University Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Novikoff, A. B. 1963. On convergence proofs for perceptrons. In Proceedings of the Symposium on the Mathematical Theory of Automata. Vol. 12., 615--622.Google ScholarGoogle Scholar
  18. Panconesi, A. and Srinivasan, A. 1997. Randomized distributed edge coloring via an extension of the chernoff-hoeffding bounds. SIAM J. Comput. 26, 2, 350--368. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Plotkin, S. A., Shmoys, D. B., and Tardos, E. 1991. Fast approximation algorithms for fractional packing and covering problems. In Proceedings of the 32nd Annual Symposium on Foundations of Computer Science. IEEE Computer Society, Los Alamitos, CA, 495--504. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Saha, A. and Vishwanathan, S. 2009. Efficient approximation algorithms for minimum enclosing convex shapes. arXiv:0909.1062v2.Google ScholarGoogle Scholar
  21. Schölkopf, B. and Smola, A. J. 2003. A Short Introduction to Learning with Kernels. Springer, New York.Google ScholarGoogle Scholar
  22. Servedio, R. A. 1999. On PAC learning using winnow, perceptron, and a perceptron-like algorithm. In Proceedings of the 12th Annual Conference on Computational Learning Theory. ACM, New York, 296--307. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Zinkevich, M. 2003. Online convex programming and generalized infinitesimal gradient ascent. In Proceedings of the 20th International Conference on Machine Learning (ICML). 928--936.Google ScholarGoogle Scholar

Index Terms

  1. Sublinear optimization for machine learning

            Recommendations

            Comments

            Login options

            Check if you have access through your login credentials or your institution to get full access on this article.

            Sign in

            Full Access

            • Published in

              cover image Journal of the ACM
              Journal of the ACM  Volume 59, Issue 5
              October 2012
              204 pages
              ISSN:0004-5411
              EISSN:1557-735X
              DOI:10.1145/2371656
              Issue’s Table of Contents

              Copyright © 2012 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 5 November 2012
              • Accepted: 1 May 2012
              • Revised: 1 December 2011
              • Received: 1 April 2011
              Published in jacm Volume 59, Issue 5

              Permissions

              Request permissions about this article.

              Request Permissions

              Check for updates

              Qualifiers

              • research-article
              • Research
              • Refereed

            PDF Format

            View or Download as a PDF file.

            PDF

            eReader

            View online with eReader.

            eReader