Abstract
In this article we describe and analyze sublinear-time approximation algorithms for some optimization problems arising in machine learning, such as training linear classifiers and finding minimum enclosing balls. Our algorithms can be extended to some kernelized versions of these problems, such as SVDD, hard margin SVM, and L2-SVM, for which sublinear-time algorithms were not known before. These new algorithms use a combination of novel sampling techniques and a new multiplicative update algorithm. We give lower bounds which show the running times of many of our algorithms to be nearly best possible in the unit-cost RAM model.
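The abstract names two ingredients, sampling and a multiplicative update, without specifying either. As a rough illustration only, and not the authors' algorithm, the Python sketch below shows generic textbook versions of both: an unbiased one-coordinate estimate of an inner product, and a standard exponential-weights update over a probability vector. The choice of ℓ2 sampling, the function names `l2_sample_estimate` and `multiplicative_update`, and the step size `eta` are assumptions made for this example.

```python
import math
import random


def l2_sample_estimate(w, x, rng):
    """Estimate <w, x> by reading a single coordinate of x.

    Index j is drawn with probability w[j]**2 / ||w||**2 (assumed
    l2 sampling), and x[j] * ||w||**2 / w[j] is returned. Its
    expectation is exactly <w, x>. Requires w to be nonzero.
    """
    sq_norm = sum(wj * wj for wj in w)
    weights = [wj * wj for wj in w]
    j = rng.choices(range(len(w)), weights=weights)[0]
    return x[j] * sq_norm / w[j]


def multiplicative_update(p, losses, eta):
    """Generic exponential-weights step: scale p[i] by exp(-eta * loss)
    and renormalize so the result is again a probability vector."""
    scaled = [pi * math.exp(-eta * li) for pi, li in zip(p, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]


if __name__ == "__main__":
    rng = random.Random(0)
    w = [1.0, 2.0, 2.0]
    x = [3.0, -1.0, 4.0]
    # Averaging many single-coordinate estimates approaches <w, x> = 9.
    trials = 20000
    mean = sum(l2_sample_estimate(w, x, rng) for _ in range(trials)) / trials
    print("estimated inner product:", mean)
    # Weight shifts toward the index with the smaller loss.
    print(multiplicative_update([0.5, 0.5], [0.0, 1.0], eta=math.log(2)))
```

Each call to `l2_sample_estimate` touches one coordinate of `x`, which is the kind of saving that lets an iteration run in time sublinear in the input size; the price is variance, so an algorithm built on such estimates has to control their magnitude.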