Abstract
In this paper, we propose and analyze a trust-region model-based algorithm for solving unconstrained stochastic optimization problems. Our framework utilizes random models of an objective function f(x), obtained from stochastic observations of the function or its gradient. Our method also utilizes estimates of function values to gauge the progress being made. The convergence analysis relies on the requirement that these models and estimates are sufficiently accurate with a high enough, but fixed, probability. Beyond these conditions, no assumptions are made on how the models and estimates are generated. Under these general conditions we show almost sure global convergence of the method to a first-order stationary point. In the second part of the paper, we present examples of generating sufficiently accurate random models under biased or unbiased noise assumptions. Lastly, we present computational results showing the benefits of the proposed method compared to existing approaches based on sample averaging or stochastic gradients.
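The abstract's scheme can be illustrated with a minimal sketch. The code below is not the paper's algorithm; it is a hypothetical stand-in that captures the two ingredients the abstract names: a local model built from noisy observations (here, averaged finite differences in place of the paper's random models) and averaged function-value estimates used to accept or reject a trust-region step. All parameter names and defaults are illustrative assumptions.

```python
import numpy as np


def stochastic_trust_region(f_noisy, x0, delta0=1.0, eta=0.1,
                            gamma=2.0, n_samples=30, max_iter=200):
    """Sketch of a trust-region iteration driven by noisy evaluations.

    f_noisy : callable returning a noisy observation of f at a point.
    Builds a linear model from averaged central differences, steps to
    the trust-region boundary along the negative model gradient, and
    accepts the step only if the *estimated* decrease is at least a
    fraction eta of the model-predicted decrease.
    """
    x, delta = np.asarray(x0, dtype=float), delta0

    def estimate(z):
        # Function-value estimate: average of repeated noisy evaluations.
        return np.mean([f_noisy(z) for _ in range(n_samples)])

    for _ in range(max_iter):
        h = max(delta, 1e-8)
        # Model gradient from central finite differences of the estimates.
        g = np.array([(estimate(x + h * e) - estimate(x - h * e)) / (2 * h)
                      for e in np.eye(x.size)])
        gnorm = np.linalg.norm(g)
        if gnorm < 1e-8:
            break
        s = -delta * g / gnorm                 # step to the boundary
        pred = delta * gnorm                   # model-predicted decrease
        ared = estimate(x) - estimate(x + s)   # estimated actual decrease
        if ared >= eta * pred:
            x, delta = x + s, gamma * delta    # successful: accept, expand
        else:
            delta /= gamma                     # unsuccessful: shrink region
    return x
```

The key departure from a deterministic trust-region method, as the abstract emphasizes, is that both the model and the acceptance test use stochastic estimates, so individual steps can be wrongly accepted or rejected; the paper's analysis shows convergence holds anyway when these estimates are accurate with high enough fixed probability.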
Notes
See [8] for details on well-poised sets and how they can be obtained.
References
Bach, F., Moulines, E.: Non-asymptotic analysis of stochastic approximation algorithms for machine learning. In: Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a Meeting Held 12–14 December 2011, Granada, Spain, pp. 451–459 (2011)
Bandeira, A.S., Scheinberg, K., Vicente, L.N.: Convergence of trust-region methods based on probabilistic models. SIAM J. Optim. 24(3), 1238–1264 (2014)
Billups, S.C., Graf, P., Larson, J.: Derivative-free optimization of expensive functions with computational error using weighted regression. SIAM J. Optim. 23(1), 27–53 (2013)
Bottou, L., Curtis, F.E., Nocedal, J.: Optimization methods for large-scale machine learning. Technical report. arXiv:1606.04838 (2016)
Chang, K.H., Li, M.K., Wan, H.: Stochastic trust-region response-surface method (STRONG)—a new response-surface framework for simulation optimization. INFORMS J. Comput. 25(2), 230–243 (2013)
Chen, R.: Stochastic derivative-free optimization of noisy functions. PhD thesis, Department of Industrial and Systems Engineering, Lehigh University, Bethlehem, USA (2015)
Conn, A.R., Scheinberg, K., Vicente, L.N.: Global convergence of general derivative-free trust-region algorithms to first- and second-order critical points. SIAM J. Optim. 20(1), 387–415 (2009)
Conn, A.R., Scheinberg, K., Vicente, L.N.: Introduction to Derivative-Free Optimization. Society for Industrial and Applied Mathematics, Philadelphia (2009)
Defazio, A., Bach, F., Lacoste-Julien, S.: SAGA: a fast incremental gradient method with support for non-strongly convex composite objectives. In: Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N.D., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems, vol. 27, pp. 1646–1654. Curran Associates Inc, Red Hook (2014)
Deng, G., Ferris, M.C.: Variable-number sample-path optimization. Math. Program. 117, 81–109 (2009)
Duchi, J., Hazan, E., Singer, Y.: Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 12, 2121–2159 (2011)
Durrett, R.: Probability: Theory and Examples. Cambridge Series in Statistical and Probabilistic Mathematics, p. 105. Cambridge University Press, Cambridge (2010)
Ghadimi, S., Lan, G.: Accelerated gradient methods for nonconvex nonlinear and stochastic programming. Math. Program. 156(1), 59–99 (2016)
Ghadimi, S., Lan, G.: Stochastic first- and zeroth-order methods for nonconvex stochastic programming. SIAM J. Optim. 23(4), 2341–2368 (2013)
Ghosh, S., Glynn, P.W., Hashemi, F., Pasupathy, R.: On sampling roles in stochastic recursion. SIAM J. Optim. (under review)
Johnson, R., Zhang, T.: Accelerating stochastic gradient descent using predictive variance reduction. In: Burges, C.J.C., Bottou, L., Welling, M., Ghahramani, Z., Weinberger, K.Q. (eds.), Advances in Neural Information Processing Systems (NIPS 2013), vol. 26, pp. 315–323 (2013)
Juditsky, A.B., Polyak, B.T.: Acceleration of stochastic approximation by averaging. SIAM J. Control Optim. 30(4), 838–855 (1992)
Kiefer, J., Wolfowitz, J.: Stochastic estimation of the maximum of a regression function. Ann. Math. Stat. 23(3), 462–466 (1952)
Lan, G.: An optimal method for stochastic composite optimization. Math. Program. 133, 365–397 (2012)
Larson, J., Billups, S.C.: Stochastic derivative-free optimization using a trust region framework. Comput. Optim. Appl. 64(3), 619–645 (2016)
Linderoth, J., Shapiro, A., Wright, S.: The empirical behavior of sampling methods for stochastic programming. Ann. Oper. Res. 142(1), 215–241 (2006)
Robbins, H., Monro, S.: A stochastic approximation method. Ann. Math. Stat. 22(3), 400–407 (1951)
Moré, J.J., Wild, S.M.: Benchmarking derivative-free optimization algorithms. SIAM J. Optim. 20(1), 172–191 (2009)
Nemirovski, A., Juditsky, A., Lan, G., Shapiro, A.: Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 19(4), 1574–1609 (2009)
Pasupathy, R., Ghosh, S.: Simulation optimization: a concise overview and implementation guide. In: Topaloglu, H., Smith, J. C. (eds.) TutORials in Operations Research, chapter 7, pp. 122–150. INFORMS, Catonsville (2013)
Powell, M.J.D.: UOBYQA: unconstrained optimization by quadratic approximation. Math. Program. 92(3), 555–582 (2002)
Richtárik, P., Takáč, M.: Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function. Math. Program. 144(1–2), 1–38 (2014)
Robinson, S.M.: Analysis of sample-path optimization. Math. Oper. Res. 21(3), 513–528 (1996)
Ruszczynski, A., Shapiro, A. (eds.): Stochastic Programming. Handbooks in Operations Research and Management Science, vol. 10. Elsevier, Amsterdam (2003)
Shashaani, S., Hashemi, F.S., Pasupathy, R.: ASTRO-DF: a class of adaptive sampling trust-region algorithms for derivative-free simulation optimization (2015) (under review)
Spall, J.C.: Multivariate stochastic approximation using a simultaneous perturbation gradient approximation. IEEE Trans. Autom. Control 37, 332–341 (1992)
Spall, J.C.: Adaptive stochastic approximation by the simultaneous perturbation method. IEEE Trans. Autom. Control 45(10), 1839–1853 (2000)
Spall, J.C.: Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control. Wiley Series in Discrete Mathematics and Optimization. Wiley, London (2005)
Additional information
R. Chen: The work of this author was partially supported by NSF Grant CCF-1320137 and AFOSR Grant FA9550-11-1-0239. M. Menickelly: The work of this author is partially supported by NSF Grants DMS 13-19356 and CCF-1320137. K. Scheinberg: The work of this author is partially supported by NSF Grants DMS 10-16571, DMS 13-19356, CCF-1320137, AFOSR Grant FA9550-11-1-0239, and DARPA Grant FA 9550-12-1-0406 negotiated by AFOSR.
Cite this article
Chen, R., Menickelly, M. & Scheinberg, K. Stochastic optimization using a trust-region method and random models. Math. Program. 169, 447–487 (2018). https://doi.org/10.1007/s10107-017-1141-8