Abstract
We consider the problem of minimizing a difference-of-convex (DC) function, which can be written as the sum of a smooth convex function with Lipschitz gradient, a proper closed convex function and a continuous, possibly nonsmooth, concave function. We refine the convergence analysis in Wen et al. (Comput Optim Appl 69, 297–324, 2018) for the proximal DC algorithm with extrapolation (\(\hbox {pDCA}_e\)) and show that the whole sequence generated by the algorithm is convergent without imposing differentiability assumptions on the concave part. Our analysis is based on a new potential function, which we assume to be a Kurdyka–Łojasiewicz (KL) function. We also establish a relationship between our KL assumption and the one used in Wen et al. (2018). Finally, we demonstrate how the \(\hbox {pDCA}_e\) can be applied to a class of simultaneous sparse recovery and outlier detection problems arising from robust compressed sensing in signal processing and least trimmed squares regression in statistics. Specifically, we show that the objectives of these problems can be written as level-bounded DC functions whose concave parts are typically nonsmooth. Moreover, for a large class of loss functions and regularizers, the KL exponent of the corresponding potential function is shown to be 1/2, which implies that the \(\hbox {pDCA}_e\) is locally linearly convergent when applied to these problems. Our numerical experiments show that the \(\hbox {pDCA}_e\) usually outperforms the proximal DC algorithm with nonmonotone linesearch (Liu et al. in Math Program, 2018. https://doi.org/10.1007/s10107-018-1327-8, Appendix A) in both CPU time and solution quality for this particular application.
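To fix ideas, the iteration underlying the \(\hbox {pDCA}_e\) can be sketched in a few lines. The sketch below is an illustrative instance only, not the paper's exact setup: it applies the algorithm to the \(\ell _{1-2}\) regularized least-squares model of Yin et al. (2015), \(\min _x \frac{1}{2}\Vert Ax-b\Vert ^2 + \lambda (\Vert x\Vert _1 - \Vert x\Vert _2)\), using a FISTA-style extrapolation schedule as one common choice of momentum parameters. The function name `pdca_e` and all parameter values are hypothetical.

```python
import numpy as np

def pdca_e(A, b, lam, iters=500):
    """Sketch of the proximal DC algorithm with extrapolation (pDCA_e)
    for min_x 0.5*||Ax - b||^2 + lam*(||x||_1 - ||x||_2).
    Smooth convex part: 0.5*||Ax - b||^2; proper closed convex part:
    lam*||x||_1; concave part: -lam*||x||_2.  The FISTA-style momentum
    schedule is an illustrative choice, not the paper's exact one."""
    m, n = A.shape
    L = np.linalg.norm(A, 2) ** 2           # Lipschitz constant of the gradient
    x = x_old = np.zeros(n)
    t = t_old = 1.0
    for _ in range(iters):
        # subgradient xi of the (negated) concave part, xi in d(lam*||x||_2)
        nx = np.linalg.norm(x)
        xi = lam * x / nx if nx > 0 else np.zeros(n)
        # extrapolated point y^k = x^k + beta_k (x^k - x^{k-1})
        beta = (t_old - 1.0) / t
        y = x + beta * (x - x_old)
        # proximal gradient step on the remaining convex model:
        # x^{k+1} = prox_{(lam/L)*||.||_1}( y - (grad f(y) - xi)/L )
        grad = A.T @ (A @ y - b)
        u = y - (grad - xi) / L
        x_old = x
        x = np.sign(u) * np.maximum(np.abs(u) - lam / L, 0.0)  # soft-thresholding
        t_old, t = t, (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
    return x
```

The only difference from a standard proximal gradient step with extrapolation is the correction term \(\xi ^k \in \partial P_2(x^k)\), which linearizes the concave part at the current iterate; when \(P_2 = \lambda \Vert \cdot \Vert _2\) this subgradient is available in closed form even at the nonsmooth point \(x = 0\).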
Notes
The requirement \(h(\bar{\varvec{x}}) < h(\varvec{x})\) is dropped because (21) holds trivially when \(h(\bar{\varvec{x}}) \ge h(\varvec{x})\).
As mentioned before, with this choice of subgradient in Algorithm 1, the algorithm is equivalent to Algorithm 2.
Notice from \(\psi _i(s) = \frac{1}{2}(s-b_i)^2\), (33) and the definition of \(\varvec{z}^{k+1}\) that \(\varvec{A}^\top \varvec{z}^{k+1}\in \partial Q(\varvec{x}^k)\). This together with \({\varvec{\eta }}^{k+1}\in \partial {\mathcal {J}}_2({\varvec{x}}^k)\) and \(g = {\mathcal {J}}_2 + Q\) gives \(\varvec{\zeta }^k\in \partial g(\varvec{x}^k)\).
References
Alfons, A., Croux, C., Gelper, S.: Sparse least trimmed squares regression for analyzing high-dimensional large data sets. Ann. Appl. Stat. 7, 226–248 (2013)
Attouch, H., Bolte, J.: On the convergence of the proximal algorithm for nonsmooth functions involving analytic features. Math. Program. 116, 5–16 (2009)
Attouch, H., Bolte, J., Redont, P., Soubeyran, A.: Proximal alternating minimization and projection methods for nonconvex problems: an approach based on the Kurdyka-Łojasiewicz inequality. Math. Oper. Res. 35, 438–457 (2010)
Attouch, H., Bolte, J., Svaiter, B.F.: Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods. Math. Program. 137, 91–129 (2013)
Blumensath, T., Davies, M.E.: Iterative hard thresholding for compressed sensing. Appl. Comput. Harmon. Anal. 27, 265–274 (2009)
Bolte, J., Sabach, S., Teboulle, M.: Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Program. 146, 459–494 (2014)
Candès, E.J., Tao, T.: Decoding by linear programming. IEEE Trans. Inf. Theory 51, 4203–4215 (2005)
Candès, E.J., Romberg, J.K., Tao, T.: Stable signal recovery from incomplete and inaccurate measurements. Commun. Pure Appl. Math. 59, 1207–1223 (2006)
Candès, E.J., Wakin, M.B., Boyd, S.P.: Enhancing sparsity by reweighted \(\ell _1\) minimization. J. Fourier Anal. Appl. 14, 877–905 (2008)
Carrillo, R.E., Ramirez, A.B., Arce, G.R., Barner, K.E., Sadler, B.M.: Robust compressive sensing of sparse signals: a review. EURASIP J. Adv. Signal Process. 2016, 108 (2016)
Chambolle, A., Dossal, Ch.: On the convergence of the iterates of the “fast iterative shrinkage/thresholding algorithm”. J. Optim. Theory Appl. 166, 968–982 (2015)
Chartrand, R.: Exact reconstructions of sparse signals via nonconvex minimization. IEEE Signal Process. Lett. 14, 707–710 (2007)
Chartrand, R., Yin, W.: Iteratively reweighted algorithms for compressive sensing. In: IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 3869–3872 (2008)
Donoho, D.L.: Compressed sensing. IEEE Trans. Inf. Theory 52, 1289–1306 (2006)
Fan, J., Li, R.: Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 96, 1348–1360 (2001)
Foucart, S., Lai, M.J.: Sparsest solutions of underdetermined linear systems via \(\ell _q\)-minimization for \(0 < q \le 1\). Appl. Comput. Harmon. Anal. 26, 395–407 (2009)
Giloni, A., Padberg, M.: Least trimmed squares regression, least median squares regression, and mathematical programming. Math. Comput. Model. 35, 1043–1060 (2002)
Gong, P., Zhang, C., Lu, Z., Huang, J., Ye, J.: A general iterative shrinkage and thresholding algorithm for non-convex regularized optimization problems. In: International Conference on Machine Learning, pp. 37–45 (2013)
Gotoh, J., Takeda, A., Tono, K.: DC formulations and algorithms for sparse optimization problems. Math. Program. 169, 141–176 (2018)
Hoeting, J., Raftery, A.E., Madigan, D.: A method for simultaneous variable selection and outlier identification in linear regression. Comput. Stat. Data Anal. 22, 251–270 (1996)
Li, G., Pong, T.K.: Calculus of the exponent of Kurdyka-Łojasiewicz inequality and its applications to linear convergence of first-order methods. Found. Comput. Math. 18, 1199–1232 (2018)
Liu, T., Pong, T.K., Takeda, A.: A successive difference-of-convex approximation method for a class of nonconvex nonsmooth optimization problems. Math. Program. (to appear). https://doi.org/10.1007/s10107-018-1327-8
Loh, P.-L.: Statistical consistency and asymptotic normality for high-dimensional robust M-estimators. Ann. Stat. 45, 866–896 (2017)
Lu, Z., Zhang, Y.: Sparse approximation via penalty decomposition methods. SIAM J. Optim. 23, 2448–2478 (2013)
Menjoge, R.S., Welsch, R.E.: A diagnostic method for simultaneous feature selection and outlier identification in linear regression. Comput. Stat. Data Anal. 54, 3181–3193 (2010)
Natarajan, B.K.: Sparse approximate solutions to linear systems. SIAM J. Comput. 24, 227–234 (1995)
Pham, D.T., Le Thi, H.A.: Convex analysis approach to DC programming: theory, algorithms and applications. Acta Math. Vietnam. 22, 289–355 (1997)
Pham, D.T., Le Thi, H.A.: A DC optimization algorithm for solving the trust-region subproblem. SIAM J. Optim. 8, 476–505 (1998)
Polania, L.F., Carrillo, R.E., Blanco-Velasco, M., Barner, K.E.: Compressive sensing for ECG signals in the presence of electromyography noise. In: Proceedings of the 38th Annual Northeast Bioengineering Conference, pp. 295–296 (2012)
Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Springer, Berlin (1998)
Rousseeuw, P.J.: Regression techniques with high breakdown point. Inst. Math. Stat. Bull. 12, 155 (1983)
Rousseeuw, P.J., Leroy, A.M.: Robust Regression and Outlier Detection. Wiley, New York (1987)
Saab, R., Chartrand, R., Yilmaz, O.: Stable sparse approximations via nonconvex optimization. In: IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 3885–3888 (2008)
She, Y., Owen, A.B.: Outlier detection using nonconvex penalized regression. J. Am. Stat. Assoc. 106, 626–639 (2011)
Smucler, E., Yohai, V.J.: Robust and sparse estimators for linear regression models. Comput. Stat. Data Anal. 111, 116–130 (2017)
Tibshirani, R., Taylor, J.: The solution path of the generalized lasso. Ann. Stat. 39, 1335–1371 (2011)
Tuy, H.: Convex Analysis and Global Optimization. Springer, Berlin (2016)
Wang, Y., Luo, Z., Zhang, X.: New improved penalty methods for sparse reconstruction based on difference of two norms. https://doi.org/10.13140/RG.2.1.3256.3369
Wen, B., Chen, X., Pong, T.K.: Linear convergence of proximal gradient algorithm with extrapolation for a class of nonconvex nonsmooth minimization problems. SIAM J. Optim. 27, 124–145 (2017)
Wen, B., Chen, X., Pong, T.K.: A proximal difference-of-convex algorithm with extrapolation. Comput. Optim. Appl. 69, 297–324 (2018)
Yin, P., Lou, Y., He, Q., Xin, J.: Minimization of \(\ell _{1-2}\) for compressed sensing. SIAM J. Sci. Comput. 37, A536–A563 (2015)
Zhang, C.H.: Nearly unbiased variable selection under minimax concave penalty. Ann. Stat. 38, 894–942 (2010)
Additional information
Ting Kei Pong was supported in part by Hong Kong Research Grants Council PolyU153005/17p. Akiko Takeda was supported in part by JSPS KAKENHI Grant Number 15K00031.
Cite this article
Liu, T., Pong, T.K. & Takeda, A. A refined convergence analysis of \(\hbox {pDCA}_{e}\) with applications to simultaneous sparse recovery and outlier detection. Comput Optim Appl 73, 69–100 (2019). https://doi.org/10.1007/s10589-019-00067-z