Abstract
Large-scale sparse precision matrix estimation has attracted wide interest from the statistics community. The convex partial correlation selection method (CONCORD) developed by Khare et al. (J R Stat Soc Ser B (Stat Methodol) 77(4):803–825, 2015) has recently attracted attention for its theoretical properties in estimating sparse precision matrices. CONCORD obtains its solution by a coordinate descent algorithm (CONCORD-CD) that exploits the convexity of the objective function. However, because the coordinate-wise updates in CONCORD-CD are inherently serial, scaling it up is nontrivial. In this paper, we propose a novel parallelization of CONCORD-CD, namely CONCORD-PCD. CONCORD-PCD partitions the off-diagonal elements into several groups and updates the elements within each group simultaneously without harming the computational convergence of CONCORD-CD. We guarantee this by employing the notion of edge coloring from graph theory. Specifically, we establish a nontrivial correspondence between scheduling the updates of the off-diagonal elements in CONCORD-CD and coloring the edges of a complete graph. It turns out that CONCORD-PCD simultaneously updates the off-diagonal elements whose associated edges can be assigned the same color. As a result, the number of steps required for updating the off-diagonal elements is reduced from \(p(p-1)/2\) to \(p-1\) (for even \(p\)) or \(p\) (for odd \(p\)), where \(p\) denotes the number of variables. We prove that this number of steps cannot be reduced further. In addition, CONCORD-PCD is tailored to single-instruction multiple-data (SIMD) parallelism. A numerical study shows that the SIMD-parallelized PCD algorithm, implemented on graphics processing units, runs several times faster than the CONCORD-CD algorithm. The method is available in the R package pcdconcord.
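The correspondence between update scheduling and edge coloring can be illustrated with a small sketch. The Python code below is not taken from the pcdconcord package; it uses the classical circle method for round-robin tournaments to color the edges of a complete graph on \(p\) vertices, on the assumption that off-diagonal updates whose index pairs share no variable may run concurrently. The function name `edge_coloring_complete_graph` and its return format are hypothetical choices made for illustration.

```python
from itertools import combinations


def edge_coloring_complete_graph(p):
    """Group all pairs (i, j), i < j, of range(p) into color classes.

    Each class contains pairwise disjoint pairs, so (under the assumption
    stated above) the entries in one class could be updated together.
    Uses the circle method for round-robin scheduling; this is an
    illustrative sketch, not the authors' implementation.
    """
    # For odd p, add a dummy vertex; pairs that involve it are dropped.
    n = p if p % 2 == 0 else p + 1
    order = list(range(n))
    classes = []
    for _ in range(n - 1):
        pairs = []
        for k in range(n // 2):
            a, b = order[k], order[n - 1 - k]
            if a < p and b < p:  # skip the dummy vertex
                pairs.append((min(a, b), max(a, b)))
        classes.append(pairs)
        # Keep the first vertex fixed and rotate the others by one position.
        order = [order[0]] + [order[-1]] + order[1:-1]
    return classes


if __name__ == "__main__":
    for step, pairs in enumerate(edge_coloring_complete_graph(6), start=1):
        print(f"step {step}: {pairs}")
```

For even \(p\) the sketch returns \(p-1\) classes of \(p/2\) pairs each, and for odd \(p\) it returns \(p\) classes of \((p-1)/2\) pairs each. A simple counting argument is consistent with the irreducibility claim: a class of disjoint pairs holds at most \(\lfloor p/2 \rfloor\) pairs, so covering all \(p(p-1)/2\) pairs needs at least \(p-1\) classes when \(p\) is even and \(p\) classes when \(p\) is odd.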
References
Barabási A-L, Albert R (1999) Emergence of scaling in random networks. Science 286(5439):509–512
Bradley JK, Kyrola A, Bickson D, Guestrin C (2011) Parallel coordinate descent for L1-regularized loss minimization. In: Proceedings of the 28th international conference on machine learning, ICML 2011, pp 321–328
Cai T, Liu W, Luo X (2011) A constrained l1 minimization approach to sparse precision matrix estimation. J Am Stat Assoc 106(494):594–607
Cai TT, Liu W, Zhou HH (2016) Estimating sparse precision matrix: optimal rates of convergence and adaptive estimation. Ann Stat 44(2):455–488
Danaher P, Wang P, Witten DM (2014) The joint graphical lasso for inverse covariance estimation across multiple classes. J R Stat Soc Ser B Stat Methodol 76(2):373–397
Dinitz JH, Froncek D, Lamken ER, Wallis WD (2006) Scheduling a tournament. In: Handbook of combinatorial designs, chapter VI.51, 2nd edn. Chapman & Hall/CRC, pp 591–606
Formanowicz P, Tanaś K (2012) A survey of graph coloring—its types, methods and applications. Found Comput Decis Sci 37(3):223–238
Friedman J, Hastie T, Tibshirani R (2008) Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9(3):432–441
Hsieh C-J (2014) QUIC: quadratic approximation for sparse inverse covariance estimation. J Mach Learn Res 15:2911–2947
Hsieh C-J, Sustik MA, Dhillon IS, Ravikumar PK, Poldrack R (2013) BIG & QUIC: sparse inverse covariance estimation for a million variables. In: Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ (eds) Advances in neural information processing systems, vol 26. Curran Associates Inc, Red Hook, pp 3165–3173
Khare K, Oh S-Y, Rajaratnam B (2015) A convex pseudolikelihood framework for high dimensional partial correlation estimation with convergence guarantees. J R Stat Soc Ser B (Stat Methodol) 77(4):803–825
Lawson C, Hanson R, Kincaid D, Krogh F (1979) Algorithm 539: basic linear algebra subprograms for Fortran usage. ACM Trans Math Softw 5(3):308–323
Mazumder R, Hastie T (2012) The graphical Lasso: new insights and alternatives. Electron J Stat 6(August):2125–2149
Meinshausen N, Bühlmann P (2006) High-dimensional graphs and variable selection with the Lasso. Ann Stat 34(3):1436–1462
Nakano S-I, Zhou X, Nishizeki T (1995) Edge-coloring algorithms. In: Computer science today. Lecture notes in computer science. Springer, Berlin, vol 1000, pp 172–183
Newman MEJ (2003) The structure and function of complex networks. SIAM Rev 45(2):167–256
Pang H, Liu H, Vanderbei R (2014) The FASTCLIME package for linear programming and large-scale precision matrix estimation in R. J Mach Learn Res 15:489–493
Peng J, Wang P, Zhou N, Zhu J (2009) Partial correlation estimation by joint sparse regression models. J Am Stat Assoc 104(486):735–746
Richtárik P, Takáč M (2016) Parallel coordinate descent methods for big data optimization. Math Program 156:433–484
Sun T, Zhang CH (2013) Sparse matrix inversion with scaled Lasso. J Mach Learn Res 14:3385–3418
Tseng P (2001) Convergence of a block coordinate descent method for nondifferentiable minimization. J Optim Theory Appl 109(3):475–494
Wang H, Banerjee A, Hsieh C-J, Ravikumar PK, Dhillon IS (2013) Large scale distributed sparse precision estimation. In: Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ (eds) Advances in neural information processing systems, vol 26. Curran Associates Inc, Red Hook, pp 584–592
Witten DM, Friedman JH, Simon N (2011) New insights and faster computations for the graphical lasso. J Comput Graph Stat 20(4):892–900
Yu D, Lee SH, Lim J, Xiao G, Craddock RC, Biswal BB (2018) Fused lasso regression for identifying differential correlations in brain connectome graphs. Stat Anal Data Min ASA Data Sci J 11(5):203–226
Yuan M, Lin Y (2007) Model selection and estimation in the Gaussian graphical model. Biometrika 94:19–35
Acknowledgements
This research was supported by the National Research Foundation of Korea (NRF-2018R1C1B6001108), Inha University Research Grant, and Sookmyung Women’s University Research Grant (No. 1-2003-2004).
About this article
Cite this article
Choi, YG., Lee, S. & Yu, D. An efficient parallel block coordinate descent algorithm for large-scale precision matrix estimation using graphics processing units. Comput Stat 37, 419–443 (2022). https://doi.org/10.1007/s00180-021-01127-x
DOI: https://doi.org/10.1007/s00180-021-01127-x