An efficient parallel block coordinate descent algorithm for large-scale precision matrix estimation using graphics processing units

  • Original paper
  • Published in Computational Statistics

Abstract

Large-scale sparse precision matrix estimation has attracted wide interest from the statistics community. The convex partial correlation selection method (CONCORD) developed by Khare et al. (J R Stat Soc Ser B (Stat Methodol) 77(4):803–825, 2015) has recently been credited with desirable theoretical properties for estimating sparse precision matrices. CONCORD obtains its solution via a coordinate descent algorithm (CONCORD-CD) that exploits the convexity of the objective function. However, because the coordinate-wise updates in CONCORD-CD are inherently serial, scaling the algorithm up is nontrivial. In this paper, we propose a novel parallelization of CONCORD-CD, namely CONCORD-PCD. CONCORD-PCD partitions the off-diagonal elements into several groups and updates each group simultaneously without harming the computational convergence of CONCORD-CD. We guarantee this by employing the notion of edge coloring from graph theory. Specifically, we establish a nontrivial correspondence between scheduling the updates of the off-diagonal elements in CONCORD-CD and coloring the edges of a complete graph: CONCORD-PCD simultaneously updates the off-diagonal elements whose associated edges are colorable with the same color. As a result, the number of steps required for updating the off-diagonal elements is reduced from \(p(p-1)/2\) to \(p-1\) (for even \(p\)) or \(p\) (for odd \(p\)), where \(p\) denotes the number of variables, and we prove that this number of steps is irreducible. In addition, CONCORD-PCD is tailored to single-instruction multiple-data (SIMD) parallelism. A numerical study shows that the SIMD-parallelized PCD algorithm, implemented on graphics processing units, speeds up the CONCORD-CD algorithm severalfold. The method is available in the R package pcdconcord.

References

  • Barabási A-L, Albert R (1999) Emergence of scaling in random networks. Science 286(5439):509–512

  • Bradley JK, Kyrola A, Bickson D, Guestrin C (2011) Parallel coordinate descent for L1-regularized loss minimization. In: Proceedings of the 28th international conference on machine learning, ICML 2011, pp 321–328

  • Cai T, Liu W, Luo X (2011) A constrained ℓ1 minimization approach to sparse precision matrix estimation. J Am Stat Assoc 106(494):594–607

  • Cai TT, Liu W, Zhou HH (2016) Estimating sparse precision matrix: optimal rates of convergence and adaptive estimation. Ann Stat 44(2):455–488

  • Danaher P, Wang P, Witten DM (2014) The joint graphical lasso for inverse covariance estimation across multiple classes. J R Stat Soc Ser B Stat Methodol 76(2):373–397

  • Dinitz JH, Froncek D, Lamken ER, Wallis WD (2006) Scheduling a tournament. In: Handbook of combinatorial designs, chapter VI.51, 2nd edn. Chapman & Hall/CRC, pp 591–606

  • Formanowicz P, Tanaś K (2012) A survey of graph coloring—its types, methods and applications. Found Comput Decis Sci 37(3):223–238

  • Friedman J, Hastie T, Tibshirani R (2008) Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9(3):432–441

  • Hsieh C-J, Sustik MA, Dhillon IS, Ravikumar P (2014) QUIC: quadratic approximation for sparse inverse covariance estimation. J Mach Learn Res 15:2911–2947

  • Hsieh C-J, Sustik MA, Dhillon IS, Ravikumar PK, Poldrack R (2013) BIG & QUIC: sparse inverse covariance estimation for a million variables. In: Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ (eds) Advances in neural information processing systems, vol 26. Curran Associates Inc, Red Hook, pp 3165–3173

  • Khare K, Oh S-Y, Rajaratnam B (2015) A convex pseudolikelihood framework for high dimensional partial correlation estimation with convergence guarantees. J R Stat Soc Ser B (Stat Methodol) 77(4):803–825

  • Lawson C, Hanson R, Kincaid D, Krogh F (1979) Algorithm 539: basic linear algebra subprograms for Fortran usage. ACM Trans Math Softw 5(3):308–323

  • Mazumder R, Hastie T (2012) The graphical Lasso: new insights and alternatives. Electron J Stat 6:2125–2149

  • Meinshausen N, Bühlmann P (2006) High-dimensional graphs and variable selection with the Lasso. Ann Stat 34(3):1436–1462

  • Nakano S-I, Zhou X, Nishizeki T (1995) Edge-coloring algorithms. In: Computer science today. Lecture notes in computer science. Springer, Berlin, vol 1000, pp 172–183

  • Newman MEJ (2003) The structure and function of complex networks. SIAM Rev 45(2):167–256

  • Pang H, Liu H, Vanderbei R (2014) The FASTCLIME package for linear programming and large-scale precision matrix estimation in R. J Mach Learn Res 15:489–493

  • Peng J, Wang P, Zhou N, Zhu J (2009) Partial correlation estimation by joint sparse regression models. J Am Stat Assoc 104(486):735–746

  • Richtárik P, Takáč M (2016) Parallel coordinate descent methods for big data optimization. Math Program 156(1–2):433–484

  • Sun T, Zhang CH (2013) Sparse matrix inversion with scaled Lasso. J Mach Learn Res 14:3385–3418

  • Tseng P (2001) Convergence of a block coordinate descent method for nondifferentiable minimization. J Optim Theory Appl 109(3):475–494

  • Wang H, Banerjee A, Hsieh C-J, Ravikumar PK, Dhillon IS (2013) Large scale distributed sparse precision estimation. In: Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ (eds) Advances in neural information processing systems, vol 26. Curran Associates Inc, Red Hook, pp 584–592

  • Witten DM, Friedman JH, Simon N (2011) New insights and faster computations for the graphical lasso. J Comput Graph Stat 20(4):892–900

  • Yu D, Lee SH, Lim J, Xiao G, Craddock RC, Biswal BB (2018) Fused lasso regression for identifying differential correlations in brain connectome graphs. Stat Anal Data Min ASA Data Sci J 11(5):203–226

  • Yuan M, Lin Y (2007) Model selection and estimation in the Gaussian graphical model. Biometrika 94:19–35

Acknowledgements

This research was supported by the National Research Foundation of Korea (NRF-2018R1C1B6001108), Inha University Research Grant, and Sookmyung Women’s University Research Grant (No. 1-2003-2004).

Author information

Corresponding author

Correspondence to Donghyeon Yu.

About this article

Cite this article

Choi, YG., Lee, S. & Yu, D. An efficient parallel block coordinate descent algorithm for large-scale precision matrix estimation using graphics processing units. Comput Stat 37, 419–443 (2022). https://doi.org/10.1007/s00180-021-01127-x
