Parallel Computing, Volume 27, Issue 7, June 2001, Pages 963-976

Parallel preconditioning of a sparse eigensolver

https://doi.org/10.1016/S0167-8191(01)00077-1

Abstract

We exploit an optimization method, called deflation-accelerated conjugate gradient (DACG), which sequentially computes the smallest eigenpairs of a symmetric, positive definite, generalized eigenproblem by conjugate gradient (CG) minimizations of the Rayleigh quotient over deflated subspaces. We analyze the effectiveness of the AINV and FSAI approximate inverse preconditioners in accelerating DACG for the solution of finite element and finite difference eigenproblems. Deflation is accomplished via CGS and MGS orthogonalization strategies, whose accuracy and efficiency are tested. Numerical tests performed on a Cray T3E Supercomputer show the high degree of parallelism attainable by the code. We found that both AINV and FSAI are effective preconditioners for our DACG algorithm, and that they are more efficient than Block–Jacobi.

Introduction

An important task in many scientific applications is the computation of a small number of the leftmost eigenpairs (the smallest eigenvalues and corresponding eigenvectors) of the problem Ax = λBx, where A and B are large, sparse, symmetric positive definite matrices. Several techniques for solving this problem have been proposed: subspace iteration [1], [15], the Lanczos method [7], [11], [14], and, more recently, the restarted Arnoldi–Lanczos algorithm [12], the Jacobi–Davidson method [17], and optimization methods based on conjugate gradient (CG) schemes [3], [9], [16].

In this paper we analyze the performance of two preconditioning techniques when applied to an optimization method called deflation-accelerated conjugate gradient (DACG) [8]. DACG sequentially computes a number of eigenpairs by CG minimizations of the Rayleigh quotient over subspaces of decreasing size. When effectively preconditioned, we found [4] that the efficiency of DACG compares well with that of established packages such as ARPACK [13]. In a recent work [5], the performance of DACG has also been compared numerically with that of the Jacobi–Davidson method, showing that their efficiencies are comparable when a small number of eigenpairs is to be computed.

We exploit three preconditioners: Block–Jacobi, FSAI [10], and AINV [2], the latter two falling into the class of approximate inverse preconditioners. FSAI and AINV explicitly compute an approximation M to A^{-1}, based on a sparse factorization of A^{-1}. Preconditioning by a product of triangular factors performs better than other techniques, mainly because the fill-in of the preconditioner is reduced. Unlike many other approximate inverse techniques, AINV and FSAI preserve the positive definiteness of the problem, which is essential in our application. The FSAI algorithm requires an a priori sparsity pattern for the approximate factor, which is not easy to provide for unstructured sparse problems; we generated the FSAI preconditioner using the same pattern as the matrix A. AINV, on the other hand, is based upon a drop tolerance, ε, which in principle is more convenient for our unstructured problems; the influence of the drop tolerance has been tested. Deflation is accomplished via B-orthogonalization of the search directions. We analyzed both classical (CGS) and modified (MGS) Gram–Schmidt orthogonalization, and tested the accuracy and efficiency of both strategies (a sketch of the two variants is given below).
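As an illustration only (this is our own minimal sketch, not the paper's code), the two deflation variants can be written as follows in Python/NumPy, assuming the columns of U hold the previously computed eigenvectors, B-orthonormalized so that U^t B U = I:

```python
import numpy as np

def b_orthogonalize_cgs(v, U, B):
    """Classical Gram-Schmidt (CGS): B-orthogonalize v against columns of U.

    All projection coefficients come from the original v, so they can be
    obtained with one matrix-vector product and one block of inner
    products -- cheap and parallel-friendly, but less robust in floating
    point than MGS.
    """
    return v - U @ (U.T @ (B @ v))

def b_orthogonalize_mgs(v, U, B):
    """Modified Gram-Schmidt (MGS): project against one column at a time.

    Each coefficient uses the updated v, which improves the achieved
    orthogonality but serializes the inner products.
    """
    v = v.copy()
    for j in range(U.shape[1]):
        u = U[:, j]
        v -= (u @ (B @ v)) * u   # recomputing B @ v keeps the sketch short
    return v
```

In exact arithmetic the two functions return the same vector; the trade-off tested in the paper is between the better floating-point accuracy of MGS and the lower cost and synchronization count of CGS.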

We have exploited the parallel Block–Jacobi-DACG, AINV–DACG and FSAI–DACG algorithms in the solution of finite element, mixed finite element, and finite difference eigenproblems, both in two and three dimensions. A parallel implementation of the DACG algorithm has been coded via a data-parallel approach, allowing preconditioning by any given approximate inverse. Ad hoc data-distribution techniques reduce the amount of communication among the processors, which could otherwise spoil the parallel performance of the ensuing code. An efficient routine for performing matrix–vector products was designed and implemented (a simplified sketch is given below). Numerical tests on a Cray T3E Supercomputer show the high degree of parallelism attainable by the code, and its good scalability.
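The paper does not reproduce this routine; purely as an illustration of a row-block layout (the simplest data-parallel distribution, not the communication-minimizing one described above), a matrix–vector product could be sketched with mpi4py and SciPy as follows, where all sizes and names are hypothetical:

```python
import numpy as np
from mpi4py import MPI
from scipy import sparse

comm = MPI.COMM_WORLD
rank, nproc = comm.Get_rank(), comm.Get_size()

N = 1200                                             # hypothetical global size
counts = [N // nproc + (r < N % nproc) for r in range(nproc)]
displs = [sum(counts[:r]) for r in range(nproc)]

# Each process stores only its block of rows of A and its slice of x.
A_loc = sparse.random(counts[rank], N, density=0.01,
                      format="csr", random_state=rank)
x_loc = np.random.default_rng(rank).random(counts[rank])

# Assemble the full x on every process, then multiply the local row block;
# y_loc is this process's slice of y = A x.
x_full = np.empty(N)
comm.Allgatherv(x_loc, [x_full, counts, displs, MPI.DOUBLE])
y_loc = A_loc @ x_full
```

Run under, e.g., `mpirun -np 4`. The all-gather replicates the whole vector on every process; the ad hoc distributions mentioned above aim to avoid exactly this, limiting exchanges to the x-entries each process actually needs so that communication does not dominate at larger processor counts.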

Section snippets

AINV and FSAI preconditioners

Let A be a symmetric positive definite N×N matrix.

The approximate inverse preconditioner AINV, which was developed in [2] for linear systems, relies upon the following idea: one can evaluate A^{-1} by a biconjugation process applied to an arbitrary set of linearly independent vectors, a convenient choice being the canonical basis (e_1, …, e_N). This process produces a unit upper triangular matrix Z̃ and a diagonal matrix D such that A^{-1} = Z̃ D^{-1} Z̃^t. Actually, even with sparse A, the factor Z̃ is usually dense; sparsity is preserved by dropping, during the process, entries smaller than the tolerance ε.
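Concretely (a dense, minimal sketch of ours, assuming A symmetric positive definite; not the implementation of [2]), the biconjugation amounts to A-orthogonalizing the canonical basis by Gram–Schmidt, with a drop tolerance eps applied to each column:

```python
import numpy as np

def ainv_factors(A, eps=0.0):
    """Biconjugation (A-orthogonalization of e_1, ..., e_N) for SPD A.

    Returns a unit upper triangular Z and a vector d such that, for eps = 0,
    Z.T @ A @ Z = diag(d) exactly, hence inv(A) = Z @ diag(1/d) @ Z.T.
    For eps > 0, entries below eps are dropped, giving a sparse approximate
    inverse in factored form, in the spirit of AINV [2].
    """
    N = A.shape[0]
    Z = np.eye(N)
    d = np.empty(N)
    for i in range(N):
        z = Z[:, i].copy()                    # z starts as e_i
        for j in range(i):                    # A-orthogonalize vs. z_1..z_{i-1}
            z -= ((Z[:, j] @ (A @ z)) / d[j]) * Z[:, j]
        z[np.abs(z) < eps] = 0.0              # drop tolerance keeps Z sparse
        z[i] = 1.0                            # preserve the unit diagonal
        Z[:, i] = z
        d[i] = z @ (A @ z)
    return Z, d
```

For eps = 0 the identity inv(A) = Z @ diag(1/d) @ Z.T holds exactly; with eps > 0 (the paper reports ε = 0.05) Z stays sparse, and Z D^{-1} Z^t serves as the preconditioner M.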

Parallel DACG algorithm

Our DACG algorithm sequentially computes the eigenpairs, starting from the leftmost one (λ_1, u_1). To evaluate the jth eigenpair, j > 1, DACG minimizes the Rayleigh quotient in a subspace orthogonal to the j−1 eigenvectors previously computed. More precisely, DACG minimizes

q(z) = (z^t A z)/(z^t B z),  where  z = x − U_j U_j^t B x,  U_j = [u_1, …, u_{j−1}],  x ∈ R^N.    (1)

The first eigenpair (λ_1, u_1) is obtained by minimization of (1) with z = x (U_1 = ∅). Let M be a preconditioning matrix. The s leftmost eigenpairs are computed by the following preconditioned conjugate gradient procedure, sketched below.
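What follows is only a minimal sketch of such a scheme (ours; it omits the restarts, tolerances and safeguards of the actual DACG of [8]), written in Python/NumPy with the notation of Eq. (1); M is any SPD approximation of A^{-1}:

```python
import numpy as np

def dacg(A, B, M, s, itmax=2000, tol=1e-10):
    """Minimal sketch of deflation-accelerated CG (simplified from [8]).

    Minimizes q(x) = (x' A x)/(x' B x) by preconditioned nonlinear CG;
    each converged eigenvector is B-normalized, appended to U, and later
    iterates are kept B-orthogonal to U (deflation, CGS variant).
    """
    N = A.shape[0]
    U = np.zeros((N, 0))
    lams = []
    rng = np.random.default_rng(0)
    for j in range(s):
        x = rng.standard_normal(N)
        x -= U @ (U.T @ (B @ x))                  # z = x - U U' B x
        p = np.zeros(N)
        gMg_old = 1.0
        for it in range(itmax):
            Ax, Bx = A @ x, B @ x
            q = (x @ Ax) / (x @ Bx)               # Rayleigh quotient
            g = (Ax - q * Bx) * (2.0 / (x @ Bx))  # gradient of q at x
            if np.linalg.norm(g) <= tol * abs(q):
                break
            Mg = M @ g                            # preconditioning
            Mg -= U @ (U.T @ (B @ Mg))            # deflate the search space
            gMg = g @ Mg
            beta = 0.0 if it == 0 else gMg / gMg_old
            gMg_old = gMg
            p = -Mg + beta * p
            # Exact line search: stationary points of q(x + alpha p) solve
            # a quadratic in alpha; pick the root with the smaller q.
            a, b, c = x @ Ax, x @ (A @ p), p @ (A @ p)
            am, bm, cm = x @ Bx, x @ (B @ p), p @ (B @ p)
            roots = np.roots([c * bm - b * cm,
                              c * am - a * cm,
                              b * am - a * bm])
            alpha = min((r.real for r in roots),
                        key=lambda t: (a + 2 * b * t + c * t * t)
                                      / (am + 2 * bm * t + cm * t * t))
            x = x + alpha * p
        x /= np.sqrt(x @ (B @ x))                 # B-normalize u_j
        U = np.column_stack([U, x])
        lams.append(q)
    return np.array(lams), U
```

The projection of the preconditioned gradient reproduces the z = x − U_j U_j^t B x deflation of Eq. (1) in its CGS form, and the step length α is the exact minimizer of q along the search direction, obtained as a root of a quadratic.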

Numerical tests

We now report numerical results obtained by applying the DACG procedure to a number of finite element, mixed finite element, and finite difference problems. The computations were performed on the T3E 1200 machine of the CINECA computing center, located in Bologna, Italy. The machine is a stand-alone system made up of DEC-Alpha 21164 processing elements (PEs), performing at a peak rate of 1200 Mflop/s. The PEs are interconnected by a 3D toroidal network with a 480 MByte/s payload bandwidth.

Conclusions

The following points are worth emphasizing.

  • Choosing the nonzero pattern of A when computing FSAI, and setting ε = 0.05 when evaluating the AINV factor, yielded equally satisfactory preconditioners for our DACG procedure.

  • AINV–DACG and FSAI–DACG displayed comparable speedups. For p = 32 processors, FSAI–DACG usually showed a slightly better parallelization level, in some cases better than Block–Jacobi-DACG, which nonetheless confirmed its good parallel performance.

  • The AINV and FSAI techniques …

Acknowledgements

This work has been supported in part by the Italian MURST Project “Analisi Numerica: Metodi e Software Matematico” and by CNR contract 98.01022.CT01. Free accounting units on the T3E Supercomputer were granted by CINECA under a framework research grant. We thank Rich Lehoucq for providing useful suggestions.

References (17)

  • G. Gambolati et al., An orthogonal accelerated deflation technique for large symmetric eigenproblems, Comput. Methods Appl. Mech. Eng. (1992)
  • F. Sartoretto et al., Accelerated simultaneous iterations for large finite element eigenproblems, J. Comput. Phys. (1989)
  • K.J. Bathe et al., Solution methods for eigenvalue problems in structural dynamics, Int. J. Numer. Methods Eng. (1973)
  • M. Benzi et al., A sparse approximate inverse preconditioner for the conjugate gradient method, SIAM J. Sci. Comput. (1996)
  • L. Bergamaschi et al., Asymptotic convergence of conjugate gradient methods for the partial symmetric eigenproblem, Numer. Lin. Alg. Appl. (1997)
  • L. Bergamaschi et al., Approximate inverse preconditioning in the parallel solution of sparse eigenproblems, Numer. Lin. Alg. Appl. (2000)
  • L. Bergamaschi, M. Putti, Numerical comparison of iterative methods for the eigensolution of large sparse symmetric…
  • L. Bergamaschi, M. Putti, Efficient parallelization of preconditioned conjugate gradient schemes for matrices arising…
There are more references available in the full text version of this article.
