Parallel preconditioning of a sparse eigensolver
Introduction
An important task in many scientific applications is the computation of a small number of the leftmost eigenpairs (the smallest eigenvalues and corresponding eigenvectors) of the problem $Ax = \lambda Bx$, where A and B are large, sparse, symmetric positive definite matrices. Several techniques for solving this problem have been proposed: subspace iteration [1], [15], the Lanczos method [7], [11], [14], and, more recently, the restarted Arnoldi–Lanczos algorithm [12], the Jacobi–Davidson method [17], and optimization methods based on conjugate gradient (CG) schemes [3], [9], [16].
In this paper we analyze the performance of two preconditioning techniques when applied to an optimization method called deflation-accelerated conjugate gradient (DACG) [8]. DACG sequentially computes a number of eigenpairs by CG minimizations of the Rayleigh quotient over subspaces of decreasing size. We found [4] that, when effectively preconditioned, DACG compares well in efficiency with established packages such as ARPACK [13]. In a recent work [5], the performance of DACG was also compared numerically with that of the Jacobi–Davidson method, showing that their efficiency is comparable when a small number of eigenpairs is to be computed.
We exploit three preconditioners, Block–Jacobi, FSAI [10] and AINV [2], the latter two falling into the class of approximate inverse preconditioners. FSAI and AINV explicitly compute an approximation M to $A^{-1}$, based on the sparse factorization of $A^{-1}$. Preconditioning by a product of triangular factors performs better than other techniques, mainly because the fill-in of the preconditioner is reduced. Unlike many other approximate inverse techniques, AINV and FSAI preserve the positive definiteness of the problem, which is essential in our application. The FSAI algorithm requires an a priori sparsity pattern of the approximate factor, which is not easy to provide in unstructured sparse problems. We generated the FSAI preconditioner using the same pattern as the matrix A. AINV, on the other hand, is based upon a drop tolerance, ε, which in principle is more convenient for our unstructured problems. The influence of the drop tolerance has been tested. Deflation is accomplished via B-orthogonalization of the search directions. We analyzed both classical (CGS) and modified (MGS) Gram–Schmidt, and tested the accuracy and efficiency of both strategies.
We have applied the parallel Block–Jacobi–DACG, AINV–DACG and FSAI–DACG algorithms to the solution of finite element, mixed finite element, and finite difference eigenproblems, in both two and three dimensions. A parallel implementation of the DACG algorithm has been coded via a data-parallel approach, allowing preconditioning by any given approximate inverse. Ad hoc data-distribution techniques reduce the amount of communication among the processors, which could otherwise spoil the parallel performance of the code. An efficient routine for performing matrix-vector products was designed and implemented. Numerical tests on a Cray T3E supercomputer show the high degree of parallelism attainable by the code, and its good scalability.
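The difference between the two Gram–Schmidt variants mentioned above can be illustrated with a minimal pure-Python sketch. It is not the authors' parallel code: the helper names (`dot`, `matvec`, `b_normalize`, `b_orthogonalize_cgs`, `b_orthogonalize_mgs`), the dense list-of-lists storage, and the small test matrix are assumptions made for illustration only. The vectors in `U` are assumed to be B-orthonormal.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matvec(M, x):
    # Dense matrix-vector product; M is a list of rows.
    return [dot(row, x) for row in M]

def b_normalize(x, B):
    # Scale x to unit B-norm, i.e. x^T B x = 1.
    norm = math.sqrt(dot(x, matvec(B, x)))
    return [xi / norm for xi in x]

def b_orthogonalize_cgs(x, U, B):
    # Classical Gram-Schmidt: every coefficient u^T B x is computed
    # from the ORIGINAL x, then all projections are subtracted.
    Bx = matvec(B, x)
    coeffs = [dot(u, Bx) for u in U]
    z = list(x)
    for c, u in zip(coeffs, U):
        z = [zi - c * ui for zi, ui in zip(z, u)]
    return z

def b_orthogonalize_mgs(x, U, B):
    # Modified Gram-Schmidt: each coefficient is computed from the
    # vector already updated by the previous projections.
    z = list(x)
    for u in U:
        c = dot(u, matvec(B, z))
        z = [zi - c * ui for zi, ui in zip(z, u)]
    return z

# Hypothetical small SPD matrix B and two B-orthonormal vectors.
B = [[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 1.0]]
u1 = b_normalize([1.0, 0.0, 0.0], B)
u2 = b_normalize(b_orthogonalize_mgs([0.0, 1.0, 0.0], [u1], B), B)
U = [u1, u2]

x = [1.0, 2.0, 3.0]
z_cgs = b_orthogonalize_cgs(x, U, B)
z_mgs = b_orthogonalize_mgs(x, U, B)
```

In exact arithmetic the two results coincide. CGS needs only one product with B and its coefficients can be computed together, which is attractive in parallel, while MGS recomputes the product after each projection and is generally regarded as the more robust variant in floating point.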
AINV and FSAI preconditioners
Let A be a symmetric positive definite N×N matrix.
The approximate inverse preconditioner AINV, which was developed in [2] for linear systems, relies upon the following ideas. One can evaluate $A^{-1}$ by a biconjugation process applied to an arbitrary set of linearly independent vectors. A convenient choice is the canonical basis $e_1, \ldots, e_N$. This process produces a unit upper triangular matrix $Z$ and a diagonal matrix $D$ such that $A^{-1} = Z D^{-1} Z^T$. Actually, even with sparse A, the factor $Z$ is usually
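The biconjugation idea can be sketched in a few lines of pure Python. This is a dense, sequential illustration under stated assumptions, not the implementation of [2] or of the paper: the function names `ainv_factor` and `apply_ainv`, the symbols `Z` and `d`, the column-wise list storage, the dropping rule (off-diagonal entries below `drop_tol` in magnitude are zeroed after each update) and the test matrix are all hypothetical choices. The sketch A-orthogonalizes the canonical basis so that, with no dropping, Z^T A Z = D.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matvec(M, x):
    return [dot(row, x) for row in M]

def ainv_factor(A, drop_tol=0.0):
    # Returns (Z, d): Z[c] is column c of a unit upper triangular
    # matrix, d holds the diagonal of D. With drop_tol = 0 the
    # factorization is exact: A^{-1} = Z D^{-1} Z^T.
    n = len(A)
    Z = [[1.0 if r == c else 0.0 for r in range(n)] for c in range(n)]
    d = [0.0] * n
    for i in range(n):
        # p[j - i] = (row i of A) . z_j for j = i..n-1; p[0] is the pivot.
        p = [dot(A[i], Z[j]) for j in range(i, n)]
        d[i] = p[0]
        for j in range(i + 1, n):
            factor = p[j - i] / d[i]
            col = [zj - factor * zi for zj, zi in zip(Z[j], Z[i])]
            # Drop small off-diagonal entries to keep Z sparse.
            Z[j] = [v if (r == j or abs(v) >= drop_tol) else 0.0
                    for r, v in enumerate(col)]
    return Z, d

def apply_ainv(Z, d, v):
    # Computes M v with M = Z D^{-1} Z^T.
    n = len(d)
    y = [dot(Z[c], v) / d[c] for c in range(n)]
    out = [0.0] * n
    for c in range(n):
        for r in range(n):
            out[r] += y[c] * Z[c][r]
    return out

# Hypothetical small SPD test matrix.
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
Z, d = ainv_factor(A)                          # exact factors
Z_drop, d_drop = ainv_factor(A, drop_tol=0.2)  # sparsified factors

x_true = [1.0, -2.0, 0.5]
x_rec = apply_ainv(Z, d, matvec(A, x_true))    # recovers x_true
```

Applying the preconditioner only requires products with Z and its transpose, which is what makes this class of preconditioners attractive on parallel machines. A larger `drop_tol` gives a sparser but less accurate approximation.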
Parallel DACG algorithm
Our DACG algorithm sequentially computes the eigenpairs, starting from the leftmost one $(\lambda_1, u_1)$. To evaluate the jth eigenpair, j>1, DACG minimizes the Rayleigh quotient in a subspace orthogonal to the j−1 eigenvectors previously computed. More precisely, DACG minimizes:

$$q(z) = \frac{z^T A z}{z^T B z}, \qquad (1)$$

where

$$z = x - U_j \left(U_j^T B x\right), \qquad U_j = [u_1, \ldots, u_{j-1}], \qquad x \in \mathbb{R}^N.$$

The first eigenpair is obtained by minimization of (1) with $U_1 = \emptyset$. Let M be a preconditioning matrix. The s leftmost eigenpairs are computed by the following conjugate
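As a small worked illustration of the deflated Rayleigh quotient, the sketch below evaluates the quotient z^T A z / z^T B z on a vector that has first been made B-orthogonal to an already computed eigenvector. It assumes the deflated vector takes the standard form z = x − U(U^T B x) with B-orthonormal columns in U. The function names and the diagonal test problem are hypothetical, and the CG minimization itself is not shown.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matvec(M, x):
    return [dot(row, x) for row in M]

def deflate(x, U, B):
    # z = x - U (U^T B x); U is a list of B-orthonormal vectors.
    Bx = matvec(B, x)
    z = list(x)
    for u in U:
        c = dot(u, Bx)
        z = [zi - c * ui for zi, ui in zip(z, u)]
    return z

def rayleigh_quotient(z, A, B):
    return dot(z, matvec(A, z)) / dot(z, matvec(B, z))

# Hypothetical test problem: A = diag(1, 2, 3), B = I, so the
# eigenvalues are 1, 2, 3 and the eigenvectors are e_1, e_2, e_3.
A = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
B = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

U = [[1.0, 0.0, 0.0]]          # first eigenvector, already computed
x = [1.0, 1.0, 0.0]

q_plain = rayleigh_quotient(x, A, B)   # 1.5: still sees eigenvalue 1
z = deflate(x, U, B)                   # [0.0, 1.0, 0.0]
q_deflated = rayleigh_quotient(z, A, B)  # 2.0: the second eigenvalue
```

Once the first eigenvector has been projected out, the quotient can no longer fall below the second eigenvalue, which is why minimizing over the deflated subspace yields the next eigenpair.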
Numerical tests
We now report numerical results obtained by applying the DACG procedure to a number of finite element, mixed finite element, and finite difference problems. The computations were performed on the T3E 1200 machine of the CINECA computing center, located in Bologna, Italy. The machine is a stand-alone system composed of a set of DEC-Alpha 21164 processing elements (PEs), performing at a peak rate of 1200 Mflop/s. The PEs are interconnected by a 3D toroidal network having a 480 MByte/s payload
Conclusions
The following points are worth emphasizing.
- Choosing the nonzero pattern of A when computing FSAI, and setting ε=0.05 when evaluating the AINV factor, yielded equally satisfactory preconditioners for our DACG procedure.
- AINV–DACG and FSAI–DACG displayed comparable speedups. For p=32 processors, FSAI–DACG usually showed a slightly better parallelization level, in some cases better even than Jacobi–DACG, which nonetheless confirmed its good parallel performance.
- The AINV and FSAI techniques
Acknowledgements
This work has been supported in part by the Italian MURST Project “Analisi Numerica: Metodi e Software Matematico”, and CNR contract 98.01022.CT01. Free accounting units on the T3E Supercomputer were provided by CINECA within a framework research grant. We thank Rich Lehoucq for providing useful suggestions.
References (17)
- et al., An orthogonal accelerated deflation technique for large symmetric eigenproblems, Comp. Methods App. Mech. Eng. (1992)
- et al., Accelerated simultaneous iterations for large finite element eigenproblems, J. Comp. Phys. (1989)
- et al., Solution methods for eigenvalue problems in structural dynamics, Int. J. Numer. Methods Eng. (1973)
- et al., A sparse approximate inverse preconditioner for the conjugate gradient method, SIAM J. Sci. Comput. (1996)
- et al., Asymptotic convergence of conjugate gradient methods for the partial symmetric eigenproblem, Numer. Lin. Alg. Appl. (1997)
- et al., Approximate inverse preconditioning in the parallel solution of sparse eigenproblems, Numer. Lin. Alg. Appl. (2000)
- L. Bergamaschi, M. Putti, Numerical comparison of iterative methods for the eigensolution of large sparse symmetric...
- L. Bergamaschi, M. Putti, Efficient parallelization of preconditioned conjugate gradient schemes for matrices arising...