doi:10.1016/j.cpc.2004.10.005
Copyright © 2004 Elsevier B.V. All rights reserved.
Numerical methods for the QCD overlap operator: III. Nested iterations
N. Cundy (a), J. van den Eshof (b), A. Frommer (c), S. Krieg (a), Th. Lippert (d) and K. Schäfer (c)
(a) Department of Physics, University of Wuppertal, Germany
(b) Department of Mathematics, University of Düsseldorf, Germany
(c) Department of Mathematics, University of Wuppertal, Germany
(d) Central Institute for Applied Mathematics, Research Center Jülich, Germany
Received 14 June 2004;
revised 14 October 2004;
accepted 16 October 2004.
Available online 19 November 2004.
Abstract
The numerical and computational aspects of chiral fermions in lattice quantum chromodynamics are extremely demanding. In the overlap framework, the computation of the fermion propagator leads to a nested iteration in which the matrix-vector multiplications in each step of an outer iteration have to be carried out by an inner iteration; the latter approximates the product of the sign function of the hermitian Wilson fermion matrix with a vector.
In this paper we investigate aspects of this nested paradigm. We examine several Krylov subspace methods to be used as an outer iteration for both propagator computations and the Hybrid Monte-Carlo scheme. We establish criteria on the accuracy of the inner iteration that preserve an a priori prescribed precision for the overall computation. It turns out that the accuracy of the sign function can be relaxed as the outer iteration proceeds. Furthermore, we consider preconditioning strategies in which the preconditioner is built upon an inaccurate approximation to the sign function. As our numerical experiments illustrate, relaxation combined with preconditioning saves up to a factor of 4 in computational effort. We also discuss the possibility of projecting the squared overlap operator into one chiral sector.
Keywords: Lattice quantum chromodynamics; Overlap fermions; Matrix sign function; Inner–outer iterations; Relaxation; Flexible Krylov subspace methods
PACS: 12.38; 02.60; 11.15.H; 12.38.G; 11.30.R
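The nested scheme described in the abstract can be illustrated with a small toy problem. The sketch below is not the algorithm of Fig. 1 and does not involve the overlap operator: as a stand-in, the outer CG solves A x = b with A = M^{-1}, and each product A p is approximated by an inner CG on M, in the same way that the sign-function product is approximated by an inner iteration. The relaxation rule used here (inner tolerance eps * ||b|| / ||r_j||, capped at eta_max) is an assumption taken from the general inexact Krylov literature; the function names, the test matrix and all parameters are hypothetical.

```python
import numpy as np

def inner_cg(M, v, eta, max_iter=1000):
    """Approximate M^{-1} v by CG; stop when ||v - M y|| <= eta * ||v||."""
    y = np.zeros_like(v)
    r = v.copy()
    p = r.copy()
    rr = r @ r
    target = eta * np.linalg.norm(v)
    iters = 0
    while np.sqrt(rr) > target and iters < max_iter:
        Mp = M @ p
        alpha = rr / (p @ Mp)
        y += alpha * p
        r -= alpha * Mp
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
        iters += 1
    return y, iters

def nested_cg(M, b, eps=1e-8, relax=True, eta_max=0.1, max_outer=500):
    """Outer CG for M^{-1} x = b; every matrix-vector product is an inner CG."""
    x = np.zeros_like(b)
    r = b.copy()          # recursively updated (computed) residual
    p = r.copy()
    rr = r @ r
    b_norm = np.linalg.norm(b)
    inner_total = 0
    outer = 0
    while np.sqrt(rr) > eps * b_norm and outer < max_outer:
        if relax:
            # Assumed relaxation rule: loosen the inner tolerance as the
            # outer residual shrinks, but never beyond eta_max.
            eta = min(eta_max, eps * b_norm / np.sqrt(rr))
        else:
            eta = eps     # unrelaxed: full inner accuracy at every step
        s, k = inner_cg(M, p, eta)   # s approximates M^{-1} p
        inner_total += k
        alpha = rr / (p @ s)
        x += alpha * p
        r -= alpha * s
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
        outer += 1
    return x, outer, inner_total

# Hypothetical test problem: symmetric positive definite M with spectrum in [1, 10].
rng = np.random.default_rng(0)
n = 50
Qmat, _ = np.linalg.qr(rng.standard_normal((n, n)))
M = Qmat @ np.diag(np.linspace(1.0, 10.0, n)) @ Qmat.T
M = 0.5 * (M + M.T)   # remove rounding asymmetry
b = rng.standard_normal(n)

x_rel, out_rel, in_rel = nested_cg(M, b, relax=True)
x_fix, out_fix, in_fix = nested_cg(M, b, relax=False)
print("relaxed:   outer =", out_rel, " inner total =", in_rel)
print("unrelaxed: outer =", out_fix, " inner total =", in_fix)
```

Since A = M^{-1}, the exact solution is x = M b, which gives a direct check on the accuracy reached. The total number of inner iterations plays the role of the count of Wilson operator calls in the convergence plots; in this toy setting the relaxed run is expected to need fewer of them for a comparable final accuracy.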
Fig. 1. Generic relaxed CG.
Fig. 2. Spectrum of the Wilson fermion matrix M for our 4^4 configuration (left), spectrum of D_u for μ=0.3 (right).
Fig. 3. Relaxed and non-relaxed CG for (8). Left: μ=0.3. Right: μ=0.1.
Fig. 4. Overview of our recursive preconditioning computational scheme.
Fig. 6. Convergence history for unrelaxed CG, relaxed CG and relaxed GMRESR for one inversion of the squared equation (2). We plot the norm of the residual vs. the number of calls to the Wilson operator Q. The ticks indicate each (outer) iteration: on the 4^4 lattice relaxed GMRESR needs 5 iterations to converge, whereas unrelaxed and relaxed CG both require 11 iterations.
Fig. 7. Convergence history for unrelaxed SUMR, relaxed SUMR and relaxed GMRESR(SUMR), two pass strategy for (2).
Table 1.
Advised Krylov subspace method and corresponding strategy for tuning the precision of the matrix–vector products as a function of the properties of the matrix A

Table 2.
Times (in seconds) for one inversion on the five 4^4 configurations with β=5.4, run on 1 processor of ALiCE. The number in brackets is the gain over the unrelaxed, unpreconditioned (CG) inversion

Table 3.
Times (in seconds) for one inversion on the three 8^4 configurations with β=5.6, run on 16 processors of ALiCE

Table 4.
Times (in seconds) for one inversion on the quenched 16^4 configuration at β=6.0, run on 16 processors of ALiCE

Table 5.
Times (in seconds) for the inversion of chiral fermions on the 8^4, μ=0.3 ensemble, run on 16 processors of ALiCE

Table 6.
Times (in seconds) for two SUMR inversions on the five 4^4 configurations at β=5.4, run on one processor of ALiCE

Table 7.
Times (in seconds) for two SUMR inversions on the 8^4 configurations at β=5.6, run on 16 processors of ALiCE

Table 8.
Times (in seconds) for two SUMR inversions on the quenched 16^4 configuration, run on 16 processors of ALiCE

Table 9.
The times (in seconds) needed to calculate one relGMRESR(CG) inversion of the overlap operator, and to calculate np eigenvalues of the Wilson operator for different values of np, on the 8^4 configuration 1 with μ=0.1

Table 10.
The times (in seconds) needed to calculate one relGMRESR(CG) inversion of the overlap operator, and to calculate np eigenvalues of the Wilson operator for different values of np, on the 4^4 configuration 1 with μ=0.1


Corresponding author.