doi:10.1016/j.cpc.2007.07.012
Copyright © 2007 Elsevier B.V. All rights reserved.
An iterative method to compute the sign function of a non-Hermitian matrix and its application to the overlap Dirac operator at nonzero chemical potential
J. Blocha,
,
, A. Frommerb, B. Langb and T. Wettiga
aInstitute for Theoretical Physics, University of Regensburg, 93040 Regensburg, Germany
bDepartment of Mathematics, University of Wuppertal, 42097 Wuppertal, Germany
Received 27 April 2007;
revised 18 June 2007;
accepted 23 July 2007.
Available online 10 August 2007.
References and further reading may be available for this article. To view references and further reading you must
purchase this article.
Abstract
The overlap Dirac operator in lattice QCD requires the computation of the sign function of a matrix. While this matrix is usually Hermitian, it becomes non-Hermitian in the presence of a quark chemical potential. We show how the action of the sign function of a non-Hermitian matrix on an arbitrary vector can be computed efficiently on large lattices by an iterative method. A Krylov subspace approximation based on the Arnoldi algorithm is described for the evaluation of a generic matrix function. The efficiency of the method is spoiled when the matrix has eigenvalues close to a function discontinuity. This is cured by adding a small number of critical eigenvectors to the Krylov subspace, for which we propose two different deflation schemes. The ensuing modified Arnoldi method is then applied to the sign function, which has a discontinuity along the imaginary axis. The numerical results clearly show the improved efficiency of the method. Our modification is particularly effective when the action of the sign function of the same matrix has to be computed many times on different vectors, e.g., if the overlap Dirac operator is inverted using an iterative method.
Keywords: Overlap Dirac operator; Quark chemical potential; Sign function; Non-Hermitian matrix; Iterative methods
PACS classification codes: 02.60.Dc; 11.15.Ha; 12.38.Gc
Fig. 1. Spectrum of Hw(μ) in Eq. (1) for a 44 lattice (top pane) and a 64 lattice (bottom pane), with μ=0.3 and mw=−2. Note the difference in scale between real and imaginary axes. The gauge fields were generated using the Wilson plaquette action with gauge coupling βg=5.1 [7].
Fig. 2. Spectrum of
for a 44 lattice (top pane) and a 64 lattice (bottom pane), with μ=0.3 and βg=5.1.
Fig. 3. Accuracy of the modified Arnoldi method for y=sgn(A)x. The non-Hermitian matrix is A=γ5Dw(μ), where Dw(μ) is the Wilson–Dirac operator (2) at chemical potential μ=0.3, and x=(1,1,…,1). Top pane: 44 lattice with dim(A)=3072, bottom pane: 64 lattice with dim(A)=15552. The relative error
is shown as a function of the Krylov subspace size k for various numbers of deflated eigenvalues m using the LR-deflation. In order to compute the error
, the exact solution y was first determined using the spectral definition (4) and the full spectral decomposition computed with LAPACK.
Fig. 4. Comparison of the accuracies achieved with both deflation schemes and the exact projection of y on the composite space
. The relative error
is shown as a function of the Krylov subspace size k. For both lattice sizes the accuracy of the LR-deflation is slightly better than that of the Schur deflation. Furthermore, the accuracy of the modified Arnoldi approximation is very close to the best possible approximation in the composite subspace.
Table 1.
Ratio of largest deflated eigenvalue over largest eigenvalue for various numbers of deflated eigenvalues m for a 44 and a 64 lattice

Table 2.
CPU time (in seconds) for varying Krylov subspace size k. Top row: 44 lattice with m=32, bottom row: 64 lattice with m=128. Left panes: Schur deflation, right panes: LR-deflation. The time required by the initial calculation of the critical eigenvectors is given in the header of each block. The time needed to construct the Arnoldi basis in the Krylov subspace (column 2) is approximately proportional to Nk(k+2m) for the Schur deflation and Nk2 for the LR-deflation. The time used by Roberts' method (34) to compute the sign function of the Hessenberg matrix (column 3) is O(k3). The total time (column 4) also includes the evaluation of Eq. (33) for the Schur deflation and Eq. (41) for the LR-deflation. These timings were measured on an Intel Core 2 Duo 2.33 GHz computer using optimized ATLAS BLAS routines [34]
