doi:10.1016/j.cpc.2005.05.005
Copyright © 2005 Elsevier B.V. All rights reserved.
Computing charge densities with partially reorthogonalized Lanczos
a Department of Computer Science & Engineering, University of Minnesota, Twin Cities, Minneapolis, MN, USA
b Department of Chemical Engineering and Materials Science, Institute for the Theory of Advanced Materials in Information Technology, Digital Technology Center, University of Minnesota, Minneapolis, MN 55455, USA
c ICES, 1 University Station, University of Texas at Austin, Austin, TX 78712, USA
Received 4 April 2005; revised 27 April 2005; accepted 9 May 2005. Available online 13 June 2005.
Abstract
This paper considers the problem of computing charge densities in a density functional theory (DFT) framework. In contrast to traditional diagonalization-based methods, we use a technique that exploits a Lanczos basis without explicit reference to individual eigenvectors. The key ingredient of this new approach is a partial reorthogonalization strategy whose goal is to maintain a good level of orthogonality among the basis vectors. Experiments reveal that the method can be a few times faster than ARPACK, an implementation of the implicitly restarted Lanczos method. This speedup is achieved by exploiting more memory and BLAS3 (dense) computations while avoiding the frequent eigenvector updates inherent to all restarted Lanczos methods.
Keywords: Density functional theory; Lanczos; Partial reorthogonalization; Charge densities
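As a rough illustration of the idea in the abstract, the following Python sketch builds a Lanczos basis Q and tridiagonal matrix T for a symmetric Hamiltonian H, then approximates the charge density as the squared row norms of Q V, where V holds the eigenvectors of T for the lowest n_occ Ritz values. Several things here are assumptions rather than the authors' method: the function names, the random starting vector, the reading of the density as diag(Q V V^T Q^T), and above all the reorthogonalization trigger. The loss of orthogonality is measured explicitly at every step, which is simple but costs about as much as reorthogonalizing; a practical partial reorthogonalization scheme would estimate it cheaply instead.

```python
import numpy as np

def lanczos_selective_reorth(H, m, seed=0):
    """Run m Lanczos steps on symmetric H.

    Returns (Q, alpha, beta, n_reorth). Reorthogonalization is applied only
    when the measured overlap with earlier basis vectors exceeds sqrt(eps).
    ASSUMPTION: the overlap is computed explicitly for clarity; this stands in
    for the cheap estimates a real partial reorthogonalization would use.
    """
    n = H.shape[0]
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(n)
    Q = np.zeros((n, m))
    Q[:, 0] = q / np.linalg.norm(q)
    alpha = np.zeros(m)
    beta = np.zeros(m - 1)
    thresh = np.sqrt(np.finfo(float).eps)
    n_reorth = 0
    for j in range(m):
        w = H @ Q[:, j]
        if j > 0:
            w -= beta[j - 1] * Q[:, j - 1]
        alpha[j] = Q[:, j] @ w
        w -= alpha[j] * Q[:, j]
        if j == m - 1:
            break  # no further basis vector is needed
        # Measure loss of orthogonality against all previous basis vectors.
        overlap = np.abs(Q[:, :j + 1].T @ w).max() / np.linalg.norm(w)
        if overlap > thresh:
            w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)  # one Gram-Schmidt pass
            n_reorth += 1
        beta[j] = np.linalg.norm(w)
        Q[:, j + 1] = w / beta[j]
    return Q, alpha, beta, n_reorth

def charge_density(H, n_occ, m, seed=0):
    """Approximate diag of the projector onto the n_occ lowest eigenstates."""
    Q, alpha, beta, _ = lanczos_selective_reorth(H, m, seed)
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    _, V = np.linalg.eigh(T)           # Ritz values in ascending order
    Y = Q @ V[:, :n_occ]               # dense matrix-matrix product
    return np.sum(Y * Y, axis=1)       # squared row norms of Q V
```

No individual eigenvector of H is stored or refined; only the basis Q and the small eigenproblem for T are needed, at the price of keeping all m basis vectors in memory.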
Fig. 1. The Heaviside function.
Fig. 2. The Lanczos algorithm. The inner product for vectors is denoted by ⟨·,·⟩.
Fig. 3. The partially reorthogonalized Lanczos algorithm. The inner product for vectors is denoted by ⟨·,·⟩.
Fig. 4. Levels of orthogonality of the Lanczos basis for the Hamiltonian (n=17 077) corresponding to
. Left: Lanczos without reorthogonalization. Right: Lanczos with partial reorthogonalization. The number of reorthogonalizations was 34 with an additional 3400 inner vector products.
Fig. 5. Algorithm for approximating charge densities by means of Partial Lanczos.
Fig. 6. BLAS 2 (left) and BLAS 3 (right) implementations for computing the rows of the matrix Q_l V_{n_0}.
Fig. 7. Left: Structure of Hamiltonian (n=17 077, nnz=875923) for
. Right: Structure of Hamiltonian (n=97 569, nnz=5156379) for
.
Fig. 8. Left: Structure of Hamiltonian (n=94 341, nnz=5963003) for the
cluster. Right: Structure of Hamiltonian (n=94 341, nnz=6332795) for the
cluster.
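The contrast drawn in the caption of Fig. 6 between BLAS 2 and BLAS 3 evaluation of the rows of Q_l V_{n_0} can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the function names and the block size are hypothetical, Q and V stand for the Lanczos basis and the selected eigenvectors of the tridiagonal matrix, and the squared row norms are accumulated on the assumption that they are the quantity of interest.

```python
import numpy as np

def density_rowwise(Q, V):
    """BLAS 2 style: one vector-matrix product per row of Q."""
    rho = np.empty(Q.shape[0])
    for i in range(Q.shape[0]):
        y = Q[i, :] @ V               # row i of Q V
        rho[i] = y @ y                # squared norm of that row
    return rho

def density_blocked(Q, V, block=256):
    """BLAS 3 style: one matrix-matrix product per block of rows.

    ASSUMPTION: block=256 is an arbitrary illustrative size.
    """
    rho = np.empty(Q.shape[0])
    for start in range(0, Q.shape[0], block):
        Y = Q[start:start + block, :] @ V          # block of rows of Q V
        rho[start:start + block] = np.einsum("ij,ij->i", Y, Y)
    return rho
```

Both variants return the same values; the blocked form trades a small temporary of block rows for dense matrix-matrix products, which typically run much closer to peak speed than repeated matrix-vector products.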
Table 1.
Costs for Partial Lanczos and ARPACK for 

Table 2.
Costs for Partial Lanczos and ARPACK for 

Table 3.
Costs for Partial Lanczos and ARPACK for 

Table 4.
Costs for Partial Lanczos and ARPACK for 

This work was supported by NSF under grant DMR-0325218, by DOE under grants DE-FG02-03ER25585 and DE-FG02-03ER15491, and by the Minnesota Supercomputing Institute.