Computer Physics Communications
Volume 177, Issue 3, 1 August 2007, Pages 298-306
 
doi:10.1016/j.cpc.2007.03.004
Copyright © 2007 Elsevier B.V. All rights reserved.

Quantum Monte Carlo on graphical processing units

Amos G. Anderson (a, corresponding author), William A. Goddard III (a) and Peter Schröder (b)

(a) Materials and Process Simulation Center, Division of Chemistry and Chemical Engineering, California Institute of Technology (MC 139-74), Pasadena, CA 91125, USA
(b) Department of Computer Science, California Institute of Technology, Pasadena, CA 91125, USA

Received 29 November 2006; 
revised 10 March 2007; 
accepted 14 March 2007. 
Available online 30 March 2007.


Abstract

Quantum Monte Carlo (QMC) is among the most accurate methods for solving the time-independent Schrödinger equation. Unfortunately, the method is very expensive and requires a vast array of computing resources to obtain results at a reasonable level of convergence. On the other hand, the method is not only easily parallelizable across CPU clusters but, as we report here, also has a high degree of data parallelism. This facilitates the use of recent technological advances in Graphical Processing Units (GPUs), a powerful type of processor well known to computer gamers. In this paper we report on an end-to-end QMC application with core elements of the algorithm running on a GPU. With individual kernels achieving as much as a 30× speedup, the overall application runs up to 6× faster than an optimized CPU implementation, yet requires only a modest increase in hardware cost. This demonstrates the speedups possible for QMC on advanced hardware, and thus explores a path toward providing QMC-level accuracy as a more standard tool. The major current challenge in running codes of this type on the GPU arises from the lack of fully compliant IEEE floating point implementations. To achieve better accuracy we propose the use of the Kahan summation formula in matrix multiplications. While this reduces overall performance, we demonstrate that the proposed new algorithm can match CPU single precision.
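The abstract does not show the authors' GPU kernel, so the following is only a minimal CPU-side sketch in C of the general idea of applying the Kahan summation formula to the inner product of a matrix multiplication. The function names `matmul_naive` and `matmul_kahan`, the row-major layout, and the argument order are assumptions made for illustration, not taken from the paper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: plain single-precision C = A * B.
 * A is m x k, B is k x n, C is m x n, all row-major. */
void matmul_naive(const float *A, const float *B, float *C,
                  size_t m, size_t k, size_t n)
{
    assert(A && B && C);
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (size_t p = 0; p < k; ++p)
                sum += A[i * k + p] * B[p * n + j];
            C[i * n + j] = sum;
        }
    }
}

/* Hypothetical helper: same product, but each inner product is
 * accumulated with Kahan (compensated) summation. */
void matmul_kahan(const float *A, const float *B, float *C,
                  size_t m, size_t k, size_t n)
{
    assert(A && B && C);
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float sum  = 0.0f;  /* running total */
            float comp = 0.0f;  /* low-order bits lost so far */
            for (size_t p = 0; p < k; ++p) {
                float term = A[i * k + p] * B[p * n + j];
                float y = term - comp;   /* re-inject the lost bits */
                float t = sum + y;       /* low bits of y may be lost here */
                comp = (t - sum) - y;    /* recover what was just lost */
                sum = t;
            }
            C[i * n + j] = sum;
        }
    }
}
```

As a usage sketch, consider a 1 × 1025 row holding one 1.0 followed by 1024 copies of 2^-25, multiplied by a column of ones. In IEEE single precision with round-to-nearest, the naive loop leaves the total at 1.0 because each small term falls below half a unit in the last place, while the compensated loop recovers a value close to 1 + 2^-15. The compensation depends on the exact order of operations, so it should be compiled without options that permit reassociation of floating point expressions (for example `-ffast-math` in GCC or Clang).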

Keywords: Graphical processing units; Quantum Monte Carlo; Matrix multiplication; Floating point error; Kahan summation formula; De-normals

PACS classification codes: 07.05.Bx; 02.70.Ss; 02.60.Dc; 89.20.Ff

Article Outline

1. Introduction
2. Intro to Graphical Processing Units
3. Intro to Quantum Monte Carlo
4. Implementation on the GPU
4.1. Walker batch scheme
4.2. Basis function evaluation
4.2.1. Kernel 1: Data generation
4.2.2. Kernel 2: Layout conversion
4.3. Matrix multiplication
4.4. Jastrow functions
5. GPU floating point error
5.1. Underflow corrections
5.2. Kahan method
6. Results
7. Conclusion
Acknowledgements
References











 