Benefits from using mixed precision computations in the ELPA-AEO and ESSEX-II eigensolver projects

Alvermann, Andreas; Basermann, Achim; Bungartz, Hans-Joachim; Carbogno, Christian; Ernst, Dominik; Fehske, Holger; Futamura, Yasunori; Galgon, Martin; Hager, Georg; Huber, Sarah; Huckle, Thomas; Ida, Akihiro; Imakura, Akira; Kawai, Masatoshi; Köcher, Simone; Kreutzer, Moritz; Kus, Pavel; Lang, Bruno; Lederer, Hermann; Manin, Valeriy; Marek, Andreas; Nakajima, Kengo; Nemec, Lydia; Reuter, Karsten; Rippl, Michael; Röhrig-Zöllner, Melven; Sakurai, Tetsuya; Scheffler, Matthias; Scheurer, Christoph; Shahzad, Faisal; Simoes Brambila, Danilo; Thies, Jonas; Wellein, Gerhard

doi:10.1007/s13160-019-00360-8

Special Feature: Original Paper
International Workshop on Eigenvalue Problems: Algorithms; Software and Applications, in Petascale Computing (EPASA2018)
Published: 27 April 2019

Volume 36, pages 699–717 (2019)
Japan Journal of Industrial and Applied Mathematics

Abstract

We first briefly report on the status and recent achievements of the ELPA-AEO (Eigenvalue soLvers for Petaflop Applications—Algorithmic Extensions and Optimizations) and ESSEX-II (Equipping Sparse Solvers for Exascale) projects. In both collaborative efforts, scientists from the application areas, mathematicians, and computer scientists work together to develop and make available efficient, highly parallel methods for the solution of eigenvalue problems. Then we focus on a topic addressed in both projects, the use of mixed precision computations to enhance efficiency. We give a more detailed description of our approaches for benefiting from either lower or higher precision in three selected contexts and of the results thus obtained.

Acknowledgements

The authors thank the anonymous referees for their valuable comments, which helped to improve and clarify the presentation.

Author information

Authors and Affiliations

Institute of Physics, University of Greifswald, Greifswald, Germany
Andreas Alvermann & Holger Fehske
German Aerospace Center (DLR), Cologne, Germany
Achim Basermann, Melven Röhrig-Zöllner & Jonas Thies
Department of Informatics, Technical University of Munich, Munich, Germany
Hans-Joachim Bungartz, Thomas Huckle & Michael Rippl
Fritz Haber Institute of the Max Planck Society, Berlin, Germany
Christian Carbogno, Matthias Scheffler & Danilo Simoes Brambila
High Performance Computing, University of Erlangen-Nuremberg, Erlangen, Germany
Dominik Ernst, Georg Hager, Moritz Kreutzer, Faisal Shahzad & Gerhard Wellein
Applied Mathematics, University of Tsukuba, Tsukuba, Japan
Yasunori Futamura, Akira Imakura & Tetsuya Sakurai
Mathematics and Natural Sciences, University of Wuppertal, Wuppertal, Germany
Martin Galgon, Sarah Huber, Bruno Lang & Valeriy Manin
Computer Science, The University of Tokyo, Tokyo, Japan
Akihiro Ida, Masatoshi Kawai & Kengo Nakajima
Department of Theoretical Chemistry, Technical University of Munich, Munich, Germany
Simone Köcher, Lydia Nemec, Karsten Reuter & Christoph Scheurer
Max Planck Computing and Data Facility, Garching, Germany
Pavel Kus, Hermann Lederer & Andreas Marek

Corresponding author

Correspondence to Bruno Lang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work has been supported by the Deutsche Forschungsgemeinschaft through the priority programme 1648 “Software for Exascale Computing” (SPPEXA) under the project ESSEX-II and by the Federal Ministry of Education and Research through the project “Eigenvalue soLvers for Petaflop Applications—Algorithmic Extensions and Optimizations” (ELPA-AEO) under Grant No. 01H15001.

About this article

Cite this article

Alvermann, A., Basermann, A., Bungartz, HJ. et al. Benefits from using mixed precision computations in the ELPA-AEO and ESSEX-II eigensolver projects. Japan J. Indust. Appl. Math. 36, 699–717 (2019). https://doi.org/10.1007/s13160-019-00360-8

Received: 30 May 2018
Revised: 12 January 2019
Published: 27 April 2019
Issue Date: 01 July 2019
DOI: https://doi.org/10.1007/s13160-019-00360-8
