Application Experiences on a GPU-Accelerated Arm-based HPC Testbed

Authors:
Wael Elwasif, Oak Ridge National Laboratory, United States (ORCID: 0000-0003-0554-1036)
William Godoy, Oak Ridge National Laboratory, United States (ORCID: 0000-0002-2590-5178)
Nick Hagerty, Oak Ridge National Laboratory, United States (ORCID: 0000-0003-3001-4414)
J. Austin Harris, Oak Ridge National Laboratory, United States (ORCID: 0000-0003-3023-7140)
Oscar Hernandez, Oak Ridge National Laboratory, United States (ORCID: 0000-0002-5380-6951)
Balint Joo, Oak Ridge National Laboratory, United States (ORCID: 0000-0002-4229-7960)
Paul Kent, Oak Ridge National Laboratory, United States (ORCID: 0000-0001-5539-4017)
Damien Lebrun-Grandie, Oak Ridge National Laboratory, United States (ORCID: 0000-0003-1952-7219)
Elijah Maccarthy, Oak Ridge National Laboratory, United States (ORCID: 0000-0001-9940-1741)
Veronica Melesse Vergara, Oak Ridge National Laboratory, United States (ORCID: 0000-0002-4333-4145)
Bronson Messer, Oak Ridge National Laboratory, United States (ORCID: 0000-0002-5358-5415)
Ross Miller, Oak Ridge National Laboratory, United States (ORCID: 0000-0002-2179-495X)
Sarp Oral, Oak Ridge National Laboratory, United States (ORCID: 0000-0001-8745-7078)
Sergei Bastrakov, Helmholtz-Zentrum Dresden-Rossendorf, Germany (ORCID: 0000-0003-3396-6154)
Michael Bussmann, Helmholtz-Zentrum Dresden-Rossendorf, Germany (ORCID: 0000-0002-8258-3881)
Alexander Debus, Helmholtz-Zentrum Dresden-Rossendorf, Germany (ORCID: 0000-0002-3844-3697)
Klaus Steiniger, Helmholtz-Zentrum Dresden-Rossendorf, Germany (ORCID: 0000-0001-8965-1149)
Jan Stephan, Helmholtz-Zentrum Dresden-Rossendorf, Germany (ORCID: 0000-0001-7839-4386)
Rene Widera, Helmholtz-Zentrum Dresden-Rossendorf, Germany (ORCID: 0000-0003-1642-0459)
Spencer Bryngelson, Georgia Institute of Technology, United States (ORCID: 0000-0003-1750-7265)
Henry Le Berre, Georgia Institute of Technology, United States (ORCID: 0000-0002-4781-9502)
Anand Radhakrishnan, Georgia Institute of Technology, United States (ORCID: 0000-0001-5127-2741)
Jeffrey Young, Georgia Institute of Technology, United States (ORCID: 0000-0001-9841-4057)
Sunita Chandrasekaran, University of Delaware, United States (ORCID: 0000-0002-3560-9428)
Florina Ciorba, University of Basel, Switzerland (ORCID: 0000-0002-2773-4499)
Osman Simsek, University of Basel, Switzerland (ORCID: 0000-0002-2719-9174)
Kate Clark, NVIDIA Corporation, United States (ORCID: 0000-0001-5211-2002)
Filippo Spiga, NVIDIA Corporation, United States (ORCID: 0000-0003-1448-5304)
Jeff Hammond, NVIDIA Corporation, United States (ORCID: 0000-0003-3181-8190)
John Stone, NVIDIA Corporation, United States (ORCID: 0000-0001-7215-762X)
David Hardy, University of Illinois at Urbana-Champaign, United States (ORCID: 0000-0001-8533-1367)
Sebastian Keller, Swiss National Supercomputing Center, Switzerland (ORCID: 0000-0003-3540-1405)
Jean-Guillaume Piccinali, Swiss National Supercomputing Center, Switzerland (ORCID: 0000-0002-2549-9587)
Christian Trott, Sandia National Laboratories, United States (ORCID: 0000-0003-0661-5594)

HPCAsia '23 Workshops: Proceedings of the HPC Asia 2023 Workshops, February 2023, Pages 35–49. https://doi.org/10.1145/3581576.3581621

Published: 27 February 2023

ABSTRACT

This paper assesses and reports the experience of ten teams working to port, validate, and benchmark several High Performance Computing applications on a novel GPU-accelerated Arm testbed system. The testbed consists of eight NVIDIA Arm HPC Developer Kit systems, each equipped with a server-class Arm CPU from Ampere Computing and two data center GPUs from NVIDIA Corp. The systems are connected by an InfiniBand interconnect. The selected applications and mini-apps are written in several programming languages and use multiple accelerator-based programming models for GPUs, such as CUDA, OpenACC, and OpenMP offloading. Application porting requires a robust and easy-to-access programming environment, including a variety of compilers and optimized scientific libraries. The goal of this work is to evaluate platform readiness and assess the effort required from developers to deploy well-established scientific workloads on current and future generation Arm-based GPU-accelerated HPC systems. The reported case studies demonstrate that the current level of maturity and diversity of software and tools is already adequate for large-scale production deployments.

Published in

HPCAsia '23 Workshops: Proceedings of the HPC Asia 2023 Workshops
February 2023
101 pages
ISBN: 9781450399890
DOI: 10.1145/3581576

Copyright © 2023 ACM
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
HPCAsia '23 Workshops Paper Acceptance Rate: 9 of 10 submissions, 90%. Overall Acceptance Rate: 69 of 143 submissions, 48%.