DOI: 10.1145/2462902.2462931 · HPDC Conference Proceedings
poster

A framework for auto-tuning HDF5 applications

Published: 25 October 2018

ABSTRACT

The modern parallel I/O stack consists of several software layers with complex inter-dependencies and performance characteristics. While each layer exposes tunable parameters, it is often unclear to users how different parameter settings interact with each other and affect overall I/O performance. As a result, users often fall back on default system settings, which typically yield poor I/O bandwidth. In this research, we develop a benchmark-guided auto-tuning framework for tuning the HDF5, MPI-IO, and Lustre layers on production supercomputing facilities. Our framework consists of three main components: H5Tuner uses a control file to adjust I/O parameters without modifying or recompiling the application; H5PerfCapture records performance metrics for HDF5 and MPI-IO; and H5Evolve uses a genetic algorithm to explore the parameter space and determine well-performing configurations. We demonstrate I/O performance results for three HDF5 application-based benchmarks on a Sun HPC system. Running on 512 MPI processes, all three benchmarks perform 3X to 5.5X faster with the auto-tuned I/O parameters than with the default system parameters.
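
To make the H5Evolve idea concrete, the sketch below shows a minimal genetic-algorithm search over a hypothetical HDF5/MPI-IO/Lustre parameter space. It is not the paper's H5Evolve implementation: the tunables in SEARCH_SPACE, their value ranges, and the measure_bandwidth() scoring function are illustrative placeholders. In a real run, each candidate configuration would be written out as an H5Tuner-style control file, the instrumented benchmark would be executed, and the aggregate bandwidth recorded by the tracing layer would serve as the fitness value.

    # Illustrative sketch of a genetic-algorithm search over I/O tunables.
    # All parameter names, ranges, and the scoring function are hypothetical.
    import random

    # Candidate values for each tunable (hypothetical ranges).
    SEARCH_SPACE = {
        "lustre_stripe_count":   [4, 8, 16, 32, 64, 128],
        "lustre_stripe_size_mb": [1, 2, 4, 8, 16, 32],
        "mpiio_cb_nodes":        [4, 8, 16, 32, 64],
        "hdf5_alignment_kb":     [64, 256, 1024, 4096],
    }

    def random_config():
        # One individual: a random choice for every tunable.
        return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

    def measure_bandwidth(config):
        # Placeholder fitness function. A real framework would run the
        # benchmark with this configuration and return the measured
        # bandwidth; here a toy synthetic score keeps the sketch runnable.
        return (config["lustre_stripe_count"] * config["lustre_stripe_size_mb"]
                / (1 + abs(config["mpiio_cb_nodes"] - 32)))

    def crossover(a, b):
        # Child inherits each tunable from one of the two parents.
        return {k: random.choice((a[k], b[k])) for k in SEARCH_SPACE}

    def mutate(config, rate=0.2):
        # Randomly re-draw some tunables to keep exploring the space.
        return {k: (random.choice(SEARCH_SPACE[k]) if random.random() < rate else v)
                for k, v in config.items()}

    def evolve(generations=10, pop_size=16, elite=4):
        population = [random_config() for _ in range(pop_size)]
        for _ in range(generations):
            # Rank configurations by measured (here: synthetic) bandwidth.
            scored = sorted(population, key=measure_bandwidth, reverse=True)
            parents = scored[:elite]
            children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                        for _ in range(pop_size - elite)]
            population = parents + children
        return max(population, key=measure_bandwidth)

    if __name__ == "__main__":
        print("best configuration found:", evolve())

The design point this illustrates is that the fitness evaluation is simply a benchmark run, so the same loop applies regardless of which layer a tunable belongs to.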
