DOI: 10.1145/3079079.3079101
research-article

libPRISM: an intelligent adaptation of prefetch and SMT levels

Published: 14 June 2017

ABSTRACT

Current microprocessors expose several knobs that modify hardware behavior to improve performance under different workload demands. Finding the optimal knob configuration requires evaluating the design space through offline profiling, which is impractical and time-consuming. To avoid this profiling, different knobs are typically configured in a decoupled manner. This often leads to underperforming configurations and sometimes to conflicting decisions that jeopardize system power-performance efficiency. Thus, dynamic management of the different hardware knobs is necessary to find the configuration that maximizes system power-performance efficiency without the burden of offline profiling.

In this paper, we propose libPRISM, an infrastructure that enables the transparent management of multiple hardware knobs in order to adapt the system to the evolving demands of hardware resources in different workloads. We use libPRISM to implement a policy that maximizes system performance without degrading energy efficiency by dynamically managing the SMT level and prefetcher hardware knobs of an IBM POWER8 system. We evaluate our solution using 24 applications from 3 different parallel benchmark suites, without the need for offline profiling or workload modification. Overall, the solution increases performance by up to 220% (15.4% on average) and reduces dynamic power consumption by up to 13% (2.0% on average) compared to the static default knob configuration.
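
The abstract states the policy's goal (maximize performance without degrading energy efficiency) but not its decision procedure. As a rough illustration only, the Python sketch below shows one way a policy with that goal could be structured: sample each knob configuration briefly, then keep the fastest one whose performance per watt is no worse than the default's. This is not libPRISM's algorithm or API. The `Sample` type, the `choose_config` function, the `(SMT level, prefetcher enabled)` encoding, the choice of default, and every number in the synthetic table are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical encoding of one knob configuration: (SMT level, prefetcher on?).
Config = Tuple[int, bool]


@dataclass(frozen=True)
class Sample:
    """Measurements from one short sampling interval (hypothetical units)."""
    perf: float   # work completed per second
    power: float  # average dynamic power in watts

    @property
    def efficiency(self) -> float:
        # Performance per watt, used here as the energy-efficiency proxy.
        return self.perf / self.power


def choose_config(
    measure: Callable[[Config], Sample],
    configs: Iterable[Config],
    default: Config,
) -> Config:
    """Return the fastest configuration that is at least as efficient as the default."""
    baseline = measure(default)
    best_cfg, best = default, baseline
    for cfg in configs:
        if cfg == default:
            continue
        sample = measure(cfg)
        # Reject anything that degrades energy efficiency relative to the default.
        if sample.efficiency < baseline.efficiency:
            continue
        if sample.perf > best.perf:
            best_cfg, best = cfg, sample
    return best_cfg


# Synthetic measurements standing in for hardware counters; not real data.
SYNTHETIC: Dict[Config, Sample] = {
    (8, True): Sample(perf=100.0, power=200.0),   # assumed default
    (4, True): Sample(perf=120.0, power=210.0),   # faster and more efficient
    (4, False): Sample(perf=130.0, power=300.0),  # fastest, but less efficient
    (2, True): Sample(perf=110.0, power=180.0),   # efficient, but slower than (4, True)
}


def fake_measure(cfg: Config) -> Sample:
    return SYNTHETIC[cfg]


if __name__ == "__main__":
    print(choose_config(fake_measure, SYNTHETIC.keys(), default=(8, True)))
```

With the synthetic table, the sketch rejects `(4, False)` despite its higher throughput because its performance per watt falls below the default's, and settles on `(4, True)`. A real runtime would repeat this selection as the workload moves between phases, since the best configuration can change over time.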


            Published in

            ICS '17: Proceedings of the International Conference on Supercomputing
            June 2017
            300 pages
            ISBN: 9781450350204
            DOI: 10.1145/3079079
            General Chairs: William D. Gropp, Pete Beckman
            Program Chairs: Zhiyuan Li, Francisco J. Cazorla

            Copyright © 2017 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States


            Acceptance Rates

            Overall Acceptance Rate: 584 of 2,055 submissions, 28%
