Overhead Analysis for Performance Monitoring Counters Multiplexing

  • Conference paper
Supercomputing (RuSCDays 2022)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13708))

Abstract

To analyze how efficiently a supercomputer operates, it is useful to collect data from the performance monitoring counters available in all modern processors. However, the ability to obtain such data is very limited: usually no more than 4 counters can be accessed simultaneously. To overcome this limit, multiplexing can be used, which collects more data by switching between counters: at any given time, data is collected from one specific set of counters, and the sets alternate repeatedly. This technique comes at the price of additional overhead, since the execution time of supercomputer applications increases. Unfortunately, this topic has not been sufficiently studied so far. In this paper, we carry out a detailed analysis and comparison of the overheads caused by three different variants of multiplexing implemented using the PAPI and LIKWID libraries. The results show that the average overhead is approximately 3–5%, and that manual multiplexing using LIKWID has the least impact.
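The switching scheme described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's method and not the PAPI or LIKWID API: the function name `multiplex_estimate`, the event names, the constant per-tick event rates, and the linear time-scaling used for the estimate are all hypothetical choices made for illustration.

```python
from itertools import cycle


def multiplex_estimate(rates, groups, total_ticks, slice_ticks):
    """Simulate round-robin counter multiplexing (illustrative only).

    rates       -- hypothetical events-per-tick rate for each event name
    groups      -- list of event sets; only one set is counted at a time
    total_ticks -- length of the simulated run
    slice_ticks -- how long each set stays active before switching
    """
    observed = {event: 0 for group in groups for event in group}
    active = {event: 0 for group in groups for event in group}

    tick = 0
    for group in cycle(groups):
        if tick >= total_ticks:
            break
        span = min(slice_ticks, total_ticks - tick)
        for event in group:
            # Events are only counted while their set is scheduled.
            observed[event] += rates[event] * span
            active[event] += span
        tick += span

    # Scale each partial count by the fraction of the run it was active.
    return {
        event: observed[event] * total_ticks / active[event]
        for event in observed
        if active[event] > 0
    }


# Three events split into two sets, as if only two counters were available.
estimates = multiplex_estimate(
    rates={"cycles": 3, "instructions": 2, "cache_misses": 1},
    groups=[["cycles", "instructions"], ["cache_misses"]],
    total_ticks=100,
    slice_ticks=10,
)
```

With constant rates the scaled estimates match the true totals exactly; for real workloads, whose event rates vary over time, the scaled values are only approximations, and every switch between sets adds the kind of overhead the paper measures.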

Notes

  1. https://github.com/RRZE-HPC/likwid/wiki/LikwidAPI-and-MarkerAPI

  2. https://www.nas.nasa.gov/software/npb_problem_sizes.html

Acknowledgments

The results described in this paper were achieved at Lomonosov Moscow State University with the financial support of the Russian Science Foundation, agreement No. 21-71-30003. The research was carried out using the equipment of the shared research facilities of HPC computing resources at Lomonosov Moscow State University.

Author information

Corresponding author

Correspondence to Vadim Voevodin.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Voevodin, V., Stefanov, K., Zhumatiy, S. (2022). Overhead Analysis for Performance Monitoring Counters Multiplexing. In: Voevodin, V., Sobolev, S., Yakobovskiy, M., Shagaliev, R. (eds) Supercomputing. RuSCDays 2022. Lecture Notes in Computer Science, vol 13708. Springer, Cham. https://doi.org/10.1007/978-3-031-22941-1_34

  • DOI: https://doi.org/10.1007/978-3-031-22941-1_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-22940-4

  • Online ISBN: 978-3-031-22941-1

  • eBook Packages: Computer Science, Computer Science (R0)
