Abstract
To analyze how efficiently a supercomputer operates, it is useful to collect data from the performance monitoring counters available in all modern processors. However, the ability to obtain such data is very limited: usually no more than 4 counters can be accessed simultaneously. Multiplexing overcomes this limit by switching between counters: at any given time, data is collected from one specific set of counters, and the sets alternate repeatedly. This comes at the price of growing overhead, since the execution time of supercomputer applications increases. Unfortunately, this topic has not been sufficiently studied so far. In this paper, we carry out a detailed analysis and comparison of the overheads caused by three different variants of multiplexing implemented using the PAPI and LIKWID libraries. The results show that the average overhead is approximately 3–5%, and that manual multiplexing using LIKWID has the least impact.
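The switching scheme described in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the method used in the paper: it does not call PAPI or LIKWID, the function name `multiplex` and the event names are hypothetical, and it assumes every event occurs at a constant rate. It splits the events into sets no larger than the hardware limit, activates one set per time slice in round-robin order, and extrapolates each raw count by the fraction of time its set was active.

```python
from itertools import cycle


def multiplex(event_rates, group_size, n_slices, slice_len=1.0):
    """Simulate round-robin multiplexing of counter sets.

    event_rates: hypothetical mapping of event name -> events per time unit
                 (a constant rate is assumed for illustration only).
    group_size:  how many counters can be read simultaneously.
    n_slices:    number of time slices in the simulated run.
    """
    names = list(event_rates)
    # Split the events into sets that fit into the available counters.
    groups = [names[i:i + group_size] for i in range(0, len(names), group_size)]

    raw = {name: 0.0 for name in names}          # counts actually observed
    active_time = {name: 0.0 for name in names}  # time each event was counted

    # Only one set is active in each time slice; the sets alternate.
    for _, group in zip(range(n_slices), cycle(groups)):
        for name in group:
            raw[name] += event_rates[name] * slice_len
            active_time[name] += slice_len

    total_time = n_slices * slice_len
    # Extrapolate each raw count to the full run time.
    estimates = {
        name: raw[name] * total_time / active_time[name] if active_time[name] else None
        for name in names
    }
    return groups, raw, estimates


# Six hypothetical events, at most 4 counters available at once.
rates = {"A": 100, "B": 50, "C": 10, "D": 5, "E": 2, "F": 1}
groups, raw, estimates = multiplex(rates, group_size=4, n_slices=10)
print(groups)
print(raw["A"], estimates["A"])
```

With constant rates the extrapolated values match the true totals exactly. In a real application event rates vary over time, so multiplexed counts are only estimates, and the sketch does not model the switching overhead that the paper measures.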
Acknowledgments
The results described in this paper were achieved at Lomonosov Moscow State University with the financial support of the Russian Science Foundation, agreement No. 21-71-30003. The research was carried out using the equipment of the shared research facilities of HPC computing resources at Lomonosov Moscow State University.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Voevodin, V., Stefanov, K., Zhumatiy, S. (2022). Overhead Analysis for Performance Monitoring Counters Multiplexing. In: Voevodin, V., Sobolev, S., Yakobovskiy, M., Shagaliev, R. (eds) Supercomputing. RuSCDays 2022. Lecture Notes in Computer Science, vol 13708. Springer, Cham. https://doi.org/10.1007/978-3-031-22941-1_34
DOI: https://doi.org/10.1007/978-3-031-22941-1_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-22940-4
Online ISBN: 978-3-031-22941-1
eBook Packages: Computer Science (R0)