DOI: 10.1145/3578245.3584358
research-article

Performance Analysis Tools for MPI Applications and their Use in Programming Education

Published: 15 April 2023

ABSTRACT

Performance analysis tools are frequently used to support the development of parallel MPI applications. They facilitate the detection of errors, bottlenecks, and inefficiencies, but differ substantially in their instrumentation, measurement, and type of feedback. Tools that provide visual feedback are especially helpful for educational purposes: they offer a visual abstraction of program behavior that helps learners identify and understand performance issues and write more efficient code. However, existing professional tools for performance analysis are very complex, and their use in beginner courses can be very demanding. Above all, their instrumentation and measurement require deep knowledge and considerable time, whereas immediate and straightforward feedback is essential to motivate learners. This paper provides an extensive overview of performance analysis tools for parallel MPI applications that are widely used by experienced developers today. It also surveys existing educational tools for parallel programming with MPI and shows their shortcomings compared to professional tools. Using tools for performance analysis of MPI programs in educational scenarios can promote the understanding of program behavior on large HPC systems and support learning parallel programming. At the same time, the complexity of the programs and the lack of infrastructure at educational institutions are barriers. These aspects are considered and discussed in detail.
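To make the behavior such tools expose concrete, below is a minimal, hypothetical sketch (not from the paper) of a deliberately imbalanced MPI program in C. Rank 0 computes roughly ten times longer than the other ranks, so every other rank idles inside the closing MPI_Reduce; a timeline visualizer such as Vampir, or Scalasca's wait-state analysis, would render this idle time directly, which is precisely the kind of visual feedback the abstract argues helps learners.

```c
/* imbalance.c - a deliberately imbalanced MPI program whose wait
 * states a trace-based analysis tool would make visible. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Artificial load imbalance: rank 0 gets ten times more work. */
    long iterations = (rank == 0) ? 100000000L : 10000000L;
    double local = 0.0;
    for (long i = 0; i < iterations; i++)
        local += 1.0 / (double)(i + 1);

    /* Every rank except 0 arrives early and waits here; in a trace
     * timeline this shows up as time spent inside MPI_Reduce. */
    double global = 0.0;
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0,
               MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum = %f on %d ranks\n", global, size);

    MPI_Finalize();
    return 0;
}
```

Assuming Score-P is installed, instrumenting this program is a matter of prefixing the compiler with the scorep wrapper (scorep mpicc -o imbalance imbalance.c), running it as usual (mpirun -np 4 ./imbalance), and setting SCOREP_ENABLE_TRACING=true so the run writes an OTF2 trace that Vampir can open. The effort hidden in that one sentence, namely installing and configuring the tool chain, is exactly the barrier for beginner courses that the paper discusses.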


Published in

ICPE '23 Companion: Companion of the 2023 ACM/SPEC International Conference on Performance Engineering
April 2023, 421 pages
ISBN: 9798400700729
DOI: 10.1145/3578245
Copyright © 2023 ACM
Publisher: Association for Computing Machinery, New York, NY, United States
