Abstract
Automatic trace analysis is an effective method for identifying complex performance phenomena in parallel applications. However, as the size of parallel systems and the number of processors used by individual applications continue to grow, the traditional approach of analyzing a single global trace file, as done by KOJAK’s EXPERT trace analyzer, becomes increasingly constrained by the large number of events. In this article, we present a scalable version of the EXPERT analysis based on analyzing separate local trace files with a parallel tool that ‘replays’ the target application’s communication behavior. We describe the new parallel analyzer architecture and discuss first empirical results.
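The replay idea can be illustrated with a small sketch. It is not the analyzer's actual implementation: the `Event` record, the `replay` function, and the wait-state rule used here (a receive entered before its matching send, with the gap counted as waiting time) are assumptions chosen for illustration. The sketch also runs sequentially in one process, whereas the tool described in the abstract is itself parallel. What it does show is that each rank's trace is read independently, and the only data exchanged between ranks is what travels along the replayed messages.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical local trace record (not the real trace format)."""
    kind: str    # "send" or "recv"
    time: float  # timestamp at which the operation was entered
    peer: int    # destination rank for a send, source rank for a recv


def replay(local_traces):
    """Replay communication over per-rank traces.

    Returns the waiting time per rank, counting the gap whenever a
    receive was entered before its matching send.
    """
    # (source, destination) -> send timestamps "in flight", in FIFO order
    channels = defaultdict(deque)
    cursor = {rank: 0 for rank in local_traces}
    waiting = {rank: 0.0 for rank in local_traces}

    progressed = True
    while progressed:
        progressed = False
        for rank, trace in local_traces.items():
            while cursor[rank] < len(trace):
                ev = trace[cursor[rank]]
                if ev.kind == "send":
                    # Re-enact the send: forward the sender's timestamp.
                    channels[(rank, ev.peer)].append(ev.time)
                elif ev.kind == "recv":
                    queue = channels[(ev.peer, rank)]
                    if not queue:
                        break  # matching send has not been replayed yet
                    send_time = queue.popleft()
                    waiting[rank] += max(0.0, send_time - ev.time)
                cursor[rank] += 1
                progressed = True
    return waiting


# Two ranks exchanging one message in each direction.
traces = {
    0: [Event("send", 5.0, 1), Event("recv", 6.0, 1)],
    1: [Event("recv", 2.0, 0), Event("send", 9.0, 0)],
}
result = replay(traces)
```

In this example, rank 1 enters its receive at time 2.0 while rank 0 sends at 5.0, and rank 0 enters its receive at 6.0 while rank 1 sends at 9.0, so each rank accumulates waiting time without any global, merged trace being built.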
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Geimer, M., Wolf, F., Wylie, B.J.N., Mohr, B. (2006). Scalable Parallel Trace-Based Performance Analysis. In: Mohr, B., Träff, J.L., Worringen, J., Dongarra, J. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2006. Lecture Notes in Computer Science, vol 4192. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11846802_43
DOI: https://doi.org/10.1007/11846802_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-39110-4
Online ISBN: 978-3-540-39112-8
eBook Packages: Computer Science (R0)