ABSTRACT
Application performance tuning is a complex process that requires assembling various types of information and correlating it with source code to pinpoint the causes of performance bottlenecks. Existing performance tools fall short in one or more dimensions of supporting this process. We discuss some of the critical utility and usability issues for application-level performance analysis tools in the context of two performance tools, MHSim and HPCView, that we built to support our own work on data layout and optimizing compilers. MHSim is a memory hierarchy simulator that produces source-level information, not otherwise available, about memory hierarchy utilization and the causes of cache conflicts. HPCView is a tool that combines data from arbitrary sets of instrumentation sources and correlates it with program source code. Both tools report their results in scope-hierarchy views of the corresponding source code and produce their output as HTML databases that can be analyzed portably and collaboratively using a commodity browser. In addition to daily use within our group, the tools are being used successfully by several code development teams in DoD and DoE laboratories.
Index Terms
- Tools for application-oriented performance tuning