ABSTRACT
Call path profiling associates resource consumption with the calling context in which resources were consumed. We describe the design and implementation of a low-overhead call path profiler based on stack sampling. The profiler uses a novel sample-driven strategy for collecting frequency counts for call graph edges without instrumenting every procedure's code to count them. The data structures and algorithms used are efficient enough to construct the complete calling context tree exposed during sampling. The profiler leverages information recorded by compilers for debugging or exception handling to record call path profiles even for highly-optimized code. We describe an implementation for the Tru64/Alpha platform. Experiments profiling the SPEC CPU2000 benchmark suite demonstrate the low (2%-7%) overhead of this profiler. A comparison with instrumentation-based profilers, such as gprof, shows that for call-intensive programs, our sampling-based strategy for call path profiling has over an order of magnitude lower overhead.
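As an illustration of the calling context tree mentioned above, the following is a minimal sketch of how sampled call stacks could be folded into such a tree. It is not the authors' implementation: the names CCTNode, record_sample, and inclusive_samples and the sample data are hypothetical, and each stack is assumed to arrive already unwound as a list of procedure names, outermost frame first.

```python
from dataclasses import dataclass, field


@dataclass
class CCTNode:
    """One calling context: a procedure reached through a specific chain of callers."""
    name: str
    samples: int = 0  # samples taken while this context was the innermost frame
    children: dict = field(default_factory=dict)

    def child(self, name):
        """Return the child context for `name`, creating it on first use."""
        node = self.children.get(name)
        if node is None:
            node = CCTNode(name)
            self.children[name] = node
        return node


def record_sample(root, stack):
    """Insert one sampled call stack (outermost frame first) into the tree."""
    node = root
    for frame in stack:
        node = node.child(frame)
    node.samples += 1  # charge the sample to the full calling context


def inclusive_samples(node):
    """Samples attributed to this context plus everything it called."""
    return node.samples + sum(
        inclusive_samples(c) for c in node.children.values()
    )


# Hypothetical sample data, for illustration only.
root = CCTNode("<root>")
hypothetical_samples = [
    ["main", "solve", "dot"],
    ["main", "solve", "dot"],
    ["main", "solve"],
    ["main", "report", "dot"],
]
for stack in hypothetical_samples:
    record_sample(root, stack)
```

In this sketch the two calling contexts of dot stay separate nodes, so cost incurred under solve is not merged with cost incurred under report. A flat profile or a call graph keyed only by caller and callee would lose that distinction.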