ABSTRACT
Dynamic instrumentation systems are gaining popularity as a means of constructing customized program profiling and analysis tools. However, analysis tools based on dynamic instrumentation still suffer from performance problems. Their overhead can be broken down into two components: the overhead of the dynamic instrumentation itself and the time consumed by the user-defined analysis tools. While important progress has been made in reducing the performance penalty of dynamic instrumentation, less attention has been paid to the user-defined component. In this paper, we present PiPA (Pipelined Profiling and Analysis), a novel technique for parallelizing dynamic program profiling and analysis by taking advantage of multi-core systems. We implemented a prototype of PiPA using the dynamic instrumentation system DynamoRIO. Our experiments show that PiPA speeds up the overall profiling and analysis tasks significantly: compared with the more than 100x slowdown of Cachegrind and the 32x slowdown of Pin dcache, PiPA incurs only a 10.5x slowdown on an 8-core system.
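The general idea of pipelining profiling and analysis can be illustrated with a small producer/consumer sketch. This is not PiPA's implementation, which the abstract does not detail; it only assumes that a profiling stage batches raw records into buffers and that separate analysis threads consume them. All names here (`profile`, `analyze`, `run_pipeline`, `BUFFER_SIZE`, `NUM_WORKERS`) and the toy per-cache-line access count are hypothetical.

```python
import queue
import threading

BUFFER_SIZE = 4   # hypothetical: number of records per profile buffer
NUM_WORKERS = 2   # hypothetical: number of analysis threads


def profile(addresses, out_q):
    """Stage 1: stand-in for the instrumented application.

    Batches raw memory addresses into buffers and hands each full
    buffer to the analysis stage through the queue.
    """
    buf = []
    for addr in addresses:
        buf.append(addr)
        if len(buf) == BUFFER_SIZE:
            out_q.put(buf)
            buf = []
    if buf:
        out_q.put(buf)  # flush the final, partially filled buffer
    for _ in range(NUM_WORKERS):
        out_q.put(None)  # one end-of-stream sentinel per worker


def analyze(in_q, results, lock):
    """Stage 2: toy analysis counting accesses per 64-byte line."""
    local = {}
    while True:
        buf = in_q.get()
        if buf is None:
            break
        for addr in buf:
            line = addr // 64
            local[line] = local.get(line, 0) + 1
    # Merge this worker's private counts into the shared result once.
    with lock:
        for line, count in local.items():
            results[line] = results.get(line, 0) + count


def run_pipeline(addresses):
    """Run the profiling stage and the analysis workers concurrently."""
    q = queue.Queue(maxsize=8)  # bounded, so the producer cannot run far ahead
    results, lock = {}, threading.Lock()
    workers = [
        threading.Thread(target=analyze, args=(q, results, lock))
        for _ in range(NUM_WORKERS)
    ]
    for w in workers:
        w.start()
    profile(addresses, q)
    for w in workers:
        w.join()
    return results


if __name__ == "__main__":
    print(run_pipeline([0, 8, 64, 70, 128, 1, 65]))
```

The sketch shows only the structure: the profiling stage does no analysis of its own, and each worker accumulates private counts before merging, which keeps synchronization off the per-record path. In CPython the global interpreter lock prevents these threads from running on several cores at once, so the example should not be read as a performance model.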