DOI: 10.1145/781027.781029
Article

DiST: a simple, reliable and scalable method to significantly reduce processor architecture simulation time

Published: 10 June 2003

ABSTRACT

While architecture simulation is often treated as a methodology issue, it is at the core of most processor architecture research, and simulation speed is often the bottleneck of the typical trial-and-error research process. To speed up simulation during this process and obtain trends faster, researchers usually reduce the trace size. More sophisticated techniques such as trace sampling or distributed simulation are rarely used because they are considered unreliable and complex, due to their impact on accuracy and the associated warm-up issues.

In this article, we present DiST, a practical distributed simulation scheme in which, unlike other simulation techniques that trade accuracy for speed, the user is relieved from most accuracy issues thanks to an automatic and dynamic mechanism for adjusting the warm-up interval size. Moreover, the mechanism is designed to always favor accuracy over speedup. The speedup scales with the amount of available computing resources, bringing an average speedup of 7.35 on 10 machines with an average IPC error of 1.81% and a maximum IPC error of 5.06%.

Besides proposing a solution to the warm-up issues in distributed simulation, we experimentally show that our technique is significantly more accurate than trace size reduction or trace sampling for identical speedups. We also show not only that the error always remains small for IPC and other metrics, but that a researcher can reliably base research decisions on DiST simulation results. Finally, we explain how the DiST tool is designed to plug easily into existing architecture simulators with very few modifications.
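The chunking-plus-warm-up idea behind the abstract can be sketched in a few lines. This is an illustrative toy only, not the authors' DiST implementation: the "simulator", the `WINDOW` state size, and the stabilization test are invented for the example. The trace is split into chunks (each of which could run on a separate machine); each chunk is preceded by a warm-up prefix whose length is grown dynamically until the measurement at the chunk boundary stops changing, after which the warm-up portion is discarded.

```python
from collections import deque

WINDOW = 8  # toy microarchitectural state: the last WINDOW instructions


def simulate(trace):
    """Toy trace-driven 'simulator': per-instruction IPC depends only on
    a bounded window of recent history (a stand-in for cache/predictor state)."""
    window = deque(maxlen=WINDOW)
    ipc = []
    for instr in trace:
        window.append(instr)
        ipc.append(1.0 + (sum(window) % 10) / 10.0)
    return ipc


def dist_chunk(trace, start, end, warmup_step=8):
    """Simulate trace[start:end] in isolation, growing the warm-up prefix
    until the first measured value stabilizes (DiST-style dynamic warm-up)."""
    warmup, prev = 0, None
    while True:
        w0 = max(0, start - warmup)
        ipc = simulate(trace[w0:end])
        measured = ipc[start - w0:]      # discard the warm-up portion
        if prev is not None and measured[0] == prev:
            return measured              # boundary measurement stabilized
        prev = measured[0]
        if w0 == 0:
            return measured              # whole prefix replayed: exact
        warmup += warmup_step


def dist_simulate(trace, n_chunks):
    """Split the trace into chunks (each could run on its own machine)
    and concatenate the per-chunk IPC streams."""
    size = len(trace) // n_chunks
    out = []
    for i in range(n_chunks):
        start = i * size
        end = (i + 1) * size if i < n_chunks - 1 else len(trace)
        out.extend(dist_chunk(trace, start, end))
    return out


trace = [(i * 7 + 3) % 13 for i in range(200)]
assert dist_simulate(trace, 4) == simulate(trace)
```

Because the toy simulator's state has bounded reach, a short warm-up reproduces the sequential results exactly; in a real out-of-order simulator the state is far larger, which is why DiST adjusts the warm-up interval automatically and, per the abstract, always favors accuracy over speedup.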


Published in:
SIGMETRICS '03: Proceedings of the 2003 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, June 2003, 338 pages. ISBN: 1581136641. DOI: 10.1145/781027.
Also in: ACM SIGMETRICS Performance Evaluation Review, Volume 31, Issue 1, June 2003, 325 pages. ISSN: 0163-5999. DOI: 10.1145/885651.

              Copyright © 2003 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher: Association for Computing Machinery, New York, NY, United States



              Acceptance Rates

SIGMETRICS '03 paper acceptance rate: 26 of 222 submissions (12%). Overall acceptance rate: 459 of 2,691 submissions (17%).
