DOI: 10.1145/1152154.1152161

Article

Communist, utilitarian, and capitalist cache policies on CMPs: caches as a shared resource

Published: 16 September 2006

ABSTRACT

As chip multiprocessors (CMPs) become increasingly mainstream, architects have likewise become more interested in how best to share a cache hierarchy among multiple simultaneous threads of execution. The complexity of this problem is exacerbated as the number of simultaneous threads grows from two or four to the tens or hundreds. However, there is no consensus in the architectural community on what "best" means in this context. Some papers in the literature seek to equalize each thread's performance loss due to sharing, while others emphasize maximizing overall system performance. Furthermore, the specific effect of these goals varies depending on the metric used to define "performance".

In this paper we label equal performance targets as Communist cache policies and overall performance targets as Utilitarian cache policies. We compare both of these models to the most common current model of a free-for-all cache (a Capitalist policy). We consider various performance metrics, including miss rates, bandwidth usage, and IPC, including both absolute and relative values of each metric. Using analytical models and behavioral cache simulation, we find that the optimal partitioning of a shared cache can vary greatly as different but reasonable definitions of optimality are applied. We also find that, although Communist and Utilitarian targets are generally compatible, each policy has workloads for which it provides poor overall performance or poor fairness, respectively. Finally, we find that simple policies like LRU replacement and static uniform partitioning are not sufficient to provide near-optimal performance under any reasonable definition, indicating that some thread-aware cache resource allocation mechanism is required.
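The contrast between the two optimization targets can be sketched with a toy two-thread partitioner. Everything below is invented for illustration and is not from the paper: the miss-rate curves are made up, the "Utilitarian" objective is modeled here as minimizing total misses, and the "Communist" objective as minimizing the worst per-thread miss increase relative to owning the whole cache.

```python
# Hypothetical per-thread miss-rate curves: misses per 1000 instructions
# when a thread is given 0..W ways of a shared cache. Values are invented.
W = 8
miss = {
    "A": [100, 40, 30, 26, 24, 23, 22, 21, 21],  # cache-sensitive thread
    "B": [80, 70, 62, 56, 52, 50, 49, 48, 48],   # less sensitive thread
}

def utilitarian(miss, W):
    """Give thread A the way count w that minimizes total misses."""
    return min(range(1, W), key=lambda w: miss["A"][w] + miss["B"][W - w])

def communist(miss, W):
    """Minimize the worst per-thread relative loss vs. owning all W ways."""
    loss = lambda t, w: miss[t][w] / miss[t][W]
    return min(range(1, W),
               key=lambda w: max(loss("A", w), loss("B", W - w)))

print(utilitarian(miss, W))  # -> 3: A gets 3 ways, B gets 5
print(communist(miss, W))    # -> 4: A gets 4 ways, B gets 4
```

Even on these toy curves the two objectives pick different partitions, which is the abstract's central observation: the "optimal" split of a shared cache depends on which reasonable definition of optimality is applied.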


• Published in

  PACT '06: Proceedings of the 15th International Conference on Parallel Architectures and Compilation Techniques
  September 2006, 308 pages
  ISBN: 159593264X
  DOI: 10.1145/1152154

  Copyright © 2006 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States



Acceptance Rates

Overall acceptance rate: 121 of 471 submissions, 26%
