research-article

Modeling the impact of permanent faults in caches

Published: 01 December 2013

Abstract

The traditional performance cost benefits we have enjoyed for decades from technology scaling are challenged by several critical constraints including reliability. Increases in static and dynamic variations are leading to higher probability of parametric and wear-out failures and are elevating reliability into a prime design constraint. In particular, SRAM cells used to build caches that dominate the processor area are usually minimum sized and more prone to failure. It is therefore of paramount importance to develop effective methodologies that facilitate the exploration of reliability techniques for caches.

To this end, we present an analytical model that, for a given cache configuration, address trace, and random probability of permanent cell failure, determines the exact expected miss rate and its standard deviation when blocks with faulty bits are disabled. What distinguishes our model is that it is fully analytical and avoids the use of fault maps, yet it is both exact and simpler than previous approaches. The analytical model is used to produce the expected miss-rate trends for future technology nodes for both uncorrelated and clustered faults. Some of the key findings based on the proposed model are (i) block disabling has a negligible impact on the expected miss-rate unless the probability of failure is equal to or greater than 2.6e-4, (ii) the fault map methodology can accurately calculate the expected miss-rate as long as 1,000 to 10,000 fault maps are used, and (iii) the expected miss-rate for execution of parallel applications increases with the number of threads and is more pronounced for a given probability of failure as compared to sequential execution.
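The kind of calculation the abstract describes can be illustrated with a minimal Python sketch. It is not the paper's formulation; it is one simple way to obtain an expected miss rate under block disabling without fault maps, and it rests on assumptions the abstract does not state: cell faults are independent (the uncorrelated case only), replacement is LRU, and the address trace has already been reduced to a histogram of per-set LRU stack distances. All function and parameter names (`block_fault_probability`, `distance_hist`, `cold_misses`, and so on) are hypothetical.

```python
from math import comb


def block_fault_probability(p_cell: float, bits_per_block: int) -> float:
    """Probability that a block holds at least one faulty cell.

    Assumes independent cell failures (uncorrelated faults only).
    """
    return 1.0 - (1.0 - p_cell) ** bits_per_block


def faulty_ways_pmf(p_block: float, ways: int) -> list[float]:
    """Binomial pmf: pmf[k] = P(exactly k of the set's ways are disabled)."""
    return [
        comb(ways, k) * p_block**k * (1.0 - p_block) ** (ways - k)
        for k in range(ways + 1)
    ]


def expected_miss_rate(
    distance_hist: dict[int, int],
    cold_misses: int,
    ways: int,
    p_block: float,
) -> float:
    """Expected miss rate over random fault placements.

    distance_hist maps a 1-based per-set LRU stack distance to the number
    of accesses with that distance (an assumed, precomputed trace summary).
    Under LRU, an access with distance d hits iff its set still has at
    least d working ways, i.e. at most (ways - d) faulty ways.
    """
    pmf = faulty_ways_pmf(p_block, ways)
    total = cold_misses + sum(distance_hist.values())
    misses = float(cold_misses)  # first-touch accesses always miss
    for d, count in distance_hist.items():
        p_hit = sum(pmf[: ways - d + 1]) if d <= ways else 0.0
        misses += count * (1.0 - p_hit)
    return misses / total


if __name__ == "__main__":
    # Illustrative numbers only: 512 data bits per block (tag bits ignored),
    # a 4-way cache, and a made-up stack-distance histogram.
    p_block = block_fault_probability(2.6e-4, 512)
    hist = {1: 50, 2: 30, 5: 10}
    print(p_block, expected_miss_rate(hist, cold_misses=10, ways=4, p_block=p_block))
```

With these assumed 512-bit blocks, a cell failure probability of 2.6e-4 already leaves roughly 12% of blocks with at least one faulty cell, which gives some intuition for why the abstract reports a threshold near that value. Because the expectation is taken per access, no fault maps need to be sampled; the standard deviation and clustered faults mentioned in the abstract are outside the scope of this sketch.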

    • Published in

      ACM Transactions on Architecture and Code Optimization  Volume 10, Issue 4
      December 2013
      1046 pages
      ISSN:1544-3566
      EISSN:1544-3973
      DOI:10.1145/2541228
      Copyright © 2013 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 1 December 2013
      • Accepted: 1 July 2013
      • Revised: 1 June 2013
      • Received: 1 February 2013

      Qualifiers

      • research-article
      • Research
      • Refereed
