Modeling the impact of permanent faults in caches

Authors:
Daniel Sánchez, University of Murcia, Murcia, Spain
Yiannakis Sazeides, University of Cyprus, Nicosia, Cyprus
Juan M. Cebrián, University of Murcia, Murcia, Spain
José M. García, University of Murcia, Murcia, Spain
Juan L. Aragón, University of Murcia, Murcia, Spain

ACM Transactions on Architecture and Code Optimization, Volume 10, Issue 4, Article No. 29, pp. 1–23
https://doi.org/10.1145/2541228.2541236

Published: 01 December 2013

Abstract

The traditional performance and cost benefits that technology scaling has delivered for decades are challenged by several critical constraints, including reliability. Increases in static and dynamic variations are leading to a higher probability of parametric and wear-out failures and are elevating reliability into a prime design constraint. In particular, the SRAM cells used to build caches, which dominate the processor area, are usually minimum sized and more prone to failure. It is therefore of paramount importance to develop effective methodologies that facilitate the exploration of reliability techniques for caches.

To this end, we present an analytical model that determines, for a given cache configuration, address trace, and random probability of permanent cell failure, the exact expected miss rate and its standard deviation when blocks with faulty bits are disabled. What distinguishes our model is that it is fully analytical and avoids the use of fault maps, yet it is both exact and simpler than previous approaches. The analytical model is used to produce the expected miss-rate trends for future technology nodes for both uncorrelated and clustered faults. Some of the key findings based on the proposed model are (i) block disabling has a negligible impact on the expected miss rate unless the probability of failure is equal to or greater than 2.6e-4, (ii) the fault map methodology can accurately calculate the expected miss rate as long as 1,000 to 10,000 fault maps are used, and (iii) the expected miss rate for the execution of parallel applications increases with the number of threads, and the increase for a given probability of failure is more pronounced than for sequential execution.
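The idea of computing an expected miss rate under block disabling without fault maps can be illustrated with a short Python sketch. It is a simplified illustration and not the model from the article: it assumes uncorrelated cell faults, a set-associative cache in which a block is disabled if any of its bits is faulty, and that the miss rate of the fault-free cache at each reduced associativity is already known (for example, from a trace simulation). The function names, the 512-bit block size, and the miss-rate numbers are all hypothetical. The standard deviation and clustered faults are not covered.

```python
from math import comb


def block_fault_probability(p_cell: float, bits_per_block: int) -> float:
    """Probability that a block has at least one faulty bit,
    assuming independent (uncorrelated) cell failures."""
    return 1.0 - (1.0 - p_cell) ** bits_per_block


def usable_ways_distribution(p_block: float, ways: int) -> list[float]:
    """Binomial distribution of the number of usable ways in one set.
    Entry k is the probability that exactly k of the ways are fault free."""
    return [
        comb(ways, k) * (1.0 - p_block) ** k * p_block ** (ways - k)
        for k in range(ways + 1)
    ]


def expected_miss_rate(miss_rate_by_ways: list[float], p_block: float) -> float:
    """Expected miss rate when faulty blocks are disabled.

    miss_rate_by_ways[k] is the (assumed known) miss rate of the same trace
    on a cache where every set has k usable ways; index 0 must be 1.0.
    Because each set behaves independently of the others, linearity of
    expectation allows the per-associativity miss rates to be weighted
    by the binomial probabilities."""
    ways = len(miss_rate_by_ways) - 1
    dist = usable_ways_distribution(p_block, ways)
    return sum(p * m for p, m in zip(dist, miss_rate_by_ways))


if __name__ == "__main__":
    # Hypothetical inputs: 512 bits per block (data only), a 4-way cache,
    # and made-up miss rates for 0..4 usable ways. None are from the article.
    p_block = block_fault_probability(2.6e-4, 512)
    illustrative_miss_rates = [1.0, 0.20, 0.12, 0.09, 0.08]
    print(f"P(block disabled) = {p_block:.4f}")
    print(f"Expected miss rate = {expected_miss_rate(illustrative_miss_rates, p_block):.4f}")
```

The sketch needs no fault maps because the fault distribution is summarized by a closed-form probability for each possible number of usable ways. A fault-map approach would instead sample many random maps and average the simulated miss rates.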

Index Terms

1. Hardware
  1. Hardware test
    1. Memory test and repair

Published in

ACM Transactions on Architecture and Code Optimization Volume 10, Issue 4
December 2013
1046 pages
ISSN:1544-3566
EISSN:1544-3973
DOI:10.1145/2541228

Copyright © 2013 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 1 December 2013
- Accepted: 1 July 2013
- Revised: 1 June 2013
- Received: 1 February 2013
Author Tags
Reliability
caches
fault tolerance
yield
Qualifiers
- research-article
- Research
- Refereed