Abstract
One important trend in today's microprocessor architectures is the increase in size of the processor caches. These caches also tend to be set associative. As technology scales, process variations are expected to increase the fault rates of the SRAM cells that compose such caches. Because caches are an important component of the processor, the parametric yield of SRAM cells is crucial to the overall performance and yield of the microchip. In this article, we propose a microarchitectural solution, called the buddy cache, that permits large, set-associative caches to tolerate faults in SRAM cells due to process variations. In essence, instead of disabling a faulty cache block in a set (as is the current practice), the buddy cache pairs it with another faulty cache block in the same set, called its buddy. Although both cache blocks are faulty, if the faults of the two blocks do not overlap, then instead of losing two blocks, buddying yields one functional block from the nonfaulty portions of the two blocks. We found that with buddying, caches can better mitigate the negative impacts of process variations on performance and yield, degrading performance gracefully rather than failing catastrophically. We describe the details of the buddy cache and give insights into why it makes both performance and yield more resilient to faults.
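The pairing rule described above can be illustrated with a small software sketch. This is not the article's hardware mechanism; it is a minimal model under stated assumptions: each block's faults are recorded as a hypothetical bitmask (one bit per block segment, with the segment granularity chosen arbitrarily), and faulty blocks within a set are paired greedily. The names `can_buddy` and `usable_blocks` are invented for illustration.

```python
from typing import List, Tuple


def can_buddy(mask_a: int, mask_b: int) -> bool:
    """Two faulty blocks can be buddied if no segment is faulty in both."""
    return (mask_a & mask_b) == 0


def usable_blocks(fault_masks: List[int]) -> Tuple[int, List[Tuple[int, int]]]:
    """Count functional blocks in one set after greedy buddying.

    fault_masks[i] is an assumed per-way fault bitmask (0 means fault-free).
    Returns (number of functional blocks, list of buddy pairs by way index).
    """
    healthy = [i for i, m in enumerate(fault_masks) if m == 0]
    faulty = [i for i, m in enumerate(fault_masks) if m != 0]

    pairs: List[Tuple[int, int]] = []
    unpaired: List[int] = []
    for way in faulty:
        for candidate in unpaired:
            if can_buddy(fault_masks[way], fault_masks[candidate]):
                # Non-overlapping faults: the two ways form one functional block.
                pairs.append((candidate, way))
                unpaired.remove(candidate)
                break
        else:
            # No compatible partner seen so far; keep this way waiting.
            unpaired.append(way)

    return len(healthy) + len(pairs), pairs


# A 4-way set: way 0 is fault-free, ways 1 and 3 share a faulty segment,
# and way 2 is faulty in a different segment.
count, pairs = usable_blocks([0b0000, 0b0001, 0b0010, 0b0001])
print(count, pairs)
```

In this example, disabling every faulty block would leave one usable block in the set, while buddying ways 1 and 2 leaves two. Way 3 stays disabled because its fault overlaps with the only remaining candidate.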
Index Terms
- Tolerating process variations in large, set-associative caches: The buddy cache