Abstract
Soft errors, which increase exponentially with technology scaling, have become a serious design concern in deep sub-micron embedded systems. A Partially Protected Cache (PPC) is a promising microarchitectural feature for mitigating failures due to soft errors in embedded processors. A processor with a PPC maintains two caches, one protected and the other unprotected, both at the same level of the memory hierarchy. By identifying the data most prone to soft errors and mapping only that data to the protected cache, the failure rate can be significantly reduced at minimal power and performance penalty. While the effectiveness of PPCs has been demonstrated on multimedia applications, where the multimedia data is inherently resilient to soft errors, no such obvious data partitioning exists for applications in general. This paper proposes profile-based data partitioning schemes that apply to general applications and effectively reduce failures due to soft errors at minimal power and performance overheads. Our experimental results demonstrate that our algorithm reduces the failure rate by 47× on benchmarks from MiBench while incurring only 0.5% performance and 15% power overheads.
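To make the idea of profile-based partitioning concrete, the following is a minimal sketch only, not the algorithm proposed in the paper (the abstract does not specify it). It assumes a profiler has already produced, for each memory page, a vulnerability score and an estimated cost of protecting that page. The names `PageProfile`, `vulnerability`, `overhead`, `partition_pages`, and `overhead_budget` are all hypothetical, and the greedy ranking is a stand-in heuristic chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageProfile:
    """Per-page profile data (all fields are hypothetical illustrations)."""
    page: int             # page identifier
    vulnerability: float  # profiled exposure to soft errors, higher is worse
    overhead: float       # estimated cost of protecting this page, must be > 0


def partition_pages(profiles, overhead_budget):
    """Greedily choose pages to map to the protected cache.

    Pages are ranked by vulnerability removed per unit of overhead and
    added while the accumulated overhead stays within the budget.
    Pages that are not returned stay in the unprotected cache.
    """
    ranked = sorted(
        profiles,
        key=lambda p: p.vulnerability / p.overhead,
        reverse=True,
    )
    protected = set()
    spent = 0.0
    for p in ranked:
        if spent + p.overhead <= overhead_budget:
            protected.add(p.page)
            spent += p.overhead
    return protected


if __name__ == "__main__":
    example = [
        PageProfile(page=0, vulnerability=90.0, overhead=3.0),
        PageProfile(page=1, vulnerability=10.0, overhead=1.0),
        PageProfile(page=2, vulnerability=40.0, overhead=2.0),
        PageProfile(page=3, vulnerability=5.0, overhead=5.0),
    ]
    print(sorted(partition_pages(example, overhead_budget=5.0)))
```

A greedy ratio ranking is used here only because it is short and easy to follow; it does not guarantee an optimal partition, and the paper's own schemes may differ in both the metric and the search strategy.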
Copyright information
© 2008 Springer Science+Business Media, LLC
About this paper
Cite this paper
Lee, K., Shrivastava, A., Dutt, N., Venkatasubramanian, N. (2008). Data Partitioning Techniques for Partially Protected Caches to Reduce Soft Error Induced Failures. In: Kleinjohann, B., Wolf, W., Kleinjohann, L. (eds) Distributed Embedded Systems: Design, Middleware and Resources. DIPES 2008. IFIP – The International Federation for Information Processing, vol 271. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-09661-2_21
DOI: https://doi.org/10.1007/978-0-387-09661-2_21
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-09660-5
Online ISBN: 978-0-387-09661-2
eBook Packages: Computer Science, Computer Science (R0)