ABSTRACT
Time redundancy (rollback-recovery) and hardware redundancy are commonly used in real-time systems to achieve fault tolerance. From an energy-consumption point of view, time redundancy is generally preferable to hardware redundancy. However, hard real-time systems often use hardware redundancy to meet the high reliability requirements of safety-critical applications. In this paper we propose a hardware-redundancy technique with low energy overhead for hard real-time systems. The proposed technique is based on standby-sparing, where the system is composed of a primary unit and a spare. Using analytical models, we have developed an online energy-management method that uses a slack reclamation scheme to reduce the energy consumption of both the primary and spare units. In this method, dynamic voltage scaling (DVS) is used for the primary unit and dynamic power management (DPM) is used for the spare. We conducted several experiments to compare the proposed system with a fault-tolerant real-time system that uses time redundancy for fault tolerance and DVS with slack reclamation for low energy consumption. The results show that, for relaxed time constraints, the proposed system provides up to 24% energy saving compared with the time-redundancy system. For tight deadlines, when the time-redundancy system can tolerate no faults, the proposed system preserves its fault tolerance, but at the cost of about 32% more energy consumption.
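To make the idea of slack reclamation with DVS more concrete, the following is a minimal sketch, not the method developed in the paper. It assumes a single primary unit running a fixed sequence of tasks against one common deadline, a normalized frequency between a hypothetical `f_min` and 1.0, supply voltage proportional to frequency (so dynamic energy scales as cycles times frequency squared), and no modeling of the spare unit or of faults. The function name `reclaim_slack` and all of its parameters are illustrative assumptions.

```python
def reclaim_slack(wcets, actuals, deadline, f_min=0.25):
    """Greedy slack reclamation on one unit (illustrative model only).

    wcets    -- worst-case execution times at full speed (f = 1.0)
    actuals  -- actual execution times at full speed (each <= its WCET)
    deadline -- common deadline for the whole task sequence
    f_min    -- assumed lowest normalized frequency the unit supports
    Returns (chosen frequencies, normalized energy, finish time).
    """
    now = 0.0
    energy = 0.0
    freqs = []
    for i, (wcet, actual) in enumerate(zip(wcets, actuals)):
        # Reserve full-speed worst-case time for the tasks still to come;
        # everything else left before the deadline is this task's budget.
        remaining_wcet = sum(wcets[i + 1:])
        budget = deadline - now - remaining_wcet
        # Slowest frequency that still fits the worst case into the budget.
        f = max(f_min, min(1.0, wcet / budget))
        freqs.append(f)
        # Early completion (actual < wcet) leaves slack for later tasks.
        now += actual / f
        # Assumed energy model: E ~ cycles * V^2 with V proportional to f.
        energy += actual * f ** 2
    return freqs, energy, now


if __name__ == "__main__":
    freqs, energy, finish = reclaim_slack([2, 2], [1, 2], deadline=8)
    print(freqs, energy, finish)
```

In this toy model the first task finishes earlier than its worst case, so the second task inherits the unused time and can run at a lower frequency than it could have otherwise, while the sequence still meets the deadline.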