Selective SWIFT-R

A Flexible Software-Based Technique for Soft Error Mitigation in Low-Cost Embedded Systems

Journal of Electronic Testing

Abstract

Commercial off-the-shelf microprocessors are the core of low-cost embedded systems because of their programmability and cost-effectiveness. Recent advances in electronic technologies have brought remarkable improvements in their performance, but they have also made microprocessors more susceptible to transient faults induced by radiation. These non-destructive events (soft errors) may cause a microprocessor to produce a wrong computation result or to lose control of a system, with catastrophic consequences. Soft error mitigation has therefore become a compulsory requirement for an increasing number of applications, operating anywhere from space to ground level. In this context, this paper uses the concept of selective hardening, which aims to design flexible mitigation techniques with reduced overhead. Following this concept, a novel flexible version of the software-based fault recovery technique known as SWIFT-R is proposed. Our approach makes it possible to select different subsets of registers from the microprocessor register file to be protected in software. The design space is thus enriched with a wide spectrum of new, partially protected versions, which offer more flexibility to designers and make it possible to find the best trade-offs between performance, code size, and fault coverage. Three case studies have been developed to show the applicability and flexibility of the proposal.
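The idea of protecting only a chosen subset of registers can be illustrated with a small toy model. The sketch below is not the authors' implementation, which the abstract describes only at a high level; it assumes that protected registers are kept in three copies and repaired by a bitwise majority vote when read, in the spirit of SWIFT-R. The class `SelectiveRegisterFile` and its methods `write`, `flip_bit`, and `read` are hypothetical names introduced for this example.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise majority vote over three integer values."""
    return (a & b) | (a & c) | (b & c)


class SelectiveRegisterFile:
    """Toy register file where only a chosen subset is triplicated.

    Hypothetical model for illustration: protected registers hold three
    copies, unprotected registers hold a single copy.
    """

    def __init__(self, num_regs: int, protected: set[int]) -> None:
        self.protected = frozenset(protected)
        self.copies = {
            r: [0, 0, 0] if r in self.protected else [0]
            for r in range(num_regs)
        }

    def write(self, reg: int, value: int) -> None:
        # Every copy of the register receives the same value.
        self.copies[reg] = [value] * len(self.copies[reg])

    def flip_bit(self, reg: int, bit: int, copy: int = 0) -> None:
        # Simulate a soft error: flip one bit in one copy.
        self.copies[reg][copy] ^= 1 << bit

    def read(self, reg: int) -> int:
        if reg in self.protected:
            # Vote, then repair all copies with the voted value.
            voted = majority(*self.copies[reg])
            self.copies[reg] = [voted] * 3
            return voted
        # Unprotected registers return whatever they hold.
        return self.copies[reg][0]


# Protect registers 0 and 1 only; registers 2 and 3 stay unprotected.
rf = SelectiveRegisterFile(num_regs=4, protected={0, 1})
rf.write(0, 0b1010)
rf.write(2, 0b1010)
rf.flip_bit(0, bit=0)  # fault in a protected register
rf.flip_bit(2, bit=0)  # same fault in an unprotected register
print(rf.read(0))  # 10: the vote masks the fault
print(rf.read(2))  # 11: the fault propagates
```

In this model, enlarging the `protected` set raises fault coverage at the cost of three times the storage and extra voting work per protected register, which mirrors the trade-off between performance, code size, and fault coverage mentioned in the abstract.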

Author information

Corresponding author

Correspondence to Felipe Restrepo-Calle.

Additional information

This work was funded by the Ministry of Science and Innovation in Spain with the project ‘RENASER+: Integral Analysis of Digital Circuits and Systems for Aerospace Applications’ (TEC2010-22095-C03-01).

About this article

Cite this article

Restrepo-Calle, F., Martínez-Álvarez, A., Cuenca-Asensi, S. et al. Selective SWIFT-R. J Electron Test 29, 825–838 (2013). https://doi.org/10.1007/s10836-013-5416-6

  • DOI: https://doi.org/10.1007/s10836-013-5416-6
