Availability analysis of software architecture decomposition alternatives for local recovery

Sözer, Hasan; Stoelinga, Mariëlle; Boudali, Hichem; Akşit, Mehmet

doi:10.1007/s11219-016-9315-9

Published: 03 May 2016

Software Quality Journal, Volume 25, pages 553–579 (2017)

Abstract

We present an efficient and easy-to-use methodology to predict, at design time, the availability of systems that support local recovery. Our analysis techniques work at the architectural level: the software designer simply inputs the decomposition of the software into modules, annotated with failure and repair rates. From this decomposition, we automatically generate an analytical model (a continuous-time Markov chain), from which an availability measure is then computed in a fully automated way. A crucial step is the use of intermediate models in the input/output interactive Markov chain formalism, which makes our techniques efficient, mathematically rigorous, and easy to adapt. In particular, we use aggressive minimization techniques to keep the generated state spaces small. We have applied our methodology to a realistic case study, namely the MPlayer open-source software. We investigated four different decomposition alternatives and compared our analytical results with the availability measured on a running MPlayer. The predicted results closely match the measured ones.
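To make the last step of this pipeline concrete, the following is a minimal sketch of how a steady-state availability figure can be computed from a continuous-time Markov chain built out of failure and repair rates. It is not the authors' tool chain: it skips the input/output interactive Markov chain models and the minimization step entirely, and it rests on two simplifying assumptions that are ours, namely that the recoverable units fail and recover independently and that the system counts as available only while every unit is up. The function names `steady_state` and `availability` and the example rates are hypothetical.

```python
from itertools import product


def steady_state(Q):
    """Solve pi * Q = 0 with sum(pi) = 1 by Gauss-Jordan elimination."""
    n = len(Q)
    # Transpose Q so each row is one balance equation, then replace the
    # last (redundant) equation with the normalisation constraint.
    A = [[Q[j][i] for j in range(n)] for i in range(n)]
    A[-1] = [1.0] * n
    b = [0.0] * (n - 1) + [1.0]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
                b[r] -= f * b[col]
    return [b[i] / A[i][i] for i in range(n)]


def availability(rates):
    """rates: one (failure_rate, recovery_rate) pair per recoverable unit.

    Assumption: units are independent and the system is available only
    when all units are up.
    """
    states = list(product([True, False], repeat=len(rates)))  # True = up
    index = {s: i for i, s in enumerate(states)}
    Q = [[0.0] * len(states) for _ in states]
    for s in states:
        for k, (lam, mu) in enumerate(rates):
            # Flip unit k: an up unit fails at rate lam, a down unit
            # recovers at rate mu.
            t = s[:k] + (not s[k],) + s[k + 1:]
            rate = lam if s[k] else mu
            Q[index[s]][index[t]] += rate
            Q[index[s]][index[s]] -= rate
    pi = steady_state(Q)
    return sum(p for s, p in zip(states, pi) if all(s))


if __name__ == "__main__":
    # Hypothetical rates (per hour) for two recoverable units.
    print(availability([(0.01, 1.0), (0.02, 2.0)]))
```

Under these independence assumptions the result coincides with the product of the per-unit values μ/(λ+μ), which gives a quick sanity check. The state space doubles with every added unit, which hints at why the abstract stresses minimization.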

Notes

An important component used within a software architecture that supports local recovery.
Proactively restarting a software component to mitigate its aging and thus its failure.
Interaction between the RUs is redirected through Inter-Process Communication.
Modeled as part of the RM as mentioned in Sect. 4.
As described later, for each RU, two models are in fact generated.
The recovery time includes the time for restarting failed modules and also the time for error detection, error notification and diagnosis.
An exponential distribution might not be, in some cases, a realistic choice; however, it is also possible to use a phase-type distribution which approximates any distribution arbitrarily closely.
Note that these models are used for availability estimation only and do not necessarily reflect the software implementation. In an actual implementation, a module might not be aware of its own failure and so could not report it. External error detection mechanisms can be employed (Sozer et al. 2009) for this purpose.
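The note above on phase-type distributions can be illustrated with the simplest member of that family, the Erlang distribution. The sketch below is our own illustration, not taken from the article: it replaces one exponential recovery delay with k exponential stages in sequence, each with rate k/mean, so the mean stays fixed while the variance shrinks as mean²/k. With larger k the delay gets closer to a fixed recovery time, at the cost of k states per delay in the Markov model. The function names and the 2.0 time-unit mean are hypothetical.

```python
import random


def erlang_sample(k, mean, rng):
    """Draw one Erlang-k delay with the given mean.

    The delay is the sum of k independent exponential stages, each with
    rate k / mean, so the overall mean does not depend on k.
    """
    stage_rate = k / mean
    return sum(rng.expovariate(stage_rate) for _ in range(k))


def erlang_variance(k, mean):
    """Variance of an Erlang-k delay with the given mean: mean^2 / k."""
    return mean * mean / k


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for k in (1, 4, 16):    # k = 1 is the plain exponential case
        samples = [erlang_sample(k, 2.0, rng) for _ in range(20000)]
        empirical_mean = sum(samples) / len(samples)
        print(k, round(empirical_mean, 2), erlang_variance(k, 2.0))
```

General phase-type distributions allow branching and different stage rates as well; the Erlang case only shows the basic trade-off between fidelity and state-space size.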

References

Alvarez, G., & Cristian, F. (1997). Centralized failure injection for distributed, fault-tolerant protocol testing. In Proceedings of the 17th international conference on distributed computing systems, pp. 78–85.
Avizienis, A., Laprie, J. C., Randell, B., & Landwehr, C. (2004). Basic concepts and taxonomy of dependable and secure computing. IEEE Transactions on Dependable and Secure Computing, 1(1), 11–33.
Bernardi, S., Merseguer, J., & Petriu, D. (2011). A dependability profile within MARTE. Software and Systems Modeling, 10(3), 313–336.
Bernardi, S., Merseguer, J., & Petriu, D. C. (2012). Dependability modeling and analysis of software systems specified with UML. ACM Computing Surveys, 45(1), 1–48.
Boudali, H., Crouzen, P., & Stoelinga, M. (2007a). A compositional semantics for dynamic fault trees in terms of Interactive Markov Chains. In Proceedings of the 5th international symposium on automated technology for verification and analysis, lecture notes on computer science (LNCS), pp. 441–456.
Boudali, H., Crouzen, P., & Stoelinga, M. (2007b). Dynamic fault tree analysis using input/output interactive Markov chains. In Proceedings of the 37th annual IEEE/IFIP international conference on dependable systems and networks (DSN), pp. 708–717.
Boudali, H., Crouzen, P., Haverkort, B. R., Kuntz, M., & Stoelinga, M. (2008). Architectural dependability evaluation with arcade. In Proceedings of the 38th IEEE/IFIP international conference on dependable systems and networks (DSN), IEEE, pp. 512–521.
Bowles, J., Dobbins, J., & Gregory, J. (2004). Approximate reliability and availability models for high availability and fault-tolerant systems with repair. Quality and Reliability Engineering International, 20(7), 679–697.
Bozzano, M., Cimatti, A., Katoen, J. P., Nguyen, V. Y., Noll, T., & Roveri, M. (2011). Safety, dependability and performance analysis of extended AADL models. The Computer Journal, 54(5), 754–775.
Brosch, F., Koziolek, H., Buhnova, B., & Reussner, R. (2012). Architecture-based reliability prediction with the palladio component model. IEEE Transactions on Software Engineering, 38(6), 1319–1339.
Candea, G., Cutler, J., & Fox, A. (2004a). Improving availability with recursive micro-reboots: A soft-state system case study. Performance Evaluation, 56(1–4), 213–248.
Candea, G., Kawamoto, S., Fujiki, Y., Friedman, G., & Fox, A. (2004b). Microreboot: A technique for cheap recovery. In Proceedings of the 6th symposium on operating systems design and implementation (OSDI), San Francisco, CA, pp. 31–44.
Clements, P., Bachmann, F., Bass, L., Garlan, D., Ivers, J., Little, R., et al. (2010). Documenting software architectures: Views and beyond (2nd ed.). Reading, MA: Addison-Wesley.
Das, O., & Woodside, C. (1998). The fault-tolerant layered queueing network model for performability of distributed systems. In Proceedings of the international performance and dependability symposium (IPDS) (pp. 132–141). Durham, NC.
Dashofy, E., van der Hoek, A., & Taylor, R. (2002). An infrastructure for the rapid development of XML-based architecture description languages. In International conference on software engineering (ICSE) (pp. 266–276). Orlando, Florida: ACM.
Devroye, L. (1986). Non-uniform random variate generation. Berlin: Springer.
Dugan, J., & Lyu, M. (1995). Dependability modeling for fault-tolerant software and systems. In M. R. Lyu (Ed.), Software fault tolerance, chapter 5 (pp. 109–138). London: Wiley.
Durães, J., & Madeira, H. (2006). Emulation of software faults: A field data study and a practical approach. IEEE Transactions on Software Engineering, 32(11), 849–867.
ECLIPSE (2015) Eclipse foundation. http://www.eclipse.org/
Franco, J., Barbosa, R., & Zenha-Rela, M. (2012). Automated reliability prediction from formal architectural descriptions. In Proceedings of joint working IEEE/IFIP conference on software architecture (WICSA) and European conference on software architecture (ECSA), pp. 302–309.
Franco, J., Barbosa, R., & Zenha-Rela, M. (2014). Availability evaluation of software architectures through formal methods. In Proceedings of the 9th international conference on the quality of information and communications technology (QUATIC), pp. 282–287.
Garavel, H., & Lang, F. (2001). SVL: A scripting language for compositional verification. In Proceedings of the international conference on formal techniques for networked and distributed systems (FORTE), pp. 377–394.
Garavel, H., Lang, F., Mateescu, R., & Serwe, W. (2007). CADP 2006: A toolbox for the construction and analysis of distributed processes. In Computer-aided verification (CAV), Springer, Lecture Notes on Computer Science (LNCS), vol. 4590, pp. 158–163.
Garlan, D., Monroe, R., & Wile, D. (1997). Acme: An architecture description interchange language. In Proceedings of conference of the centre for advanced studies on collaborative research (CASCON), pp. 169–183.
Garland, S., Lynch, N., Tauber, J., & Vaziri, M. (2004). IOA user guide and reference manual. Tech. rep., MIT CSAI Laboratory, Cambridge, MA.
Geist, R., & Trivedi, K. (1990). Reliability estimation of fault-tolerant systems: Tools and techniques. IEEE Computer, 23(7), 52–61.
Hermanns, H. (2002). Interactive Markov Chains: The quest for quantified quality, lecture notes on computer science (LNCS), vol. 2428.
Hermanns, H., & Katoen, J. P. (2000). Automated compositional Markov chain generation for a plain-old telephone system. Science of Computer Programming, 36(1), 97–127.
Hunt, G. C., et al. (2007). Sealing OS processes to improve dependability and safety. ACM SIGOPS Operating Systems Review, 41(3), 341–354.
Immonen, A., & Niemelä, E. (2008). Survey of reliability and availability prediction methods from the viewpoint of software architecture. Software and Systems Modeling, 7(1), 49–65.
Joyce, J. (2007). Architecting dependable systems with the sae architecture analysis and description language (AADL). In R. de Lemos, C. Gacek, & A. Romanovsky (Eds.), Architecting dependable systems, IV, lecture notes in computer science, vol. 4615 (pp. 1–13). Berlin: Springer.
Kuntz, M., & Haverkort, B. R. (2008). Formal dependability engineering with MIOA. Technical Report TR-CTIT-08-39.
Lai, C. D., et al. (2002). A model for availability analysis of distributed software/hardware systems. Information and Software Technology, 44(6), 343–350.
Lynch, N., & Tuttle, M. (1989). An introduction to input/output automata. CWI Quarterly, 2(3), 219–246.
Maier, M., Emery, D., & Hilliard, R. (2001). Software architecture: Introducing IEEE standard 1471. IEEE Computer, 34(4), 107–109.
Majzik, I., & Huszerl, G. (2002). Towards dependability modeling of FT-CORBA architectures. In Proceedings of the 4th European dependable computing conference, lecture notes on computer science (LNCS), pp. 121–139.
Monnet, S., & Bertier, M. (2007). Using failure injection mechanisms to experiment and evaluate a grid failure detector. In M. Dayde, J. Palma, A. Coutinho, E. Pacitti, & J. Lopes (Eds.), High performance computing for computational science, vol. 4395 (pp. 610–621). Berlin, Heidelberg: Springer.
MPLAYER (2015) MPlayer official website. http://www.mplayerhq.hu/
Rugina, A. E., Kanoun, K., & Kaaniche, M. (2007). A system dependability modeling framework using AADL and GSPNs. In R. de Lemos, C. Gacek, & A. Romanovsky (Eds.), Architecting dependable systems, IV, lecture notes in computer science, vol. 4615 (pp. 14–38). Berlin: Springer.
Sozer, H. (2009). Architecting fault-tolerant software systems. Ph.D. thesis, University of Twente, Enschede, The Netherlands.
Sozer, H., & Tekinerdogan, B. (2008). Introducing recovery style for modeling and analyzing system recovery. In Proceedings of the 7th working IEEE/IFIP conference on software architecture (WICSA), (pp. 167–176). Canada: Vancouver.
Sozer, H., Tekinerdogan, B., & Aksit, M. (2009). FLORA: A framework for decomposing software architecture to introduce local recovery. Software: Practice and Experience, 39(10), 869–889.
Sozer, H., Tekinerdogan, B., & Aksit, M. (2013). Optimizing decomposition of software architecture for local recovery. Software Quality Journal, 21(2), 203–240.
Vaidyanathan, K., & Trivedi, K. S. (2005). A comprehensive model for software rejuvenation. IEEE Transactions on Dependable and Secure Computing, 2(2), 124–137.

Acknowledgments

We thank Pepijn Crouzen for his help with CADP and Boudewijn Haverkort for his comments on an earlier version of this paper. This work has been carried out as part of the TRADER project under the responsibility of the Embedded Systems Institute. This work is partially supported by the Dutch Ministry of Economic Affairs under the BSIK program; by the Netherlands Organization for Scientific Research (NWO) under FOCUS/BRICKS Grant Number 642.000.505 (MOQS); and by the EU under Grant Numbers IST-004527 (ARTIST2) and FP7-ICT-2007-1 (QUASIMODO).

Author information

Authors and Affiliations

School of Engineering, Ozyegin University, Nişantepe Mah. Orman Sk. No: 34-36, Alemdağ – Çekmeköy, 34794, Istanbul, Turkey
Hasan Sözer
Formal Methods and Tools Group, Department of Computer Science, University of Twente, Enschede, The Netherlands
Mariëlle Stoelinga
European Space Research and Technology Centre, European Space Agency, Noordwijk, The Netherlands
Hichem Boudali
Software Engineering Group, Department of Computer Science, University of Twente, Enschede, The Netherlands
Mehmet Akşit

Corresponding author

Correspondence to Hasan Sözer.

About this article

Cite this article

Sözer, H., Stoelinga, M., Boudali, H. et al. Availability analysis of software architecture decomposition alternatives for local recovery. Software Qual J 25, 553–579 (2017). https://doi.org/10.1007/s11219-016-9315-9

Issue Date: June 2017