Automated Comparison of State-Based Software Models in Terms of Their Language and Structure

Authors:
Neil Walkinshaw, The University of Leicester
Kirill Bogdanov, The University of Sheffield
ACM Transactions on Software Engineering and Methodology, Volume 22, Issue 2, Article No. 13, pp. 1–37. https://doi.org/10.1145/2430545.2430549

Published: 01 March 2013

Abstract

State machines capture the sequential behavior of software systems. Their intuitive visual notation, along with a range of powerful verification and testing techniques, renders them an important part of the model-driven software engineering process. There are several situations that require the ability to identify and quantify the differences between two state machines (e.g., the accuracy of state machine inference techniques is measured by the similarity of a reverse-engineered model to its reference model). State machines can be compared from two complementary perspectives: (1) in terms of their language -- the externally observable sequences of events that are permitted or not, and (2) in terms of their structure -- the actual states and transitions that govern the behavior. This article describes two techniques to compare models from these two perspectives. It shows how the difference can be quantified by adapting existing binary classification performance measures for the purpose. The approaches have been implemented by the authors, and the implementation is openly available. Feasibility is demonstrated via a case study that compares two real state machine inference approaches. Scalability and accuracy are assessed experimentally with respect to a large collection of randomly synthesized models.

Index Terms

1. Software and its engineering
  1. Software notations and tools
    1. System description languages
      1. Specification languages
  2. Software organization and properties
    1. Software system structures
      1. Software system models
        State systems

Reviews

Reviewer: Edel M Sherratt

State machine models provide an effective way to compare different versions of a software system, or different systems that solve the same problem. They facilitate software development, testing, and maintenance. Two complementary techniques for comparing state machines are presented in this paper. The first identifies similarities and differences between state machines in terms of the possible sequences of events they describe, that is, the languages generated by the state machines. The second compares the states and transition structures of two state machines. Both start by reducing the state machines to a set of states and labeled transitions, called labeled transition systems (LTSs). The techniques are illustrated using different state machine models of a simple text editor. The authors conducted experiments to assess the scalability of the techniques. These indicate that the cost of language-based comparison grows more rapidly with increasing state machine complexity than does that of structural comparison. These approaches to comparing state machines represent a promising basis for new software engineering tools. It would also be interesting to see a theoretical analysis of the computational complexity of these techniques.

Online Computing Reviews Service
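To make the language perspective concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a deterministic LTS stored as a dictionary, treats every state as accepting, and enumerates all traces up to a fixed length. The two text-editor models, the state names, and the function names are hypothetical and chosen only for illustration.

```python
from itertools import product

# Assumption: an LTS is a dict mapping state -> {label: next_state}.
# It is deterministic and every state is accepting (prefix-closed language).

def accepts(lts, start, trace):
    """Return True if the LTS can follow every label of the trace from start."""
    state = start
    for label in trace:
        nxt = lts.get(state, {}).get(label)
        if nxt is None:
            return False
        state = nxt
    return True

def language_difference(ref, sub, start_ref, start_sub, alphabet, max_len):
    """List traces (up to max_len labels) accepted by only one of the models."""
    only_ref, only_sub = [], []
    for n in range(1, max_len + 1):
        for trace in product(alphabet, repeat=n):
            in_ref = accepts(ref, start_ref, trace)
            in_sub = accepts(sub, start_sub, trace)
            if in_ref and not in_sub:
                only_ref.append(trace)
            elif in_sub and not in_ref:
                only_sub.append(trace)
    return only_ref, only_sub

# Hypothetical reference model of a simple text editor.
REFERENCE = {
    "closed": {"open": "clean"},
    "clean": {"edit": "dirty", "close": "closed"},
    "dirty": {"edit": "dirty", "save": "clean"},
}
# Hypothetical subject model: it also allows closing with unsaved changes.
SUBJECT = {
    "closed": {"open": "clean"},
    "clean": {"edit": "dirty", "close": "closed"},
    "dirty": {"edit": "dirty", "save": "clean", "close": "closed"},
}
ALPHABET = ["open", "edit", "save", "close"]

only_ref, only_sub = language_difference(
    REFERENCE, SUBJECT, "closed", "closed", ALPHABET, 3
)
print("accepted only by reference:", only_ref)
print("accepted only by subject:", only_sub)
```

Exhaustive enumeration grows exponentially with trace length, so this is only practical for small alphabets and short traces; it is a stand-in for whatever trace-selection strategy a real tool would use.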

Reviewer: Boumediene Belkhouche

Using computer-aided software engineering (CASE) tools to compare different views of similar models offers software developers the ability to evaluate models both qualitatively and quantitatively. Related applications include model-checking, testing, and determining whether two design solutions are similar. This paper elaborates on the development of techniques and their corresponding algorithms to support the comparison and evaluation of state machine models. Given a reference model and a subject model, both expressed as labeled transition systems (LTSs), how can we measure their similarities, differences, and relative accuracy? The first technique, inspired by model-based testing, compares the languages generated by the two LTSs. Traces are generated from the LTSs by enumerating finite string combinations of the transition labels. Attributes, such as precision, recall, and classification, are expressed formally and their values are computed based on membership and nonmembership of these traces in the languages under consideration. The second technique introduces both local and global ways to compare the structures of the reference machine and the subject machine. The local comparison is limited to a given state and its adjacent neighbors. The global comparison includes a given state of interest and all the remaining states. Again, attributes for measuring global similarities and differences are expressed formally and an algorithm to compute them is presented. The results of an empirical evaluation of the two techniques show that the second technique tends to be highly accurate and more robust than the first. Such a tool would help software developers compare models generated by different teams, or assess the similarity of an existing model and a reverse-engineered model. CASE tool developers will be inspired by the process of integrating various formalisms to shed light on a different software development issue.
Basing the empirical evaluation of the techniques on real rather than synthetic data would strengthen the claim that the techniques are scalable and the results remain accurate. It would also address the issue of computational costs associated with state explosion inherent in modeling with finite state machines. In this case, a formal complexity analysis of the algorithms is required. The authors, somewhat implicitly, claim that the expressive power of LTSs enables the modeling of any software system. According to automata theory, an LTS is equivalent to a regular grammar or language, whereas software systems may require more powerful machines than finite state machines.

Online Computing Reviews Service
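The precision and recall computation described in this review can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the algorithm from the paper: the LTSs are deterministic dictionaries with all states accepting, traces are enumerated exhaustively up to a fixed length, and a score with a zero denominator is reported as 0.0 by convention. The models and all names are hypothetical.

```python
from itertools import product

def accepts(lts, start, trace):
    """Return True if the LTS (state -> {label: next_state}) follows the trace."""
    state = start
    for label in trace:
        nxt = lts.get(state, {}).get(label)
        if nxt is None:
            return False
        state = nxt
    return True

def precision_recall(ref, sub, start_ref, start_sub, alphabet, max_len):
    """Score the subject model against the reference over bounded traces.

    A trace accepted by the subject is a "positive" prediction; the
    reference model decides whether that prediction is correct.
    """
    tp = fp = fn = 0
    for n in range(1, max_len + 1):
        for trace in product(alphabet, repeat=n):
            in_ref = accepts(ref, start_ref, trace)
            in_sub = accepts(sub, start_sub, trace)
            if in_sub and in_ref:
                tp += 1          # both models permit the trace
            elif in_sub:
                fp += 1          # subject permits a trace the reference forbids
            elif in_ref:
                fn += 1          # subject forbids a trace the reference permits
    # Assumed convention: an undefined ratio is reported as 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical reference model of a simple text editor.
REFERENCE = {
    "closed": {"open": "clean"},
    "clean": {"edit": "dirty", "close": "closed"},
    "dirty": {"edit": "dirty", "save": "clean"},
}
# Hypothetical inferred model: "save" is missing, and closing a dirty
# document is wrongly permitted.
SUBJECT = {
    "closed": {"open": "clean"},
    "clean": {"edit": "dirty", "close": "closed"},
    "dirty": {"edit": "dirty", "close": "closed"},
}
ALPHABET = ["open", "edit", "save", "close"]

precision, recall = precision_recall(
    REFERENCE, SUBJECT, "closed", "closed", ALPHABET, 3
)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Low precision here signals that the subject model permits behavior the reference forbids, while low recall signals that it forbids behavior the reference permits, so reporting both gives a more informative picture than a single accuracy figure.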

Published in

ACM Transactions on Software Engineering and Methodology Volume 22, Issue 2
March 2013
190 pages
ISSN: 1049-331X
EISSN: 1557-7392
DOI: 10.1145/2430545

Copyright © 2013 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 1 March 2013
- Revised: 1 February 2012
- Accepted: 1 February 2012
- Received: 1 March 2010
Author Tags
Labeled transition systems
accuracy
comparison
Qualifiers
- research-article
- Research
- Refereed
Article Metrics
- Total Citations: 25
- Total Downloads: 676
- Downloads (Last 12 months): 33
- Downloads (Last 6 weeks): 3