
Automated Comparison of State-Based Software Models in Terms of Their Language and Structure

Published: 01 March 2013

Abstract

State machines capture the sequential behavior of software systems. Their intuitive visual notation, along with a range of powerful verification and testing techniques, renders them an important part of the model-driven software engineering process. Several situations require the ability to identify and quantify the differences between two state machines (e.g., the accuracy of a state machine inference technique is evaluated by measuring the similarity of a reverse-engineered model to its reference model). State machines can be compared from two complementary perspectives: (1) in terms of their language -- the externally observable sequences of events that are permitted or prohibited -- and (2) in terms of their structure -- the actual states and transitions that govern the behavior. This article describes two techniques to compare models from these two perspectives. It shows how the difference can be quantified and measured by adapting existing binary classification performance measures for the purpose. The approaches have been implemented by the authors, and the implementation is openly available. Feasibility is demonstrated via a case study comparing two real state machine inference approaches. Scalability and accuracy are assessed experimentally with respect to a large collection of randomly synthesized models.
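
The core quantitative idea can be sketched in a few lines: treat the subject machine as a binary classifier of traces and score it against the reference machine. The following is a hypothetical Python illustration, not the authors' openly available implementation; the tuple encoding of a DFA, the exhaustive enumeration of traces up to a fixed length, and the use of balanced classification rate as the summary measure are all assumptions made for brevity.

```python
import itertools

def accepts(dfa, trace):
    """dfa is (initial_state, transition_dict, accepting_states) -- an assumed encoding."""
    state, delta, accepting = dfa
    for label in trace:
        if (state, label) not in delta:
            return False              # undefined transition: trace rejected
        state = delta[(state, label)]
    return state in accepting

def compare_languages(reference, subject, alphabet, max_len=5):
    """Classify every trace up to max_len with both machines and tally
    a confusion matrix, treating the subject as the classifier."""
    tp = tn = fp = fn = 0
    for n in range(max_len + 1):
        for trace in itertools.product(alphabet, repeat=n):
            in_ref, in_sub = accepts(reference, trace), accepts(subject, trace)
            if in_ref and in_sub:
                tp += 1
            elif in_ref:
                fn += 1
            elif in_sub:
                fp += 1
            else:
                tn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    bcr = 0.5 * (recall + specificity)   # balanced classification rate (assumed measure)
    return precision, recall, bcr

# Hypothetical usage: a two-state reference and a slightly looser subject.
ref = ("q0", {("q0", "a"): "q1", ("q1", "b"): "q0"}, {"q0"})
sub = ("s0", {("s0", "a"): "s1", ("s1", "b"): "s0", ("s0", "b"): "s0"}, {"s0"})
print(compare_languages(ref, sub, ["a", "b"]))
```

Exhaustive enumeration grows exponentially with max_len; a practical tool would replace the nested loops with a sampled or test-generation-based trace set, in the spirit of the model-based testing perspective the article adopts.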



      Reviews

      Edel M Sherratt

      State machine models provide an effective way to compare different versions of a software system, or different systems that solve the same problem. They facilitate software development, testing, and maintenance. Two complementary techniques for comparing state machines are presented in this paper. The first identifies similarities and differences between state machines in terms of the possible sequences of events they describe, that is, the languages generated by the state machines. The second compares the states and transition structures of two state machines. Both start by reducing the state machines to a set of states and labeled transitions, called labeled transition systems (LTSs). The techniques are illustrated using different state machine models of a simple text editor. The authors conducted experiments to assess the scalability of the techniques. These indicate that the cost of language-based comparison grows more rapidly with increasing state machine complexity than that of structural comparison. These approaches to comparing state machines represent a promising basis for new software engineering tools. It would also be interesting to see a theoretical analysis of the computational complexity of these techniques.

      Online Computing Reviews Service
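
      As the review notes, both techniques operate on labeled transition systems. A minimal LTS encoding is enough to express the inputs both comparisons need; the sketch below is a hypothetical Python illustration, and both the class design and the text-editor states are illustrative assumptions, not taken verbatim from the paper's figures.

      ```python
      from collections import defaultdict

      class LTS:
          """Minimal labeled transition system: states plus labeled edges."""
          def __init__(self, initial):
              self.initial = initial
              self.transitions = defaultdict(set)   # state -> {(label, target)}

          def add(self, source, label, target):
              self.transitions[source].add((label, target))

          def outgoing_labels(self, state):
              return {label for (label, _) in self.transitions[state]}

      # Hypothetical fragment of a simple text-editor model.
      editor = LTS("closed")
      editor.add("closed", "load", "open")
      editor.add("open", "edit", "modified")
      editor.add("modified", "save", "open")
      editor.add("open", "close", "closed")
      ```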

      Boumediene Belkhouche

      Using computer-aided software engineering (CASE) tools to compare different views of similar models offers software developers the ability to evaluate models both qualitatively and quantitatively. Related applications include model checking, testing, and determining whether two design solutions are similar. This paper elaborates on the development of techniques and their corresponding algorithms to support the comparison and evaluation of state machine models. Given a reference model and a subject model, both expressed as labeled transition systems (LTSs), how can we measure their similarities, differences, and relative accuracy? The first technique, inspired by model-based testing, compares the languages generated by the two LTSs. Traces are generated from the LTSs by enumerating finite string combinations of the transition labels. Attributes such as precision, recall, and classification are expressed formally, and their values are computed based on membership and nonmembership of these traces in the languages under consideration. The second technique introduces both local and global ways to compare the structures of the reference machine and the subject machine. The local comparison is limited to a given state and its adjacent neighbors. The global comparison includes a given state of interest and all the remaining states. Again, attributes for measuring global similarities and differences are expressed formally, and an algorithm to compute them is presented. The results of an empirical evaluation of the two techniques show that the second technique tends to be highly accurate and more robust than the first. Such a tool would help software developers compare models generated by different teams, or assess the similarity of an existing model and a reverse-engineered model. CASE tool developers will be inspired by the process of integrating various formalisms to shed light on a different software development issue. Basing the empirical evaluation of the techniques on real rather than synthetic data would strengthen the claim that the techniques are scalable and that the results remain accurate. It would also address the issue of computational costs associated with the state explosion inherent in modeling with finite state machines. In this case, a formal complexity analysis of the algorithms is required. The authors, somewhat implicitly, claim that the expressive power of LTSs enables the modeling of any software system. According to automata theory, an LTS is equivalent to a regular grammar or language, whereas software systems may require more powerful machines than finite state machines.

      Online Computing Reviews Service
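
      To make the "local" structural comparison the review describes more concrete, the following hypothetical sketch (reusing the LTS class from the previous example) scores a state pair by the overlap of labels on their outgoing transitions. The Jaccard-style score is an assumed stand-in for the paper's scoring, and the global step, which propagates pair scores through both machines, is omitted entirely.

      ```python
      def local_similarity(lts_a, state_a, lts_b, state_b):
          """Jaccard-style overlap of outgoing transition labels (hypothetical)."""
          labels_a = lts_a.outgoing_labels(state_a)
          labels_b = lts_b.outgoing_labels(state_b)
          if not labels_a and not labels_b:
              return 1.0                     # two deadlock states match trivially
          return len(labels_a & labels_b) / len(labels_a | labels_b)
      ```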


      • Published in

        ACM Transactions on Software Engineering and Methodology, Volume 22, Issue 2
        March 2013, 190 pages
        ISSN: 1049-331X
        EISSN: 1557-7392
        DOI: 10.1145/2430545

        Copyright © 2013 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 1 March 2013
        • Revised: 1 February 2012
        • Accepted: 1 February 2012
        • Received: 1 March 2010
        Published in TOSEM, Volume 22, Issue 2

        Qualifiers

        • research-article
        • Research
        • Refereed
