A systematic review on search based mutation testing

https://doi.org/10.1016/j.infsof.2016.01.017

Abstract

Context

Search Based Software Testing (SBST) refers to the use of meta-heuristics to optimize a task in the context of software testing. Meta-heuristics can solve complex problems in which an optimum solution must be found among a large number of possibilities. Their use in testing activities is promising because of the high number of inputs that should be tested. Previous studies on SBST have focused on the application of meta-heuristics to the optimization of structural and functional criteria. Recently, some researchers have proposed the use of SBST for mutation testing and explored solutions to the cost of applying this testing criterion.

Objective

The objective is to identify how SBST has been explored in the context of mutation testing, how fitness functions are defined and the challenges and research opportunities in the application of meta-heuristic search techniques.

Method

A systematic review involving 263 papers published between 1996 and 2014 examined the studies on the use of meta-heuristic search techniques for the optimization of mutation testing.

Results

The results show that meta-heuristic search techniques have been applied to the optimization of test data generation, mutant generation and the selection of effective mutation operators. Five meta-heuristic techniques, namely Genetic Algorithm, Ant Colony, Bacteriological Algorithm, Hill Climbing and Simulated Annealing, have been used in search based mutation testing. The review also addressed the different fitness functions used to guide the search.

Conclusion

Search based mutation testing is a field of interest; however, some issues remain unexplored. For instance, the use of meta-heuristics for the selection of effective mutation operators was identified in only one study. The results point to a range of possibilities for new studies, e.g., identification of equivalent mutants, experimental studies and application to different domains, such as concurrent programs.

Introduction

Glover [1] was the first author to introduce the term meta-heuristic (also called generic heuristic), defining it as a set of search algorithms that can be applied to different problems. The search process is inspired by different contexts; for example, the Simulated Annealing technique [2] uses concepts from metallurgy, and the Genetic Algorithm [3] uses fundamentals of biology and genetics (e.g. crossover, mutation and evolution). The use of meta-heuristics can be justified when the internal complexity of a problem hampers the use of an exhaustive technique due to the large number of possible solutions [4].

Glover and Kochenberger [5] described the general execution process of a meta-heuristic, in which the search for solutions is guided by a fitness function. A fitness function is a mathematical function that assigns a value to each solution in the search space. When an exhaustive technique is applied, all solutions are visited and the best one is returned. When a meta-heuristic is applied, in contrast, only some solutions are visited during the search for the optimal solution. Each meta-heuristic follows its own search process, designed to choose which solutions to visit so that good solutions can be found. Since only a few solutions are visited, there is no guarantee that the best solution of the search space will be returned.
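The contrast between an exhaustive technique and a meta-heuristic can be illustrated with a minimal Python sketch. It is not taken from the reviewed studies: the toy fitness function, the integer search space and all function names are assumptions made for illustration, and Hill Climbing stands in for meta-heuristics in general.

```python
def fitness(x: int) -> int:
    """Toy fitness (assumed): local peak at x=200 (value 50), global peak at x=800 (value 100)."""
    return max(50 - abs(x - 200), 100 - abs(x - 800))


def exhaustive_search(space):
    """Score every candidate in the space and return the best one."""
    return max(space, key=fitness)


def hill_climbing(space, start):
    """Move to the better neighbour until no neighbour improves the fitness."""
    current = start
    evaluations = 1  # the start point is scored once
    while True:
        neighbours = [n for n in (current - 1, current + 1) if n in space]
        evaluations += len(neighbours)
        best = max(neighbours, key=fitness)
        if fitness(best) <= fitness(current):
            return current, evaluations  # no improving neighbour: stop here
        current = best


search_space = range(0, 1000)
optimum = exhaustive_search(search_space)             # visits all 1000 solutions
lucky, cost_lucky = hill_climbing(search_space, 700)  # starts on the slope of the global peak
stuck, cost_stuck = hill_climbing(search_space, 100)  # starts on the slope of the local peak
print(optimum, lucky, stuck)
```

In this sketch both hill-climbing runs score far fewer solutions than the exhaustive search, but the run started at 100 stops at the local peak, which is the lack of guarantee described above.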

Studies in the area of Search Based Software Engineering (SBSE) have shown that the application of meta-heuristics is promising in the software testing context, known as Search Based Software Testing (SBST). Harman et al. [6] provided data on the increase in the number of publications on SBST. According to the authors, if the trend continues, over 1700 SBST papers will have been published before the end of this decade. Some issues addressed by SBST are test data generation [7], [8], [9], [10], [11], test case selection [12], [13], [14], [15], test case prioritization [14], [16], [17], [18], [19], functional testing [20] and non-functional testing [21], [22], [23].

Mutation testing is the most widely known criterion of the fault injection testing technique [24]. It evaluates and improves the quality of a test case set. To this end, small syntactic modifications are inserted into the program under test, and each modification yields a modified version of the program, called a mutant. Modifications represent possible mistakes programmers may make while coding a program. The criterion aims at supporting the generation of a test case set that indicates the mistakes inserted in the mutants are not present in the original program, which enhances the reliability of the program. To apply the criterion, the original program is first executed with the initial test case set. Mutants are then generated and executed with the same test set. Those that behave differently from the original program are considered dead and are no longer used in the test. The set of alive mutants is analyzed and equivalent mutants are identified. A mutant is considered equivalent when, for all test cases, it displays exactly the same behavior as the program under test. Finally, new test cases are created to kill the remaining alive mutants. Despite the benefits of mutation testing in terms of effectiveness, it raises some problems, such as the high number of mutants generated, the computational cost of executing them and the high effort necessary to identify equivalent mutants [25].
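The process described above can be sketched in a few lines of Python. The program under test, the hand-written mutants and the helper names are hypothetical and chosen only for illustration; the score is computed with the commonly used definition of mutation score (killed mutants divided by non-equivalent mutants), which is an assumption here rather than a formula quoted from this paper.

```python
def original(a: int, b: int) -> int:
    """Program under test (assumed example): the larger of two integers."""
    return a if a > b else b


# Hand-written mutants, each with one small syntactic change.
MUTANTS = {
    "lt": lambda a, b: a if a < b else b,       # '>' replaced by '<'
    "ge": lambda a, b: a if a >= b else b,      # '>' replaced by '>='
    "else_a": lambda a, b: a if a > b else a,   # 'else b' replaced by 'else a'
    "b_plus_1": lambda a, b: a if a > b + 1 else b,  # 'b' replaced by 'b + 1'
}

# "ge" is equivalent: it differs only when a == b, where both branches return the same value.
EQUIVALENT = {"ge"}


def killed_by(tests):
    """Names of mutants whose output differs from the original on at least one test."""
    return {
        name
        for name, mutant in MUTANTS.items()
        if any(mutant(a, b) != original(a, b) for a, b in tests)
    }


def mutation_score(tests):
    """Killed mutants divided by the number of non-equivalent mutants."""
    return len(killed_by(tests)) / (len(MUTANTS) - len(EQUIVALENT))


initial_tests = [(3, 1)]
improved_tests = initial_tests + [(1, 3), (2, 1)]  # new tests aimed at the alive mutants
print(sorted(killed_by(initial_tests)), mutation_score(initial_tests))
print(sorted(killed_by(improved_tests)), mutation_score(improved_tests))
```

The mutant named "b_plus_1" is killed only by inputs with a == b + 1, which hints at why generating test data that kills the remaining mutants is treated as a search problem.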

This manuscript addresses applications of meta-heuristic search techniques to mutation testing. The focus is on the identification of potential characteristics of mutation testing to which search based techniques can be applied. Each technique and the fitness functions used are detailed and possible limitations are highlighted.

The remainder of the paper is organized as follows: Section 2 provides an overview of related works in the context of mutation testing and SBST; Section 3 addresses the plan for the systematic review, including its objective and the study selection criteria; Section 4 describes the review; Section 5 discusses the results, synthesis of findings and threats to validity; the conclusions are summarized in Section 6.

Section snippets

Related works

The concept of Search-Based Software Testing is related to the use of a meta-heuristic optimizing search technique, such as Genetic Algorithm, for the automation or partial automation of a testing task [26]. The aim is to obtain good solutions to a problem in an acceptable time in comparison with, for instance, a random search. Some techniques have attempted to solve problems of the mutation testing criterion. In their application, the major computational cost is related to the

Review plan

SBST techniques employ meta-heuristics to automate software testing activities. Therefore, studies that apply SBST to the context of mutation testing must be identified.

A systematic literature review is a method for synthesizing the best-quality scientific studies on a specific topic or research question. In contrast to an ad hoc literature selection, a systematic review is a methodologically rigorous review of research results. The aim is the aggregation of all evidence

Conduction of the review

After the selection of the databases, the search string was applied and 260 papers were returned. Two papers, [58] and [59], were not indexed in any database, and one paper, [60], was not returned in the results. Because of their importance, we included them; therefore, 263 papers were considered in the review. The number of papers returned by each database is shown in Table 3.

The review was divided into five phases (Fig. 2) for the selection of papers to be completely read. The first phase (search in

Analysis of findings

The results were analyzed after the selection of studies. Fig. 3 shows the number of studies included and excluded from each database in the first phase.

Fig. 4 displays the data for Research Question 2 (How many papers were published per year?), showing the number of papers published from 1998 to 2014. We can observe growth in the number of papers published in the search based mutation testing area, especially in the years 2012 (7 papers published), 2013 (10 papers published) and 2014 (9 papers published).

Results and synthesis of findings

This section provides a synthesis of the studies. The objective is to identify the meta-heuristics applied to mutation testing. The first research question concerns the stage of the mutation testing process to which SBST is applied. After analyzing all 69 primary studies, we observed that meta-heuristics were applied to optimize the selection of mutation operators, test data generation, mutant generation and the simultaneous generation of mutants and test data. One study applied

Discussion

The body of evidence shows that the use of meta-heuristics for mutation testing is mostly related to test data generation (present in 49 studies). Regarding the fitness function, 21 studies used the mutation score to calculate it, 9 studies used reachability, necessity and sufficiency costs, and 19 studies used other fitness functions. The techniques found can be classified into three distinct groups based on how the data are generated (Fig. 9). Techniques in the first group use

Threats to validity

A fundamental question concerning the results of an experiment is their validity; therefore, the systematic review, as a discipline of Experimental Software Engineering, has been validated. Validation ensures the results found can be considered valid for the population of interest (in this case, the academic and scientific community) [120]. Wohlin et al. [120] defined four types of threats to the validity of empirical studies: conclusion, internal, construct and external threats.

The conclusion

Conclusions

This paper has presented a systematic review aimed at finding evidence on the use of meta-heuristic search techniques for mutation testing. Sixty-nine primary studies demonstrated the use of meta-heuristics for mutation operator selection (1 paper), test data generation (49 papers), mutant generation (15 papers) and simultaneous mutant and test data generation (4 papers). The main search techniques investigated for mutation testing are Genetic Algorithm, NSGA-II and Hill Climbing.
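As an illustration of the most frequent combination reported in this review (test data generation guided by the mutation score), the following Python sketch evolves a small test suite with a simple Genetic Algorithm. It is a generic sketch under stated assumptions, not the method of any primary study: the program under test, the mutants, the parameter values and the function names are all hypothetical.

```python
import random


def original(a, b):
    """Program under test (assumed example): the larger of two integers."""
    return a if a > b else b


# Hand-written, non-equivalent mutants (assumed for illustration).
MUTANTS = [
    lambda a, b: a if a < b else b,      # '>' replaced by '<'
    lambda a, b: a if a > b else a,      # 'else b' replaced by 'else a'
    lambda a, b: a if a > b + 1 else b,  # 'b' replaced by 'b + 1'
]


def fitness(suite):
    """Mutation score of a test suite: fraction of mutants killed by at least one test."""
    killed = sum(any(m(a, b) != original(a, b) for a, b in suite) for m in MUTANTS)
    return killed / len(MUTANTS)


def random_test(rng):
    """One test input: a pair of integers in [-10, 10]."""
    return (rng.randint(-10, 10), rng.randint(-10, 10))


def evolve(seed=0, pop_size=20, suite_size=3, generations=30):
    rng = random.Random(seed)
    population = [[random_test(rng) for _ in range(suite_size)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        next_gen = [population[0]]  # elitism: the best suite always survives
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(population, 3), key=fitness)
            p2 = max(rng.sample(population, 3), key=fitness)
            # One-point crossover over the list of test inputs.
            cut = rng.randint(1, suite_size - 1)
            child = p1[:cut] + p2[cut:]
            # Mutation: occasionally replace one test input with a random one.
            if rng.random() < 0.3:
                child[rng.randrange(suite_size)] = random_test(rng)
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    return best, history + [fitness(best)]


best_suite, history = evolve()
print(best_suite, history[0], history[-1])
```

Because of elitism, the best mutation score recorded in `history` never decreases from one generation to the next; as with any meta-heuristic, reaching a score of 1.0 is not guaranteed.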

Based

Acknowledgments

The authors acknowledge CAPES, Brazilian funding agency (grant no. DS-7252237/D) for the financial support.

References (120)

  • E.-G. Talbi, Metaheuristics: From Design to Implementation (2009)
  • F. Glover et al., Handbook of Metaheuristics (2003)
  • M. Harman et al., Achievements, open problems and challenges for search based software testing, Proceedings of the 8th IEEE International Conference on Software Testing, Verification and Validation (2015)
  • B. Korel, Automated software test data generation, IEEE Trans. Softw. Eng. (1990)
  • N. Tracey et al., An automated framework for structural test-data generation, Proceedings of the 13th IEEE international conference on Automated software engineering (1998)
  • C.C. Michael et al., Generating software test data by evolution, IEEE Trans. Softw. Eng. (2001)
  • I. Hermadi, Genetic algorithm based test data generator, Proceedings of the 2003 Congress on Evolutionary Computation, CEC 2003 (2003)
  • S. Khor et al., Using a genetic algorithm and formal concept analysis to generate branch coverage test data automatically, Proceedings of the 19th IEEE International Conference on Automated Software Engineering (2004)
  • G. Rothermel et al., Empirical studies of a safe regression test selection technique, IEEE Trans. Softw. Eng. (1998)
  • N. Mansour et al., Empirical comparison of regression test selection algorithms, J. Syst. Softw. (2001)
  • C.L.B. Maia et al., A multi-objective approach for the regression test case selection problem, Proceedings of Anais do XLI Simpósio Brasileiro de Pesquisa Operacional (2009)
  • G. Rothermel et al., Prioritizing test cases for regression testing, IEEE Trans. Softw. Eng. (2001)
  • K.R. Walcott et al., Timeaware test suite prioritization, Proceedings of the 2006 International Symposium on Software Testing and Analysis (2006)
  • Z. Li et al., Search algorithms for regression test case prioritization, IEEE Trans. Softw. Eng. (2007)
  • C.L.B. Maia et al., Automated test case prioritization with reactive GRASP, Adv. Soft. Eng. (2010)
  • M.B. Cohen et al., Constructing test suites for interaction testing, Proceedings of the 25th International Conference on Software Engineering (2003)
  • J. Wegener et al., Testing real-time systems using genetic algorithms, Softw. Qual. Control (1997)
  • J. Wegener et al., Verifying timing constraints of real-time systems by means of evolutionary testing, Real-Time Syst. (1998)
  • L.C. Briand et al., Performance Stress Testing of Real-Time Systems Using Genetic Algorithms, Technical Report (2004)
  • M. Pezzè et al., Teste e Análise de Software: Processos, Princípios e Técnicas (2008)
  • M.E. Delamaro et al., Conceitos básicos
  • P. McMinn, Search-based software testing: past, present and future, Proceedings of the International Workshop on Search-Based Software Testing (SBST 2011) (2011)
  • A.J. Offutt et al., Mutation Operators for Ada, Technical Report (1996)
  • A.J. Offutt et al., Mutation 2000: uniting the orthogonal, Mutation Testing for the New Century (2001)
  • A.T. Acree, On mutation (1980)
  • A.P. Mathur, Performance, effectiveness, and reliability issues in software testing, Proceedings of the Fifteenth Annual International Computer Software and Applications Conference, 1991. COMPSAC ’91 (1991)
  • A.J. Offutt et al., An experimental evaluation of selective mutation, Proceedings of the 15th International Conference on Software Engineering (1993)
  • E.S. Mresa et al., Efficiency of mutation operators and selective mutation strategies: an empirical study, Softw. Test. Verif. Reliab. (1999)
  • A.J. Offutt et al., An experimental determination of sufficient mutant operators, ACM Trans. Softw. Eng. Methodol. (1996)
  • E.F. Barbosa et al., Uma contribuição para a determinação de um conjunto essencial de operadores de mutação no teste de programas C, XII Simpósio Brasileiro de Engenharia de Software (1998)
  • S. Hussain, Mutation Clustering (2008)
  • Y. Jia et al., Constructing subtle faults using higher order mutation testing, Proceedings of 2008 Eighth IEEE International Working Conference on Source Code Analysis and Manipulation (2008)
  • R.H. Untch, On reduced neighborhood mutation analysis using a single mutagenic operator, Proceedings of the 47th Annual Southeast Regional Conference, 2009, Clemson, South Carolina, USA (2009)
  • P. Ammann et al., Establishing theoretical minimal sets of mutants, Proceedings of 2014 IEEE Seventh International Conference on Software Testing, Verification and Validation (ICST) (2014)
  • P. McMinn, Search-based software test data generation: a survey, Softw. Test. Verif. Reliab. (2004)
  • A. Arcuri et al., On the effectiveness of whole test suite generation, Proceedings of the Sixth International Conference on Search Based Software Engineering (2014)
  • M. Höschele et al., Test generation across multiple layers, Proceedings of the 7th International Workshop on Search-Based Software Testing (2014)
  • T. Vos et al., Evolutionary functional black-box testing in an industrial setting, Softw. Qual. J. (2013)
  • G. Gay et al., Moving the goalposts: coverage satisfaction is not enough, Proceedings of the 7th International Workshop on Search-Based Software Testing (2014)
  • R.E. Lopez-Herrejon et al., A parallel evolutionary algorithm for prioritized pairwise testing of software product lines, Proceedings of the 2014 Conference on Genetic and Evolutionary Computation (2014)