Keywords

1 Introduction

Symbolic transition systems, such as Input Output Symbolic Transition Systems (IOSTS) [11] are a classical reference modeling framework for model-based testing of reactive systems. They provide a convenient abstraction of the behaviors of such systems by modeling system state evolution using variable assignments. The symbolic execution tree of an IOSTS characterizes the different classes of numeric executions. Each path defines a sequence of symbolic inputs and outputs, and a path condition which is a formula constraining the values exchanged (inputs or outputs) with the environment of the system. It is possible to use such paths as reference symbolic behaviors to be tested (i.e. as test purposes). In [11], we have proposed a framework to analyze IOSTS both to extract such test purposes and to solve the oracle problem thanks to a fully on-line algorithm. However, this kind of framework is limited by the symbolic treatment of functions. Indeed, IOSTS variables are assigned by terms built on functions. In order to be able to reason on the symbolic values assigned to variables, the symbolic execution engine is equipped with constraint solving techniques able to analyze those functions. As long as one deals with basic arithmetic or boolean functions, it is generally tractable, but as soon as one deals with user-defined or ad-hoc functions, solving techniques may fail to scale, or even, due to undecidability results, such techniques may not exist. Analyzing such functions (later referred as “programs”) may require both to deal with sophisticated data structures and to explore their (arbitrarily complex) control graph.

In this paper we propose an approach to overcome this limitation by abstracting program behaviors by means of contracts [18]. A contract for a program consists in a collection of couples, also called behaviors, formed of a pre-condition that specifies constraints that the caller must enforce at the call site, and a post-condition which is a property guaranteed at the program return. We enrich the basic IOSTS framework to deal with program calls equipped with contracts. We show how to extend symbolic execution mechanisms to reason about IOSTS equipped with program calls by analyzing those calls through their contracts. Thus, we avoid analyzing the actual behavior of the program and replace it by abstract constraints on its formal parameters. Our framework allows computing symbolic paths that can be used as test purposes. It may happen that guards and contracts are incompatible so that some symbolic paths are infeasible (i.e. they have no associated trace). In practice it means that there exists no program that can both satisfy its associated contracts and compute values allowing to follow the whole symbolic path. We show how to use symbolic techniques to check that a given set of symbolic paths is consistent with respect to program calls.

Moreover, since guards occurring on transitions of an IOSTS interact with contracts associated to programs, we present an approach to extract new contracts for each of the program exercised. Such contracts reflect constraints on the program that make the path feasible. As such, they represent new contracts that can be used at the unitary level, to evaluate the correctness of actual program used to implement the system under test.

The remaining of the paper is organized as follows. In Sect. 2, we give basic definitions about many-typed first order logic. Section 3 presents programs and their contracts. In Sect. 4, we introduce IOSTS with programs. Section 5 defines symbolic execution of an IOSTS with programs and the associated feasibility condition. Finally, usage of symbolic execution for testing purposes, including contract inference for unitary testing is introduced in Sect. 6.

2 Preliminaries

For two sets A and B, \(B^A\) denotes the set of mappings \(f : A \rightarrow B\) from A to B and \(id_A\) is the identity mapping on A. For a mapping \(f:A \rightarrow B\), \(f[a_i \mapsto b_i]_{i \in 1..n}\) is the mapping associating \(b_i\) to \(a_i\) for all i in 1..n and f(a) to a not belonging to \(\{ a_i ~|~ i \in 1..n \}\). By convention, \([a_i \mapsto b_i]_{i \in 1..n}\) stands for \(id_A[a_i \mapsto b_i]_{i \in 1..n}\). For two mappings \(f : A_1 \rightarrow B\) and \(g : A_2 \rightarrow B\) with \(A_1 \cap A_2 = \emptyset \), \(f \cup g : A_1 \cup A_2 \rightarrow B\) is the mapping defined by: \(\forall a \in A_1, (f \cup g)(a)= f(a)\) and \(\forall a \in A_2, (f \cup g)(a)= g(a)\). \(A^*\) (resp. \(A^+\)) denotes the set of words on A provided with the concatenation operator ’.’ and the empty word \(\varepsilon \) (resp. deprived of the empty word \(\varepsilon \)). For an ordered list \(l = (a_1,\ldots ,a_n)\) of n elements of A, \(\{\!\!\{l \}\!\!\}\) denotes the set \(\{a_1, \ldots , a_n\}\) of elements occurring in l.

We use classical multi-typed first order logic to handle data. A data signature is a pair (SF) where S is a set of so-called types and F is a set of functions provided with a profile \(s_1...s_{n-1} \rightarrow s_n\) with each \(s_i\in S\). For \(V = \coprod _{s \in S} V_s\) a set of variables typed in S, the set \(T_F(V)= \coprod _{s \in S} T_F(V)_s\) of so-called functional terms over V is defined as usual over (SF). Moreover, each set \(V_s\) contains an identified subset, denoted \(V_s^{fro}\), whose elements are called frozen variables and we denote \(V^{fro} = \coprod _{s \in S} V_s^{fro}\) the subset of V of all frozen variables. The set \(Sen_F(V)\) of formulas is built over Boolean constants \(\top \) and \(\bot \), equalities \(t = t'\) for t and \(t'\) terms in \(T_F(V)\) of same type and usual Boolean connectives (\(\wedge \), \(\vee \), \(\lnot \), ...). Substitutions over V are applications \(\sigma : V \rightarrow T_{F}(V)\) that preserve types and are such that all elements of \(V^{fro}\) are frozen for \(\sigma \) (i.e. \(\forall v \in V^{fro}, \sigma (v)=v\)). Thus, as frozen variables cannot be substituted, they may be considered as new special constants. Substitutions can be canonically extended to \(T_{F}(V)\). For a term t in \(T_F(V)\), for a formula \(\varphi \) in \(Sen_F(V)\), Occ(t) and \(Occ(\varphi )\) will denote the set of variables occurring in respectively t and \(\varphi \).

A F-model is a set of typed variables \(M = \coprod _{s\in S} M_s\) provided with a function \(\overline{f}: M_{s_1}\times \cdots \times M_{s_{n-1}}\rightarrow M_{s_n}\) for each \(f:s_1 \cdots s_{n-1}~\rightarrow ~s_n\) in F. An interpretation is an application \(\nu : M^V\) that preserves types and can be canonically extended to \(T_{F}(V)\). The satisfaction of a formula \(\varphi \) in \(Sen_F(V)\) by an interpretation \(\nu \in M^V\), denoted \(M \models _{\nu } \varphi \), is defined as usual by considering the meaning of the equality predicate, Boolean constants and connectives. A formula \(\varphi \) in \(Sen_F(V)\) is valid if and only if for all interpretations \(\nu : V \rightarrow M\), \(M \models _{\nu } \varphi \). In the sequel, data signature (SF) and F-model M are supposed given.

3 Programs and Contracts

Programs. User-defined functions, called programs, are identifiers provided with an interface specifying their formal parameters used to store input and output data. We only consider here programs with no side effect and one output variable.

Definition 1

(Program). Let \(X = \coprod _{s \in S} X_s\) be a set of typed variables. A program over X is an identifier p provided with:

  • a list \(InOut(p)=(x_1, \cdots , x_{n+1})\in X^{n+1}\), called the interface of p, with \(n \ge 1\) and \(\forall i \ne j, x_i \ne x_j\). In(p) (resp. Out(p)) denotes the list \((x_1 \cdots x_n)\) (resp. \((x_{n+1})\)) of input (resp. output) formal parameters of p.

  • and a mapping \(Sem: M^{\{\!\!\{In(p) \}\!\!\}} \rightarrow M^{ \{\!\!\{InOut(p) \}\!\!\}}\), called the semantics of p, verifying the so-called semantic condition: \(\forall \nu \in M^{\{\!\!\{In(p) \}\!\!\}}\), \(\forall x_j \in \{\!\!\{In(p) \}\!\!\}, Sem(\nu )(x_j)=\nu (x_j)\).

Depending on the values associated to In(p) through the interpretation \(\nu \), Sem associates a value to the formal parameter \(x_{n+1}\) in Out(p). The semantic condition ensures that a program call has no effect on its input formal parameters. By extrapolation, given a list \(l=(x_1,\cdots ,x_{n+1})\), In(l) and Out(l) will resp. denote \((x_1,\cdots , x_n)\) and \((x_{n+1})\).

A signature \(\varSigma \) is a tuple (SFXP) where (SF) is a data signature and P is a set of programs defined over the set of typed variables X.

Let \(V = \coprod _{s \in S}V\) be a set of typed variables. The set \(T_{\varSigma }(V)= \coprod _{s \in S} T_{\varSigma }(V)_s\) of typed terms over V contains:

  • all functional terms of \(T_F(V)\)

  • all elements \(p(t_1,\cdots ,t_{n})\) with \(p \in P\) of interface \((x_1, \cdots , x_{n+1})\), \(\forall 1 \le i \le n, x_i \in X_{s_i}\), and \(t_i \in T_{F}(V)_{s_i}\). If \(x_{n+1}\in V_s\), \(p(t_1,\cdots ,t_n)\in T_{\varSigma }(V)_s\).

Any interpretation \(\nu : V \rightarrow M\) can be canonically extended on \(T_{\varSigma }(V)\) as follows: for any program p in P defined by its interface \((x_1 \cdots x_{n+1})\) and its semantics \(Sem_p\), let us consider \(\mu _\nu ^p: \{\!\!\{In(p)\}\!\!\}\rightarrow M\) an interpretation such that \(\forall 1 \le i \le n, ~\mu _\nu ^p(x_i)=\nu (t_i)\), we have \(\nu (p(t_1,\cdots ,t_{n}))=Sem_p(\mu _\nu ^p)(x_{n+1})\).

Contracts. Contracts specify what programs are expected to compute, as opposed to how they compute their result. They have been introduced in the pioneering work of Floyd [10] and Hoare [12], and form a key ingredient of the Eiffel programming language [18]. In short, a contract describes what a program requires from its caller (the pre-condition) and what it guarantees when it returns (the post-condition). We use here a slightly refined notion where a contract can be split in a set of behaviors [2, 5]. In this setting, pre-condition of a behavior indicates a possible case in which the program may be executed. As before, when a behavior is active, its post-condition must hold at the end of the execution.

Most of the times, pre and post conditions of a program are simply formulas in resp. \(Sen_F(\{\!\!\{In(p) \}\!\!\})\) and \(Sen_F(\{\!\!\{InOut(p) \}\!\!\})\). However, contracts can involve other variables representing the global state of the system. The latter will be frozen variables whose associated values are conditioned by axioms and cannot be modified. These variables will be useful for inferring contracts from symbolic execution tree, as shown in Sect. 6.2.

Definition 2

(Program Contract). Let \(l = (x_1,\ldots ,x_{n+1})\) be a list of variables with \(\forall i \le n+1, x_i \in X\). Let W be a subset of frozen variables verifying \(X \cap W = \emptyset \). A program contract for l and W is a set:

$$\begin{aligned}\{(Pre_1,Post_1),\ldots ,(Pre_k,Post_k)\}\end{aligned}$$

such that \(\forall \) \(i \le k\), \(Pre_i \in Sen_F( \{\!\!\{In(l) \}\!\!\}\cup W)\) and \(Post_i \in Sen_F( \{\!\!\{l \}\!\!\}\cup W)\).

A program contract is said to be:

  • disjoint if for all \(i,j \le k\) with \(i \ne j\), the formula \(\lnot (Pre_i \wedge Pre_j)\) is valid.

  • complete if the formula \(\bigvee _{i\le k}{Pre_i}\) is valid.

Disjointness requires that at most one behavior of the contract is applicable for any considered input data, i.e. the pre-conditions are mutually exclusive. For simplicity purpose, we only consider disjoint contracts in this paper. Completeness indicates that for any input at least one behavior is applicable. In practice, programs are often partially defined over their input domain. We thus allow incomplete contracts, rejecting input data outside the scope of preconditions.

Example 1

Let us consider a program Price of interface \((x_1,x_2)\) where \(x_1\) is of type Drink, an enumerated type with two values \(\{0,1\}\) and \(x_2\) is of type Integer. \(x_1\) is the input parameter indicating the selected beverage and \(x_2\) is the output parameter corresponding to its price. An example of contract for Price is \(C_r=\{(Pre_1,Post_1),(Pre_2,Post_2)\}\) (both disjoint and complete), with:

  • \(Pre_1: ~x_1=0\), \(Post_1: ~x_2 \ge 100 \wedge x_2 \le 200\)

  • \(Pre_2: ~x_1=1\), \(Post_2: ~x_2 \ge 200 \wedge x_2 \le 300\)

Definition 3

(Contract Satisfaction). Let \(l = (x_1,\cdots ,x_{n+1})\) be an interface, W a set of frozen variables provided with \(Ax \subseteq Sen_F(W)\) and C a contract for l and W. Let us consider an interpretation \(\nu \in M^W\) such that \(M\models _\nu Ax\) and a mapping \(Sem: M^{\{\!\!\{In(l) \}\!\!\}} \rightarrow M^{\{\!\!\{l \}\!\!\}}\) satisfying the semantic condition.

Sem satisfies C up to \(\nu \), denoted \(Sem \models _{\nu } C\), if and only if:

$$\begin{aligned}\forall (Pre,Post) \in C, \forall \mu \in M^{\{\!\!\{In(l) \}\!\!\}}, M \models _{\nu \cup \mu } Pre \Rightarrow M \models _{\nu \cup Sem(\mu )} Post\end{aligned}$$

\(Sem_\nu (C)=\{ Sem: M^{\{\!\!\{In(l) \}\!\!\}} \rightarrow M^{\{\!\!\{l \}\!\!\}} ~| ~Sem \models _\nu C \}\) denotes the set of semantics satisfying C up to \(\nu \).

For each interface l, we consider the trivial contract \(C_{\emptyset ,l} = \{\}\), simply denoted \(C_{\emptyset }\), defined on l that does not restrict behaviors of programs, that is \(p \in Sem(C_{\emptyset })\) for all programs p of interface l. Similarly, we consider the contract \(C_{\top ,l} = \{(\top ,\top )\}\), simply denoted \(C_{\top }\), defined on l that requires that the program is defined for every well-typed input data tuple.

Given a signature \(\varSigma =(S,F,X,P)\), a set of frozen variables W with its set of axioms \(Ax \subseteq Sen_F(W)\), and an interpretation \(\nu \in M^W\) verifying \(M \models _\nu Ax\), we consider families \(\mathbb {C} = (C_p)_{p \in P}\) of contracts indexed by P, in particular \(\mathbb {C}_\emptyset =(C_\emptyset )_{p\in P}\) and \(\mathbb {C}_\top =(C_\top )_{p\in P}\). \(Mod_{\nu }(\mathbb {C})\) is the set of all families \(\mathcal{S}em = (Sem_p)_{p \in P}\) such that \(\forall p \in P, Sem_p \models _\nu C_p\). \(\mathcal{S}em\) is then called a P-model.

4 IOSTS

Input Output Symbolic Transition Systems (IOSTS) represent behaviors of reactive systems as sequences of emissions or receptions of values through communication channels conditioned by guards expressed on some attribute values. An IOSTS-signature \(\varGamma \) is a couple (ACh), where \(A = \coprod _{s \in S} A_s\) is a set of types variables, called attribute variables, such that for all s in S, \(A_s \cap X_s = \emptyset \) and where Ch is a set of communication channel names.

An IOSTS communicates with its environment through communication actions. The set of symbolic actions over \(\varGamma \), denoted \(Act(\varGamma )\), is \(I(\varGamma ) \cup O(\varGamma ) \cup \left\{ \tau \right\} \) where: \(I(\varGamma )=\{c?x | x\in A,c\in Ch\}\) is the set of inputs, \(O(\varGamma )=\{c!t | t\in T_{\varSigma }(A), c\in Ch\}\) is the set of outputs and \(\tau \) is an internal action.

Values of attribute variables can be modified in two ways: by receiving a value from the environment or by assigning a value from some internal process.

Definition 4

(IOSTS). An IOSTS \((Q,q_{0},Tr)\) over \(\varSigma \) and \(\varGamma = (A,Ch)\) is a triple where Q is a set of states, \(q_{0}\in Q\) is the initial state and \(Tr \subseteq Q \times Sen_{F}(A) \times Act(\varGamma ) \times T_{\varSigma }(A)^A \times Q\) is a set of transitions tr of the form \((q,\psi ,act,\rho ,q')\) where:

  • q and \(q'\) are resp. the source (source(tr)) and target state (target(tr)) of tr,

  • \(\psi \in Sen_F(A)\) is a guard

  • \(act \in Act(\varGamma )\) is a communication action;

  • \(\rho \in T_{\varSigma }(A)^A\) is a substitution associating a term to attribute variables;

Remark 1

We can always consider an IOSTS in which guards only contain conjunctions. If not, for a transition tr of guard \(\psi \), it suffices to use a disjunctive normal form \(\bigvee _{i=1}^{n}\psi _i\) equivalent to \(\psi \) and to split the transition into n transitions having the same source, target and communication action as tr and \(\psi _i\) as guard.

Example 2

(Drink vending machine). We consider a very simple drink vending machine. Its behavior is specified by the IOSTS in Fig. 1. An initialization step (\(q \rightarrow q_0\)) sets the amount to zero. Then, in \(q_0\), the machine waits for an amount (x) of coins introduced by the user, and updates the amount m. The user then chooses his/her beverage (0 or 1 for “Tea” or “Coffee”). The choice is stored in variable B. In the transition \(q_2 \rightarrow q_3\), the program Price computes the price of the chosen drink. Two cases are possible here. If the introduced amount is lower than the price (\(m < p\)), then a message “Add” appears on the screen and the machine returns to \(q_0\). Otherwise (\(m \ge p\)), the drink is delivered, the amount is reinitialized to zero and the machine goes back to \(q_0\). Note that transitions outgoing from \(q_3\) constrain the value (p) computed by Price (\(p\ge 150 \wedge p\le 200\)).

For a transition \(tr=(q,\psi ,act,\rho ,q')\in Tr\) and a P-model \(\mathcal{S}em\), the semantics of tr, denoted as \(Run(tr,\mathcal{S}em) \subseteq M^{A} \times Act^{M}(\varGamma ) \times M^{A}\), is defined as the set of triple \((\nu _i,act_M,\nu _f)\) verifying:

  • if act is of the form c!t (resp. \(\tau \)), then \(M \models _{\nu _i} \psi \), \(\nu _f = \nu _i \circ \rho \) and \(act_M =c!\nu _i(t)\) (resp. \(act_M=\tau \))

  • if act is of the form c?x, then \(M \models _{\nu _i} \psi \), there exists \(\nu _a\) such that \(\nu _a(z) = \nu _i(z)\) for every \(z \ne x\), \(\nu _f =\nu _a \circ \rho \) and \(act_M =c?\nu _a(x)\),

Fig. 1.
figure 1

IOSTS of the drink vending machine.

Note that the definition of semantics of transitions is very classical and does not explicitly refers to \(\mathcal{S}em\). In fact, semantics of programs are taken into account when defining \(\nu _f\) from the extensions of \(\nu _i\) or \(\nu _a\) to \(T_\varSigma (A)\) as defined in Sect. 3.

For a run \(r = (\nu _i,act_M,\nu _f)\), we note source(r), act(r) and target(r) resp. for \(\nu _i\), \(act_M\) and \(\nu _f\). \(\nu _i\) and \(\nu _f\) are the interpretation of attribute variables resp. before and after executing the transition. Let us observe that, given a transition tr and an interpretation \(\nu _i\), the set \(Run(tr,\mathcal{S}em)\) does not necessarily contain a run of the form \((\nu _i,act_M,\nu _f)\) due to the fact that \(\nu _i\) may not satisfy \(\psi \).

The set of paths of an IOSTS \(\mathbb {G}=(Q,q_0,Tr)\), denoted \(Path(\mathbb {G})\), are all finite sequences \(tr_1.\cdots .tr_n\) of transitions with \(source(tr_1)=q_0\) and \(\forall i, 1 \le i < n\), \(target(tr_i)=source(tr_{i+1})\). The set of runs of a path \(pa=tr_1. \cdots .tr_n\) in \(Path(\mathbb {G})\), denoted as \(Run(pa,\mathcal{S}em)\), are sequences \(r_1. \cdots . r_n\) such that \(\forall i\le n\), \(r_i\in Run(tr_i , \mathcal{S}em)\) and \(\forall i< n\), \(target(r_i)=source(r_{i+1})\). Similarly, the set of traces \(Traces(pa,\mathcal{S}em)\) of pa is the set of sequences \(act(r_1). \cdots .act(r_n)\) for all \(r_1. \cdots . r_n\in Run(pa,\mathcal{S}em)\), act(r) being equal to \(\varepsilon \) if \(act(r) = \tau \).

In general, it is not guaranteed that there exists at least a run for a given path pa, as it depends on the semantics associated to programs involved in pa.

Definition 5

(Path Feasibility Condition). Let \(\mathbb {G}=(Q,q_0,Tr)\) be an IOSTS over \(\varGamma =(A,Ch)\) and pa a path of \(\mathbb {G}\). pa is a feasible path if and only if:

$$\begin{aligned}\exists ~ \mathcal{S}em \in Mod(\mathbb {C}_{\emptyset }), Traces(pa,\mathcal{S}em) \ne \emptyset \end{aligned}$$

Let W be a set of frozen variables provided with \(Ax \subseteq Sen_F(W)\) and \(\nu \in M^W\) an interpretation satisfying \(M \models _{\nu } Ax\). Let us consider \(\mathbb {C} = (C_p)_{p \in P}\) a family of contracts indexed by P. pa is a feasible path up to \((\nu ,\mathbb {C})\) if and only if:

$$\begin{aligned}\exists \mathcal{S}em \in Mod_{\nu }(\mathbb {C}), Traces(pa,\mathcal{S}em) \ne \emptyset \end{aligned}$$

5 Symbolic Execution and Path Feasibility Condition

Symbolic execution consists in executing an IOSTS for symbolic values (taken from a dedicated set of frozen variables \(Fr=\coprod _{s\in S}Fr_{s}\)) rather than numerical ones, and computing constraints on those values for all possible IOSTS executions. The main novelties with respect to [11] are twofold: substitutions occurring in transitions may include program calls and a renaming mechanism ensures that a given frozen variable can not appear in two distinct paths.

To store information concerning an execution, we use structures called symbolic states. A symbolic state is a tuple of the form \((q, \pi , \lambda ,\kappa )\) where \(q \in Q\), \(\pi \in Sen_{F}(Fr)\), \(\lambda : A \rightarrow T_{F}(Fr)\) is an application preserving types and \(\kappa \subset P \times T_{F}(Fr)^* \times Fr\). For a symbolic state \(\eta =(q,\pi ,\lambda ,\kappa )\), q (or \(q(\eta )\)) denotes the state reached after an execution leading to \(\eta \), \(\pi \) (or \(\pi (\eta )\)) is a constraint on variables in Fr called path condition that should be satisfied for the execution to reach \(\eta \), \(\lambda \) (or \(\lambda (\eta )\)) denotes terms over variables in Fr that are assigned to variables of A and \(\kappa \) (or \(\kappa (\eta )\)) denotes the set of tuples of the form \((p,(t_1,\cdots ,t_n),x)\) indicating that a program call has been performed for the program p with the arguments \((t_1,\cdots ,t_n)\) and that its result is stored in the variable x in Fr.

In our approach we do not have the code of programs. Instead, we reason on their contracts. Since the input formal parameters associated to a call are represented symbolically by functional terms \(t_1, \cdots , t_n\), different pre-conditions may hold depending on the way those terms will be interpreted. At the symbolic execution level, we thus consider a sub-case for each of those pre-conditions. More precisely, the symbolic execution of a transition tr from a given symbolic state \(\eta \) will consist in a set of symbolic transitions, one for each possible combination of pre-conditions for all program calls occurring in tr. We now introduce some notations aiming at tracing program calls: for a substitution \(\rho : A \rightarrow T_\varSigma (A)\) and for \(p\in P\), \(Res(p,\rho )\) is the set of variables \(y\in A\) such that \(\rho (y)\) is of the form \(p(t_1 ,\cdots ,t_n)\) and for such an y, \(Arg(y,\rho )\) is then \((t_1 , \cdots ,t_n)\) and \(Prog(y,\rho ) =p\). We also denote \(Res(\rho )\) for \(\bigcup _{p\in P} Res(p,\rho )\).

Definition 6

(Symbolic Execution of Transitions). Let \(\mathbb {G}=(Q,q_0,Tr)\) be an IOSTS over \(\varSigma \) and \(\varGamma =(A,Ch)\), \(tr=(q,\psi ,act,\rho ,q')\in Tr\) be a transition and \(\eta =(q,\pi ,\lambda ,\kappa )\) be a symbolic state over \(\mathbb {G}\).

If act is of the form c?x, \(\lambda _i=\lambda [x \mapsto f]\), f fresh in Fr. Otherwise, \(\lambda _i=\lambda \).

\(\lambda '\) is the substitution such that for all \(y\in Res(\rho )\), \(\lambda '(y)\) is a fresh variable of Fr and for all \(y \in A \setminus Res(\rho )\), \(\lambda '(y)= \lambda _i \circ \rho (y)\).

The symbolic execution \(SE(tr,\eta )\) of tr from \(\eta \) is the set defined as follows:

  • if \(Res(\rho )=\emptyset \) then \(SE(tr,\eta )=\{ (\eta , \lambda _i (act) , \eta ')\}\) with \(\eta '=(q', \pi \wedge \lambda (\psi ) ,\lambda ',\kappa )\).

  • if \(Res(\rho )\ne \emptyset \), we consider all mappings \(Beh: Res(\rho ) \rightarrow \bigcup _{p\in P}C_p\) such that for \(y\in Res(p,\rho )\), \(Beh(y) = (Pre_y,Post_y) \in C_p\). For \(y \in Res(p,\rho )\) with \(InOut(p) = (x_1, \cdots , x_n, x_{n+1})\) and \(Arg(y,\rho )=(t_1,\cdots ,t_n)\), we have \((\eta , \lambda _i (act), \eta ')\in SE(tr,\eta )\) with

    • \(\eta '\) the symbolic state \((q', \pi \wedge \lambda (\psi ) \wedge \bigwedge _{y\in Res(\rho )} \varDelta (y) ,\lambda ',\kappa ')\)

    • \(\varDelta (y)=(Pre_y \wedge Post_y)[x_1 \mapsto \lambda _i (t_1) \cdots x_n \mapsto \lambda _i (t_n), x_{n+1}\mapsto \lambda '(x)]\)

    • \(\kappa '\) the set \(\kappa \cup \bigcup _{y \in Res(\rho )} \{(Prog(y,\rho ),(\lambda _i(t_1),\cdots ,\lambda _i(t_n)),\lambda '(y))\}\)

Elements of \(SE(tr,\eta )\) are called symbolic transitions. We denote \(Fr(\eta ')\) the set of all fresh variables of Fr occurring in its definition.

Example 3

In order to illustrate Definition 6, let us consider a transition tr of the form \((q,\psi ,c?x,\rho ,q')\) with \(\rho = [y \mapsto p_1(t_1,t_2), z \mapsto t'_1+ t'_2]\) with \(p_1\) a program and \(t_1,t_2,t'_1,t'_2\) functional terms. Let us observe that \(Res(p_1,\rho ) = \{y\}\), \(Arg(y,p_1) = (t_1,t_2)\), \(Prog(y,\rho ) = p_1\) and \(Res(\rho ) = \{y\}\).

Let \(\eta =(q,\pi ,\lambda ,\kappa )\) be a symbolic state. Let us suppose that the program \(p_1\) is provided with an interface \((x_1,x_2,x_3)\) and with a behavior \((Pre_1,Post_1)\). Then \(SE(tr,\eta )\) contains the symbolic transition \((\eta ,c?f_1,\eta ')\) with \(f_1\) a fresh variable of Fr and \(\eta '\) the symbolic state defined as:

$$\begin{aligned} {\begin{array}{cl} (\!\!\!\!\!\!\!\!\!\!\!\!\!&{}q' , \pi \wedge \lambda (\psi ) \wedge (Pre_1 \wedge Post_1)[x_1 \mapsto \lambda [x \mapsto f_1](t_1), x_2 \mapsto \lambda [x \mapsto f_1](t_2),x_3 \mapsto f_2] \\ &{} [x \mapsto f_1, y \mapsto f_2, z \mapsto \lambda [x \mapsto f_1](t'_1+t'_2)], \\ &{} \kappa \cup \{(p_1,(\lambda [x\mapsto f_1](t_1),\lambda [x\mapsto f_1](t_2)),f_2\} ) \end{array}} \end{aligned}$$

\(Fr(\eta ')\) is then \(\{f_1,f_2\}\).

Definition 7

(IOSTS Symbolic Execution). Given an IOSTS \(\mathbb {G}\), the symbolic execution \(SE(\mathbb {G})=(Init, ST)\) of \(\mathbb {G}\) is minimally defined by:

  • \(Init = (q_0,Ax,\lambda _0)\) with \(\forall x \in A, \lambda _0 (x) \in Fr\) and \(\forall x \ne y \in A, \lambda _0 (x) \ne \lambda _0(y)\),

  • for \(tr\in Tr\) and \(\eta \) symbolic state with \(source(tr)=q(\eta )\), \(SE(tr,\eta )\subseteq ST\).

  • for any distinct \(SE(tr_1,\eta _1)\) \(SE(tr_2,\eta _2)\) that are defined, \(Fr(SE(tr_1,\eta _1))\cap Fr(SE(tr_2,\eta _2))=\emptyset \).

Definition 8

(Paths and Distinguished Paths). The set \(Paths(SE(\mathbb {G}))\) of paths of \(SE(\mathbb {G})\) is the set of all sequences \(tr_1 \cdots tr_n\) with \(\forall i \in 1..n, tr_i \in ST\) such that \(source(tr_1) = Init\) and for any \(j<n\), \(q(target(tr_j))=q(source(tr_{j+1}))\).

For a non-empty sequence \(\delta =tr_1 \cdots tr_n\), we note \(End(\delta )=target(tr_n)\) and \(Fr(\delta ) = \cup _{i \in 1..n} Fr(target(tr_i))\). By convention, \(End(\varepsilon )=Init\) and \(Fr(\varepsilon ) = \emptyset \).

Given a finite subset \(\varDelta \) of \(Paths(SE(\mathbb {G}))\), \(DPaths(\varDelta )\) is a set of paths \(\delta ^*\) such that there exists an unique path \(\delta \) in \(\varDelta \) such that \(\delta \) and \(\delta ^*\) are isomorphic up to a renaming of variables of Fr and such that for two distinct paths \(\delta _1^*\) and \(\delta _2^*\) in \(DPaths(\varDelta )\), \(Fr(\delta _1^*) \cap Fr(\delta _2^*) = \emptyset \).

We say that \(DPaths(\varDelta )\) is a set of distinguished paths issued from \(SE(\mathbb {G})\).

Generally speaking, a set \(\varDelta \) of \(Paths(SE(\mathbb {G}))\) represents a tree whose transitions issued from the root Init can be shared by several paths of \(\varDelta \) while \(DPaths(\varDelta )\) consists in applying a variable renaming mechanism in order to duplicate shared transitions to completely separate paths. Distinguished paths can still share common variables, namely those in W.

Example 4

The drink vending machine of Fig. 1 has two possible paths from q to \(q_0\) with exactly one cycle on \(q_0\). They share a transition with a call to program Price defined by its contract \(C_r\) as seen in Example 1. We thus get 4 distinguished paths shown in Fig. 2. Associated path conditions are the following:

$$\begin{aligned} \begin{array}{l} pc_1: B_1=0 \wedge p_1 \ge 100 \wedge p_1 \le 200 \wedge v_1<p_1 \wedge p_1\ge 150 \wedge p_1 \le 200 \\ pc_2: B_2=0 \wedge p_2 \ge 100 \wedge p_2 \le 200 \wedge v_2\ge p_2 \wedge p_2\ge 150 \wedge p_2 \le 200 \\ pc_3: B_3=1 \wedge p_3 \ge 200 \wedge p_3 \le 300 \wedge v_3<p_3 \wedge p_3\ge 150 \wedge p_3 \le 200 \\ pc_4: B_4=1 \wedge p_4 \ge 200 \wedge p_4 \le 300 \wedge v_4\ge p_4 \wedge p_4\ge 150 \wedge p_4 \le 200 \end{array} \end{aligned}$$
Fig. 2.
figure 2

Symbolic paths.

A path condition is a formula over the frozen variables built by accumulating constraints from the guards of the IOSTS transitions and from constraints of called programs contracts. The path is infeasible if its path condition is not satisfiable. In addition, this feasibility depends on the fact that if a program is called twice with the same arguments, it returns the same value (semantic condition of Definition 1). Since this is not enforced by the path condition alone, we consider another set of constraints accounting for this condition:

Definition 9

(Feasibility of a Set of Paths). Let \(\mathbb {G}\) be an IOSTS over \(\varGamma =(A,Ch)\) and let \(\varDelta ^*\) be a set of distinguished paths issued from \(SE(\mathbb {G})\).

For any program p of interface \((x_1,\cdots ,x_n,x_{n+1})\), for \((p,(t_1,\cdots ,t_n),f)\) and \((p,(t'_1,\cdots ,t'_n),f')\) two distinct elements of \(\cup _{\delta ^* \in \varDelta ^*} \; \kappa (End(\delta ^*))\) we introduce the deterministic program condition relating to these two program calls as the formula \(\phi _{\{f,f'\}}\) defined by \(\bigwedge _{i=1}^n t_i=t'_i \Rightarrow f = f'\).

The deterministic program condition related to \(\varDelta ^*\) is then \(\varPhi _p=\bigwedge _{}\phi _{\{f,f'\}}\), for all f and \(f'\) appearing as return variable of a call of p in \(\varDelta ^*\).

Finally, the feasibility condition of \(\varDelta ^*\) is

$$\begin{aligned}\bigwedge _{\delta ^* \in \varDelta ^*} \pi (End(\delta ^*)) \wedge \bigwedge _{p\in P} \varPhi _p\end{aligned}$$

If this feasibility condition holds, it is possible to implement the programs occurring in the IOSTS so that all paths of \(\varDelta ^*\) will complete successfully. Note that the contracts of the programs are taken into account in the path condition, and have thus an impact on the paths that are feasible or not.

Example 5

In the context of the drink vending machine, we now want to check the feasibility condition of the distinguished paths associated to the paths described in Example 4 according to two distinct contracts for Price, denoted resp. \(C_w\) and \(C_r\) (in Example 1). Both \(C_w\) and \(C_r\) include two behaviors resulting in 4 distinguished paths. Path conditions are given in Table 1.

  • With the contract \(C_w\), no distinguished path is feasible because of contradictions between guards of the IOSTS transitions and post-conditions of \(C_w\).

  • With the contract \(C_r\), all distinguished paths are feasible. Price can return anything between 150 and 200 for an argument equal to 0 and must return 200 for an argument equal to 1.

Table 1. Feasibility according to different contracts

6 Testing

6.1 Model-Based Testing of IOSTS with Program Calls and Contracts

In a previous work [11], we have proposed an online testing algorithm to test Systems Under Test (SUT) with respect to a basic IOSTS (without program calls). The algorithm is based on the ioco conformance relation [21] and on the use of test purposes (TP) to select some behaviors to be tested. A TP is a finite sub-tree of the symbolic execution structure (SES) derived from the IOSTS of reference so that any execution trace constructed by interacting with SUT and leading to a leaf of TP will be considered as covering TP. The testing process is implemented as a simultaneous traversal of both SES and TP. Verdicts depend on whether the observed execution trace does or does not belong to TP and SES: WeakPASS when the execution trace covers TP and belongs to at least one path of SES which does not end at a leaf of TP, PASS when the execution trace covers TP and does not belong to another path of SES, INCONC (for inconclusive) when the execution trace belongs to SES but does not cover TP, FAIL when the execution trace does not cover TP and goes outside SES.

In Sect. 5, we have associated to any IOSTS with contracts a symbolic tree structure in order to be able to use it both as the SES input of the algorithm given in [11] and as a carrier to extract a finite sub-tree to play the role of TP. We can use the work described in [11] with the following slight modifications:

  • Unlike [11], we allow unobservable \(\tau \) transitions. Under the assumption that there does not exist a cycle of \(\tau \) transitions, we can replace any sequence of consecutive \(\tau \) transitions by a transition carrying the input/output action just located at the end of the \(\tau \) sequence. Furthermore, in [11], quiescence conditions are expressed by enriching the reference IOSTS with transitions carrying the special label \(\delta \) denoting the intended absence of reaction. Because the presence of \(\tau \) transitions makes such a direct enrichment tricky, it becomes more appropriate to perform this enrichment at the level of the \(\tau \)-reduced symbolic execution itself. Once the operations of \(\tau \)-reduction and \(\delta \)-enrichment are applied to the symbolic execution of the IOSTS with contracts, we can then apply the algorithm of [11] for free.

  • In [11], path conditions for paths that are part of test purposes are satisfiable by construction. In our setting, we have to take into account the notion of feasibility, i.e., the existence or not of programs that meet their associated contracts and that are compatible with considered paths. Indeed, if the considered set of distinguished paths constituting the test purpose is unfeasible, then the application of algorithm is meaningless. In other words, the feasibility of the targeted set of paths plays the role of a testing hypothesis.

6.2 Contracts Inference

As we have seen in Sect. 5, the feasibility condition checks whether a given program contract preserves the feasibility of a symbolic path or not. In this section, we focus on the inference of contracts based on path conditions. Such contracts can then be used to define unit tests for the programs. More precisely, we start with an IOSTS \(\mathbb {G}\) calling programs without associated contract. We then show that we can infer contracts such that feasible paths of \(\mathbb {G}\) are guaranteed to verify the feasibility condition of the IOSTS augmented with contracts. The generated contract for a program p contains one behavior per call to p in \(SE(\mathbb {G})\). For that, we use the parts of the final condition of the path on which the call occurs that are related to the return variable and to the arguments.

Given a formula F, we define inductively the set \(Rel_F(X)\) of variables related to a set X of variables, as the smallest set satisfying the following conditions

  • \(X\subset Rel_F(X)\)

  • \(Occ(t_1=t_2)\cap Rel_{t_1 = t_2}(X)\ne \emptyset \Rightarrow Occ(t_1=t_2)\subset Rel_{t_1 = t_2}(X)\)

  • \(Rel_{F_1}(X)\cup Rel_{F_2}(X)=Rel_{F_1\wedge F_2}(X)=Rel_{F_1\vee F_2}(X)\)

  • \(Rel_{F}(X)=Rel_{\lnot F}(X)\)

Similarly, for a formula F and a set of variables X, \(Clean_X(F)\) is defined as follows. As noted in Remark 1, we can assume that the path condition only has conjunctions, and is in negation-normal form.

  • \(Clean_X(\top )=\top \)    and   \(Clean_X(\bot )=\bot \)

  • \(Clean_X(t_1=t_2)=t_1=t_2\) if \(Occ(t_1=t_2)\cap X\ne \emptyset \)

  • \(Clean_X(t_1=t_2)=\top \) if \(Occ(t_1=t_2)\cap X = \emptyset \)

  • \(Clean_X(\lnot t_1=t_2)=\lnot t_1=t_2\) if \(Occ(t_1=t_2)\cap X\ne \emptyset \)

  • \(Clean_X(\lnot t_1=t_2)=\top \) if \(Occ(t_1=t_2)\cap X = \emptyset \)

  • \(Clean_X(F_1\wedge F_2)=Clean_X(F_1)\wedge Clean_X(F_2)\)

Remark 2

If F is satisfiable, then \(Clean_X(F)\) is also satisfiable, as we only remove atomic propositions from the conjunction.

Definition 10

(Contract Inference). Let \(\mathbb {G}\) be an IOSTS , \(\varDelta ^*\) a set of distinguished paths from \(SE(\mathbb {G})\). We note \(\kappa (\varDelta ^*)=\cup _{\delta ^*\in \varDelta ^*}\kappa (End(\delta ^*))\).

For any f such that \((p,(t_1,\cdots ,t_n),f)\in \kappa (\varDelta ^*)\), with \(In(p)=(x_1,...,x_{n})\) and \(Out(p)=x_{n+1}\), we define a behavior \((Pre_f, Post_f)\) for p, as well as a set of frozen variables \(G_f\) and axioms \(Ax_f\).

We pose \(\phi =\pi (End(\delta ^*))\) the final condition for the path containing the call and \(Y=Occ(t_1,\cdots ,t_n)\cup \{f\}\) the variables occurring in the call. Then

  • \(G_f\) is \(Rel_\phi (Y)\)

  • \(Ax_f\) is \(Clean_{Rel_\phi (Y)}(\phi )\)

  • \(Pre_f\) is \(\bigwedge _{i=1}^{n} x_i=t_i\)

  • \(Post_f\) is \(x_{n+1} = f\)

Finally, the inferred contracts for \(\varDelta ^*\) are defined as follows.

  • G is \(\bigcup _{(p,(t_1,\cdots ,t_n),f)\in \kappa (\varDelta ^*)} G_f\)

  • Ax is \(\bigwedge _{(p,(t_1,\cdots ,t_n),f)\in \kappa (\varDelta ^*)} Ax_f\)

  • \(\forall p\in P, C_p=((Pre_f,Post_f))_{(p,(t_1,\cdots ,t_n),f)\in \kappa (\varDelta ^*)}\)

Example 6

Let us consider here a symbolic path \(\delta ^*\) of our drink vending machine’s specification (Fig. 3) that calls twice the program Price of interface \((x_1,x_2)\). The first call leads to the appearance of a message “Add” on the screen and the second call permits the drink delivery, such that:

\(\pi (End(\delta ^*)):v_1<p_1 \wedge p_1 \ge 150 \wedge p_1 \le 200 \wedge (v_1+v_2) \ge p_2 \wedge p_2 \ge 150 \wedge p_2 \le 200\)

Fig. 3.
figure 3

Symbolic path.

From the path condition \(\phi =\pi (End(\delta ^*))\), two behaviors will be generated according to Definition 10. For \(p_1\) the result of the first call \((Price,(B_1),p_1)\) we have:

$$\begin{aligned} \begin{array}{ccl} Y &{} : &{} \{B_1,p_1\} \\ G_{p_1} &{} : &{} \{B_1,p_1,v_1,p_2,v_2\} \\ Ax_{p_1} &{} : &{} v_1<p_1 \wedge p_1 \ge 150 \wedge p_1 \le 200 \wedge (v_1+v_2) \ge p_2 \wedge p_2 \ge 150 \wedge p_2 \le 200 \\ Pre_{p_1} &{} : &{} x_1= B_1\\ Post_{p_1} &{} : &{} x_2=p_1 \end{array} \end{aligned}$$

For \(p_2\) the result of the second call \((Price,(B_2),p_2)\) we have:

$$\begin{aligned} \begin{array}{ccl} Y &{} :&{} \{B_2,p_2\} \\ G_{p_2} &{} : &{} \{B_2,p_2,v_1,p_1,v_2\} \\ Ax_{p_2} &{} : &{} v_1<p_1 \wedge p_1 \ge 150 \wedge p_1 \le 200 \wedge (v_1+v_2) \ge p_2 \wedge p_2 \ge 150 \wedge p_2 \le 200 \\ Pre_{p_2} &{} : &{} x_1=B_2 \\ Post_{p_2} &{} : &{} x_2=p_2 \end{array} \end{aligned}$$

Finally, the inferred contract for our program Price in \(\delta ^*\) is defined by: \(G=G_{p_1} \cup G_{p_2}\), \(Ax=Ax_{p_1}\wedge Ax_{p_2} \) and \(C=((Pre_{p_1},Post_{p_1}),(Pre_{p_2},Post_{p_2}))\)

We can now define the IOSTS \(\mathbb {G}'\) with the same signature and transitions than \(\mathbb {G}\) and equipped with the inferred contracts for the programs in P. Then, for every path \(\delta ^*\) in \(\varDelta ^*\) that is feasible, there exist paths \(\underline{\delta ^*}\) in \(\mathbb {G}'\) similar to \(\delta ^*\) except that the path conditions \(\pi \) are augmented with axioms and behaviors. For each \((p, (t_1,...,t_n), f)\in \kappa (End(\delta ^*))\), \(Ax_f\) is satisfiable by Remark 2 and the behavior \((Pre_f,Post_f)\) becomes trivially true: one of the behaviors of p makes the corresponding transition feasible. Since this is true for any call in \(\delta ^*\), there exists thus a path in \(\underline{\delta ^*}\) that is feasible. This leads to the following theorem.

Theorem 1

(Feasibility Preservation). Let \(\mathbb {G}\) be an IOSTS, \(\varDelta ^*\) a set of feasible distinguished symbolic paths of \(\mathbb {G}\). \(\mathbb {G}'\) is the IOSTS obtained by adding to G the inferred contracts of Definition 10. For any path \(\delta ^*\) in \(\varDelta ^*\), there exists a symbolic path \({\delta ^*}'\) for \(\mathbb {G}'\) having the same transitions as \(\delta ^*\) and which is feasible.

7 Related Work

In the context of reactive systems verification, IOSTS and symbolic execution have been used in many works [1, 11, 14] for different purposes. They use IOSTS with atomic actions and substitutions whereas, in our case, we enrich IOSTS with programs specified by contracts. Our purpose is to define an integration framework and analyze in one hand the impact of programs contracts on a whole system and in the other hand elicit accurate contracts for our programs.

Our work is quite close to [13], that augments a SOA’s BPEL business model with pre- and post-condition contracts defining essential component traits, and derive a suite of feasible test cases, taking into account contracts that are provided for some of the opaque components of their system. On the other hand, they do not infer contracts from the constraints expressed directly in the BPEL model as is done in Sect. 6.2.

The use of symbolic execution and path feasibility analysis are studied in [3, 22] but this is limited to the analysis of programs themselves and does not take in consideration as we do the impact of the program calls on the feasibility of the system as a whole. Similarly, symbolic execution techniques over the code have been used to infer program annotations. More specifically, such approaches concentrate on generating invariants. This is for instance the case in the KeY verification framework [20], for the DySy tool [7], or for the iDiscovery tool [24]. Those invariants are meant to help the formal verification of the code against its specification, while we are aiming at generating a specification that the programs must meet in order to be usable in the context of the system under test.

The problem of inferring contracts for programs has been studied differently in other works that do not rely on symbolic execution. In particular, [6] derives pre-conditions from assertions already present in the code using abstract interpretation. [23] uses dynamic analysis to augment simple programmer-written contracts with candidate post-conditions that describes precisely what the code is doing, building upon techniques developed initially in the Daikon tool [9] for proposing likely invariants. This kind of inference is dual to ours, in the sense that we infer contracts in a top-down approach, in order to express what conditions individual components should fulfill inside a broader system, while the works mentioned above are bottom-up, encapsulating the behavior of actual code in contracts in order to check whether callers can use this particular implementation. The same can be said of works that aim at generating transition systems modeling the behavior of programs, either as message sequence charts as in [16], or as scenarios expressed under the form of live sequence charts, as in [17].

8 Conclusion

In this work, we extended the IOSTS framework with programs which are specified with contracts and we adapted symbolic execution techniques to deal with them. This gives rise to two main results. First, we study how contracts impact path conditions and describe the feasibility condition of the entire symbolic execution tree. Second, we show that path conditions can be used to infer contracts for programs in order to specify what these programs should do in the context of the system under test. Such contracts can then be used for unitary testing purposes, while feasibility preservation theorem gives some guarantees that program calls will not get in the way during integration testing.

The contribution of this paper is mainly theoretical, only illustrated by a toy example by lack of space. In [4], we provide a more realistic example involving a program call for giving the money change by considering three possible values for coins and the state of the coin reserve of the vending machine.

Implementation of the technique presented in this paper is currently under development in the Diversity [8] symbolic execution tool and the Frama-C [15] C code analysis framework using the ACSL specification language [2] as target for contract inference.

This work is in its early stages, nevertheless it provides a promising framework to explore integration testing for systems whose user scenarios are described using some IOSTS extensions (e.g. UML Sequence Diagrams [19]) and whose unitary bricks are program calls.