Copyright © 2003 Elsevier Science (USA). All rights reserved.
Pair-independence and freeness analysis through linear refinement
Giorgio Levi (a) and Fausto Spoto (b)
Received 29 May 2001.
Abstract
Linear refinement is a technique for systematically constructing more precise abstract domains for program analysis starting from the basic domain which represents just the property of interest. We use here linear refinement to construct a domain for pair-independence and freeness analysis of logic programs which is strictly more precise than Jacobs and Langen’s domain for sharing analysis endowed with freeness information. Moreover, it can be used for abstract compilation, while Jacobs and Langen’s domain can only be used for abstract interpretation. We provide an approximate representation of our domain and algorithms for the abstract operations. We describe an implementation of an analyser which uses abstract compilation over our domain and its evaluation over a set of benchmarks. This shows that its precision is comparable to that of a traditional sharing and freeness analysis performed through abstract interpretation. To the best of our knowledge, this is the first implementation of a sharing analysis based on abstract compilation, as well as the first implementation of a static analysis based on a new domain developed through linear refinement.
Article Outline
- 1. Introduction
- 2. Related work
- 3. Preliminaries
- 3.1. Terms, substitutions, and Herbrand constraints
- 3.2. The s-semantics
- 3.3. Abstract interpretation
- 3.4. Abstract compilation
- 3.5. Goal-independence
- 3.6. Linear refinement
- 4. Pair-independence analysis
- 5. Freeness analysis
- 6. Linear refinement revisited
- 7. Pair-independence and freeness
- 8. A representation
- 9. Abstraction
- 10. Implementation
- 10.1. Normalisation
- 10.2. Abstraction
- 10.3. Fixpoint computation
- 10.4. Reduction rules
- 10.5. Improving the efficiency
- 10.6. The result of the analysis
- 11. Experimental evaluation
- 12. Conclusions
- Acknowledgements
- Appendix A. Proofs of Section 4
- Appendix B. Proofs of Section 8
- Appendix C. Proofs of Section 9
- Appendix D. Proofs of Section 10
- References
1. Introduction
This paper is concerned with the systematic design, by means of linear refinement, of a new abstract domain for two important properties of logic programs, i.e., pair-independence and freeness. Pair-independence analysis [4 and 38] is concerned with determining at compile-time a superset of the set of pairs of variables which, in a given program point, can be bound at run-time to two terms which share some variable. It is a particular case of set-(in)dependence analysis, also called sharing analysis [6, 7, 28, 29, 30, 31 and 36]. In set-independence analysis, not only pairs but sets of variables are considered. (In)dependence analysis is useful for avoiding occur-check [38] and for automatic program parallelisation [28 and 36]. As stressed in [4], pair-(in)dependence information is actually needed in program analysis and transformation, and set-(in)dependence information is redundant w.r.t. pair-(in)dependence information.
Freeness analysis [6, 7, 10, 11, 27, 30 and 36] is concerned with determining at compile-time a subset of variables which are guaranteed to be bound at run-time to some variable in a given program point. Freeness analysis is useful for optimising unification, for goal reordering, for avoiding type checking and, again, in automatic program parallelisation. It is well known that performing sharing and freeness analysis in conjunction improves the precision of both [28 and 36].
Linear refinement [24] is a technique for systematically constructing abstract domains for program analysis. Given a basic abstract domain representing just the property of interest and a concrete operation (which, since we are considering logic programs, is usually unification) a new more accurate domain is constructed. The new domain leads to more precise abstract operations.
The first contribution of this paper is the definition, through abstract interpretation [16 and 17] and linear refinement, of a new domain for pair-independence and freeness. The use of linear refinement for the definition of our domain, differently from the Sharing×Free domain [28 and 36], leads to simple and general definitions and proofs. It is worth noting that the original correctness proof for the domain Sharing is very complex and uses a large part of Langen’s PhD thesis [31]. We also show, within the linear refinement framework, why and how independence information interacts with freeness information. An important feature of our domain is that it can be used for abstract compilation [22 and 26], which is an application of abstract interpretation where, rather than computing the abstract denotation of a program by executing its concrete code over abstract data, the code itself is abstracted and replaced by abstract code, where concrete data structures are replaced by their abstraction. As a consequence, the computation of the abstract denotation can be achieved by the same algorithm used in the concrete computation.
The second contribution is the design of a computationally feasible representation of our domain, together with algorithms for computing an approximation of the concrete operations. Since this approximation can reduce the theoretical precision of our domain, we describe a prototypical analyser for pair-independence and freeness, based on abstract compilation and a fixpoint semantics. The use of a fixpoint semantics results in a goal-independent analysis. This means that the program is analysed for the most general goals only. More instantiated goals are analysed by using the analysis of the most general goals. We evaluate our analyser over a set of benchmarks. Although it is just a prototype, our evaluation shows that it is efficient enough for practical use on small benchmarks. Its precision is shown to be comparable to that of a traditional goal-dependent analysis.
To the best of our knowledge, this is the first implementation of a goal-independent sharing analysis based on abstract compilation, as well as the first implementation of a static analysis based on a new domain developed through linear refinement.
This paper is organised as follows. Section 2 discusses related work. Section 3 introduces preliminary definitions. Sections 4 and 5 introduce two basic domains for pair-independence and freeness analysis, respectively, and show that their linear refinement does not lead to useful domains. In Section 6 we justify this result and in Section 7 we show why it is useful to combine the two analyses by using a domain which is defined as the linear refinement of the reduced product of the basic domains for pair-independence and for freeness. This domain is shown to be more precise than the domain of [28 and 36]. Section 8 defines a data structure which can be used as an approximate representation for our new domain, together with algorithms for the abstract operations. Section 9 shows an algorithm for computing the abstraction map. Section 10 describes the implementation of a prototypical analyser, and Section 11 reports its evaluation over a set of benchmarks. Finally, Section 12 draws some conclusions. Most of the proofs are kept in separate appendices, for the convenience of the reader.
Preliminary and partial versions of this paper appeared in [1 and 33].
2. Related work
Almost all the domains developed for sharing analysis are not amenable to abstract compilation [6, 28, 29, 30, 31 and 36]. Moreover, they have been developed without using any systematic technique like linear refinement.
To the best of our knowledge, only [7 and 13] provide abstract domains for sharing analysis which can be used for abstract compilation. The domain in [13] is isomorphic to the Sharing domain of [28 and 31]. This means that, when used for abstract compilation, in order to obtain a useful precision, it must be coupled with a domain expressing further information, like freeness or linearity. This domain must in turn be amenable to abstract compilation. We do not know of any prototypical analyser implemented through their domain. The domain in [7] models sharing, freeness and groundness, but it is not developed through abstract interpretation. Instead, it uses pre-interpretations.
In the context of logic languages, linear refinement has already been used for reconstructing the domain Pos for groundness analysis [37]. Moreover, it has been used to develop new domains for type [32] and freeness analysis [27].
3. Preliminaries
3.1. Terms, substitutions, and Herbrand constraints
We denote by $\wp(S)$ the powerset of a set $S$, by $\#S$ its cardinality and by $\wp_f(S)$ the set of all subsets of $S$ of finite cardinality.
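In this paper, we assume that $\mathcal{V}$ is an infinite set of variables, $V \in \wp_f(\mathcal{V})$ and $\Sigma$ is a set of function symbols with associated arity, containing at least a symbol of arity 0. We define $\mathit{terms}(\Sigma,V)$ as the minimal set of terms built from $V$ and $\Sigma$ as: $V \subseteq \mathit{terms}(\Sigma,V)$ and if $t_1,\ldots,t_n \in \mathit{terms}(\Sigma,V)$ and $f \in \Sigma$ has arity $n \ge 0$, then $f(t_1,\ldots,t_n) \in \mathit{terms}(\Sigma,V)$. Let $t \in \mathit{terms}(\Sigma,V)$. By $\mathit{vars}(t)$ we denote the set of variables which occur in $t$. If $\mathit{vars}(t) = \emptyset$, then $t$ is ground. It is linear if every $v \in V$ occurs at most once in $t$. If $V \in \wp_f(\mathcal{V})$ and $x \in \mathcal{V}$, then $V \cup x$ means $V \cup \{x\}$ and $V \setminus x$ means $V \setminus \{x\}$. Syntactical substitution in $t$ of $x$ with $t' \in \mathit{terms}(\Sigma,V)$ is denoted by $t[x \mapsto t']$.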
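A substitution $\theta$ is a map from variables to terms. Its domain is denoted by $\mathit{dom}(\theta)$ and the set of variables in its range by $\mathit{rng}(\theta)$. The set of idempotent substitutions $\theta$ such that $\mathit{dom}(\theta) \cup \mathit{rng}(\theta) \subseteq V$ and $\mathit{dom}(\theta) \cap \mathit{rng}(\theta) = \emptyset$ is denoted by $\Theta_V$. We write $\theta \in \Theta_V$ extensionally as $\theta = \{v_1 \mapsto t_1, \ldots, v_n \mapsto t_n\}$, meaning that $\mathit{dom}(\theta) = \{v_1,\ldots,v_n\}$ and $\theta(v_i) = t_i$ for every $i = 1,\ldots,n$. Let $\theta \in \Theta_V$ and $R \subseteq V$. We define $\theta|_R(x) = \theta(x)$ if $x \in R$ and $\theta|_R(x) = x$ if $x \in V \setminus R$. If $t \in \mathit{terms}(\Sigma,V)$ then $t\theta \in \mathit{terms}(\Sigma,V)$ is the term obtained by replacing every variable $x$ in $t$ by $\theta(x)$. Composition of substitutions $\theta,\sigma \in \Theta_V$ is defined as $(\theta\sigma)(x) = \theta(x)\sigma$ for every $x \in V$. We recall that it is associative, the empty substitution $\epsilon$ is the neutral element and, for each term $t$, we have $t(\theta\sigma) = (t\theta)\sigma$.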
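The application and composition of substitutions recalled above can be sketched in a few lines. This is only an illustrative sketch under assumptions that are not part of the paper: variables are modelled as Python strings, compound terms as tuples `(functor, arg1, ..., argn)`, constants as one-element tuples, and the names `apply_subst` and `compose` are hypothetical.

```python
def apply_subst(term, theta):
    """Return t(theta): replace every variable x in term by theta(x)."""
    if isinstance(term, str):              # a variable
        return theta.get(term, term)       # theta(x) = x outside dom(theta)
    functor, *args = term
    return (functor, *(apply_subst(a, theta) for a in args))


def compose(theta, sigma):
    """Return the composition theta sigma, with (theta sigma)(x) = theta(x) sigma."""
    result = {x: apply_subst(t, sigma) for x, t in theta.items()}
    for x, t in sigma.items():
        # for x outside dom(theta) we have theta(x) = x, hence x sigma
        result.setdefault(x, t)
    # drop identity bindings so that the domain stays minimal
    return {x: t for x, t in result.items() if t != x}


theta = {"X": ("f", "Y")}
sigma = {"Y": ("a",)}
term = ("g", "X", "Y")

# the law t(theta sigma) = (t theta) sigma on one instance
lhs = apply_subst(term, compose(theta, sigma))
rhs = apply_subst(apply_subst(term, theta), sigma)
print(lhs == rhs)  # True
```

The empty dictionary plays the role of the empty substitution: composing with it on either side leaves a substitution unchanged.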
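The set $C_V$ of finite sets of Herbrand equations is $\wp_f(\{t_1 = t_2 \mid t_1,t_2 \in \mathit{terms}(\Sigma,V)\})$. A substitution $\theta \in \Theta_V$ can be seen as the set of equations $\{x = \theta(x) \mid x \in \mathit{dom}(\theta)\}$. We hence assume that $\Theta_V \subseteq C_V$. Let $c \in C_V$. We say that $c\theta$ is true if $t_1\theta$ is syntactically equal to $t_2\theta$ for every $(t_1 = t_2) \in c$. We know [35] that if there exists $\theta \in \Theta_V$ such that $c\theta$ is true, then $c$ can be put in the normal form $\mathit{mgu}(c) \in \Theta_V$ which is such that $c\theta$ is true if and only if $\mathit{mgu}(c)\theta$ is true. If no $\theta \in \Theta_V$ exists such that $c\theta$ is true, then $\mathit{mgu}(c)$ is undefined. Note that $c \in C_V$ in normal form can be seen as a substitution, and hence the notations $c(x)$ and $tc$ are defined.
Let $\mathcal{W}$ be an infinite set of variables disjoint from $\mathcal{V}$. We define the set of existential Herbrand constraints $H_V = \{\exists_W c \mid W \in \wp_f(\mathcal{W}) \text{ and } c \in C_{V \cup W}\}$. We define $\mathit{sol}_V(\exists_W c) = \{\theta|_V \mid \theta \in \Theta_{V \cup W},\ \mathit{rng}(\theta) \subseteq V \text{ and } c\theta \text{ is true}\}$. Hence $\mathit{sol}_V(\exists_W c) = \mathit{sol}_V(\exists_W \mathit{mgu}(c))$. For instance, if $V = \{X,Z\}$, then […].
A constraint $\exists_W c$ is in normal form if $c$ is in normal form. It is consistent if $\mathit{sol}_V(\exists_W c) \neq \emptyset$. Two constraints $h_1,h_2 \in H_V$ are equivalent if $\mathit{sol}_V(h_1) = \mathit{sol}_V(h_2)$. For instance, the constraints […] and […] are equivalent. In the following, a constraint will stand for its equivalence class. Since, as shown above, every consistent existential Herbrand constraint has an equivalent normal form, in the following we will consider only normal existential Herbrand constraints.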
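The normal form of a set of Herbrand equations can be computed by a standard unification procedure with occur-check. The following is a minimal sketch of such a procedure, not the normalisation algorithm of the paper. It assumes the hypothetical encoding of variables as strings and of compound terms as tuples `(functor, arg1, ..., argn)`; the function returns `None` when the normal form is undefined.

```python
def walk(term, subst):
    """Follow variable bindings until an unbound variable or a compound term."""
    while isinstance(term, str) and term in subst:
        term = subst[term]
    return term


def occurs(var, term, subst):
    """Occur-check: does var occur in term under the current bindings?"""
    term = walk(term, subst)
    if isinstance(term, str):
        return term == var
    return any(occurs(var, arg, subst) for arg in term[1:])


def resolve(term, subst):
    """Apply the bindings exhaustively, yielding an idempotent substitution."""
    term = walk(term, subst)
    if isinstance(term, str):
        return term
    return (term[0], *(resolve(arg, subst) for arg in term[1:]))


def mgu(equations):
    """Return a most general unifier of a list of pairs (t1, t2), or None."""
    subst = {}
    stack = list(equations)
    while stack:
        s, t = stack.pop()
        s, t = walk(s, subst), walk(t, subst)
        if s == t:
            continue
        if isinstance(s, str):
            if occurs(s, t, subst):
                return None          # e.g. X = f(X)
            subst[s] = t
        elif isinstance(t, str):
            if occurs(t, s, subst):
                return None
            subst[t] = s
        elif s[0] != t[0] or len(s) != len(t):
            return None              # functor or arity clash
        else:
            stack.extend(zip(s[1:], t[1:]))
    return {v: resolve(v, subst) for v in subst}


# f(X, a) = f(g(Y), Y) has the normal form {X -> g(a), Y -> a}
print(mgu([(("f", "X", ("a",)), ("f", ("g", "Y"), "Y"))]))
```

Existential quantification is not modelled here; a constraint $\exists_W c$ could be represented, under the same assumptions, as a pair made of the set $W$ and the dictionary returned by `mgu`.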
3.2. The s-semantics
We use $H_V$ as the computational domain of programs. Since we will later define abstractions of $H_V$ (Section 8), we decorate the following definitions with $H_V$. Once an abstraction is defined, we simply substitute it for $H_V$.
Definition 1. Let $\Pi$ be a finite set of predicate symbols with associated arity. A logic program over $H$ is a finite set of clauses
$$p(X_1,\ldots,X_n) \mathrel{:\!-} G_1,\ldots,G_m \qquad (1)$$
where $p \in \Pi$, $m \ge 0$, $\{X_1,\ldots,X_n\} \subseteq V$ are distinct and for every $i = 1,\ldots,m$ we have $G_i \in H_V$ or $G_i = q(Y_1,\ldots,Y_k)$ with $q \in \Pi$ and $Y_1,\ldots,Y_k \in V$ distinct. The left-hand side of (1) is the head of the clause, the right-hand side is its tail. We say that the clause (1) defines the predicate $p$. Every predicate must be defined by at least one clause of $P$. If several clauses of $P$ define the same predicate, they must use the same variables $X_1,\ldots,X_n$ in (1).
The s-semantics of logic programs [5] is based on a fixpoint definition over interpretations. Interpretations work over the collecting version [17] of $H_V$, i.e., over the lattice $\langle \wp(H_V), \cap, \cup, H_V, \emptyset \rangle$.
Definition 2. An interpretation over $H$ is a function $I$ which maps every $p \in \Pi$ to an element of $\wp(H_{\{X_1,\ldots,X_n\}})$, where $\{X_1,\ldots,X_n\}$ are the variables in the head of the clauses which define $p$ (Definition 1). The set of interpretations over $H$ is denoted by […].
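Four operations over $H_V$, called conjunction, restriction, expansion, and renaming, respectively, are used to define the s-semantics. They are defined in Definition 3. The operation $\star^{H_V}$ computes the conjunction of two constraints through the normalisation procedure. The restrict and expand operations remove a variable from and add a variable to a constraint, respectively. Note that expand is not the identity function but an embedding, as its signature shows. The operation rename gives a new name to a variable.
Definition 3. We define
$\star^{H_V} : H_V \times H_V \to H_V$,
$\mathrm{restrict}^{H_V}_x : H_V \to H_{V \setminus x}$ with $x \in V$,
$\mathrm{expand}^{H_V}_x : H_V \to H_{V \cup x}$ with $x \notin V$,
$\mathrm{rename}^{H_V}_{x \to n} : H_V \to H_{(V \setminus x) \cup n}$ with $x \in V$ and $n \notin V$
as
[…]
$\mathrm{expand}^{H_V}_x(\exists_W c) = \exists_W c$,
$\mathrm{rename}^{H_V}_{x \to n}(\exists_W c) = \exists_W (c[x \mapsto n])$.
The operations of Definition 3 are pointwise extended to $\wp(H_V)$. For instance, if $S_1,S_2 \subseteq H_V$, then $S_1 \star^{H_V} S_2 = \{h_1 \star^{H_V} h_2 \mid h_1 \in S_1,\ h_2 \in S_2 \text{ and } h_1 \star^{H_V} h_2 \text{ is defined}\}$ and $\mathrm{restrict}^{\wp(H_V)}_x S = \{\mathrm{restrict}^{H_V}_x h \mid h \in S\}$. On the collecting domain $\wp(H_V)$ a new operation $\cup^{H_V}$ is defined as $\cup^{H_V}(S_1,S_2) = S_1 \cup S_2$.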
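We abuse notation and use the operations of Definition 3 with sets of variables instead of single variables. For instance, $\mathrm{restrict}^{H_V}_{\{x_1,\ldots,x_n\}}$ stands for the composition $\mathrm{restrict}^{H_V}_{x_1} \circ \cdots \circ \mathrm{restrict}^{H_V}_{x_n}$ and $\mathrm{rename}^{H_V}_{x_1,\ldots,x_m \to n_1,\ldots,n_m}$ for the composition $\mathrm{rename}^{H_V}_{x_1 \to n_1} \circ \cdots \circ \mathrm{rename}^{H_V}_{x_m \to n_m}$.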
The s-semantics of a program is the least fixpoint of its immediate consequence operator.
Definition 4. Let $P$ be a program over $H$. Its immediate consequence operator $T_P$ is such that
[…]
As one can see from Definition 4, the denotation of the tail of a clause is computed by using the conjunction operator $\star^{H_V}$ applied to the denotations of the components of the tail. The operator $T_P$ then projects ($\mathrm{restrict}^{H_V}$) this denotation over the variables that occur in the head of the clause. The denotation of a predicate $q$ in the tail of a clause is computed by fetching its current interpretation, by renaming ($\mathrm{rename}^{H_V}$) its variables in order to reflect its calling context and by enlarging ($\mathrm{expand}^{H_V}$) the set of variables in order to cover the entire set $V$.
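The computation of a least fixpoint by iterating an immediate consequence operator from the empty interpretation can be illustrated on a much simpler setting than the s-semantics. The sketch below assumes propositional Horn clauses, encoded as hypothetical pairs `(head, body)`, and uses the classical operator which derives a head whenever its whole body already holds. Constraints, renaming and restriction are deliberately left out.

```python
def immediate_consequence(program, interp):
    """One application of the operator: heads whose body holds in interp."""
    return frozenset(head for head, body in program if set(body) <= interp)


def least_fixpoint(program):
    """Iterate from the empty interpretation until nothing changes."""
    interp = frozenset()
    while True:
        next_interp = immediate_consequence(program, interp)
        if next_interp == interp:
            return interp
        interp = next_interp


# p.   q :- p.   r :- q, s.
PROGRAM = [("p", []), ("q", ["p"]), ("r", ["q", "s"])]
print(sorted(least_fixpoint(PROGRAM)))  # ['p', 'q']
```

Since the operator is monotonic and the set of atoms is finite, the iteration builds an increasing chain and terminates. The atom `r` is never derived because `s` has no defining clause.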
3.3. Abstract interpretation
Abstract interpretation [16 and 17] allows us to reason about the abstraction relation between two different domains (the concrete and the abstract domain).
We recall that a complete lattice $L$ is a partially ordered set where least upper bound (or join, denoted by $\sqcup$) and greatest lower bound (or meet, denoted by $\sqcap$) exist for every subset of $L$. A Moore family $M$ of $C$ is a topped completely meet-closed subset of $C$, i.e., $M$ contains the top element of $C$ and is closed w.r.t. arbitrary meets. The Moore ($\sqcap_C$) closure of a set $A \subseteq C$ is denoted by $c(A)$.
Definition 5. Let $\langle C, \le \rangle$ and $\langle A, \preceq \rangle$ be two complete lattices (the concrete and the abstract domain). A Galois connection from $C$ to $A$ is a pair of monotonic maps $\alpha : C \to A$ (abstraction) and $\gamma : A \to C$ (concretisation) such that for each $x \in C$ we have $x \le \gamma\alpha(x)$ and for each $y \in A$ we have $\alpha\gamma(y) \preceq y$. A Galois insertion is a Galois connection where $\alpha\gamma$ is the identity map on $A$.
The composition of Galois connections is a Galois connection. The composition of Galois insertions is a Galois insertion. A Galois connection is a Galois insertion if and only if $\gamma$ is one-to-one or, equivalently, if and only if $\alpha$ is onto. In a Galois insertion, the abstraction map uniquely identifies the concretisation map and vice versa. It is well known [16] that the set of Galois insertions from $C$ to $A$ is isomorphic to the set of the Moore families of $C$. This means that every Moore family $M \subseteq C$ is an abstract domain whose concretisation map is the identity map. This way of looking at abstract domains allows us to distinguish the property of a domain from the properties of its representations.
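The view of abstract domains as Moore families can be made concrete on a small finite lattice. The sketch below assumes that the concrete domain is the powerset of a finite universe ordered by inclusion, so that meets are intersections and the top element is the universe itself. The names `moore_closure` and `alpha` are hypothetical; `alpha` computes the best abstraction of an element as the smallest member of the family which contains it, the concretisation map being the identity.

```python
from itertools import combinations


def moore_closure(family, top):
    """Smallest family containing `family` and `top`, closed under intersection."""
    closed = {frozenset(top)} | {frozenset(s) for s in family}
    changed = True
    while changed:
        changed = False
        for a, b in combinations(list(closed), 2):
            meet = a & b
            if meet not in closed:
                closed.add(meet)
                changed = True
    return closed


def alpha(moore, x):
    """Best abstraction of x: the meet of all elements of the family above x."""
    x = frozenset(x)
    result = None
    for m in moore:
        if x <= m:
            result = m if result is None else result & m
    return result


UNIVERSE = {1, 2, 3}
M = moore_closure([{1, 2}, {2, 3}], UNIVERSE)

print(sorted(sorted(m) for m in M))   # [[1, 2], [1, 2, 3], [2], [2, 3]]
print(sorted(alpha(M, {1})))          # [1, 2]
```

In this finite setting closure under binary intersections is enough to obtain closure under arbitrary meets, and `alpha` never returns `None` because the top element belongs to the family. One can check that every `x` is contained in `alpha(M, x)` and that `alpha` is the identity on `M`, which are the two conditions of a Galois insertion with identity concretisation.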
Let $f : C^n \to C$ be a concrete operator and let $\tilde{f} : A^n \to A$. Then $\tilde{f}$ is a correct approximation of $f$ if for all $y_1,\ldots,y_n \in A$ we have $\alpha(f(\gamma(y_1),\ldots,\gamma(y_n))) \preceq \tilde{f}(y_1,\ldots,y_n)$, where $\preceq$ is the ordering of $A$.
Every abstract domain $A$, with abstraction function $\alpha_A$, allows us to compute the corresponding abstract s-semantics of a logic program, by substituting $A$ for $H$ in Definitions 1–4. The denotation of a Herbrand constraint becomes its abstraction. Hence, we modify Definition 4 with $[\![h]\!]I = \alpha_A(h)$. The precision of the abstract semantics (analysis) depends on the precision of the abstract domain.
3.4. Abstract compilation
As we said at the end of the previous section, we can compute the abstract s-semantics of a program by using its same definition instantiated over the abstract domain A. However, this requires abstracting the concrete constraints in the program at every iteration of the immediate consequence operator (Definition 4). It is hence natural to optimise the fixpoint computation by abstracting the logic program once and for all into an abstract logic program, and by then computing its s-semantics without using the abstraction function anymore. In such a case, Definition 4 can be instantiated to the abstract domain A without the modification described at the end of the previous section. This technique is called abstract compilation [12 and 26].
Example 6. The computation over $A$ of the abstract s-semantics of the following logic program:
[…]
proceeds as follows. We first substitute the concrete constraints with their abstraction. Let […] and […]. The compiled program is
[…]
We then compute the fixpoint of the $T_P^A$ operator (Definition 4).
Note that abstract compilation can be used only if all the abstract operations are defined over elements of the abstract domain only, which is what we assume when we instantiate Definition 3 over the abstract domain A. Instead, the conjunction operation of the domain Sharing of [31] is defined between a concrete element and an abstract one. This does not allow us to use abstract compilation for that domain. Actually, even Definition 4 must be modified to fit that domain.
3.5. Goal-independence
By goal-independence we mean that the (abstract) semantics of a program is computed for the most general goals only. The semantics of the other goals is derived from that of the most general goals by instantiation and without using the text of the program. Hence the semantics of the most general goals must contain all the information needed to derive the semantics of the other goals.
The advantage of goal-independence is that the analysis becomes naturally modular, since it cannot look at the text of other modules, but only at the summary information gained from them. Another advantage of goal-independence is that, once a module has been analysed, its source code can be kept secret. Thus the analysis can be applied also when the code cannot be publicly divulged for copyright reasons.
A typical example of a goal-independent analysis is that obtained through the computation of an abstract s-semantics (Section 3.2). Since only the most general goals are analysed, the analysis of a more instantiated goal is derived from the analysis of the corresponding most general goal through its conjunction with the abstraction of the constraint which instantiates it.
It has been shown that, in general, a goal-independent analysis is less precise than a goal-dependent analysis computed by using the same abstract domain, and that, for domains which are condensing w.r.t. conjunction, both analyses have the same precision [25].
Note that our notion of goal-independence is different from that used in [9 and 18], where the program is still needed to derive the goal-dependent information from the (so-called) goal-independent analysis of the same program. Hence, our notion is in our opinion more correct.
3.6. Linear refinement
Given an abstract domain $A \subseteq C$, a domain refinement operator $R$ yields an abstract domain $R(A) \subseteq C$ which is more precise than $A$, i.e., which contains $A$ [19 and 23]. A classical domain refinement operator is the reduced product $A \sqcap B$ of two domains $A$ and $B$, both contained in another domain $C$ [16]. It is isomorphic to the Cartesian product of $A$ and $B$, modulo the equivalence relation $\langle a_1,b_1 \rangle \equiv \langle a_2,b_2 \rangle$ if and only if $a_1 \sqcap b_1 = a_2 \sqcap b_2$. Hence pairs with the same meaning are identified.
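Linear refinement [24] is a slight generalisation of Cousot’s reduced power operation [16]. It allows us to include in a domain the information related to the propagation of the abstract property of interest before and after the application of a partial operator over $C$. We consider here just the case when $C = \wp(H_V)$ and the operator is the pointwise extension of conjunction (Definition 3).
Let $a,b \in \wp(H_V)$. We define the linear refinement of $a$ w.r.t. $b$ as
$$a \to b = \{h \in H_V \mid \text{for every } h' \in a, \text{ if } h \star^{H_V} h' \text{ is defined then } h \star^{H_V} h' \in b\}, \qquad (2)$$
where $\star^{H_V}$ is the conjunction of Definition 3.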
Example 7. For every $v \in V$, let $v = \{\exists_W c \in H_V \mid \mathit{vars}(c(v)) = \emptyset\}$. The set $v$ is the set of constraints which bind $v$ to a ground term. Let $x,y \in V$. Eq. (2) becomes in this case […]. A constraint $h \in x \to y$ is such that in all its instantiations if $x$ is ground then $y$ is ground. Equivalently, one can say that $h$ transforms the groundness of $x$ into the groundness of $y$ upon conjunction. For instance, we have […].
Given an abstract domain $L \subseteq \wp(H_V)$, we define $\overrightarrow{L} = c\{a \to b \mid a,b \in L\}$. $\overrightarrow{L}$ is then the collection of all possible intersections of arrows which can be built from elements of $L$. Note that $l \to (a \sqcap b) = (l \to a) \sqcap (l \to b)$.
The linear refinement $L \to L$ of $L$ is the domain $L \sqcap \overrightarrow{L}$ (4). If $L \subseteq \overrightarrow{L}$, i.e., if the properties in $L$ are (degenerate) cases of intersections of arrows, (4) can be simplified into $L \to L = \overrightarrow{L}$. This simplification is relevant since it allows a simpler representation and simpler operations for $L \to L$. Indeed, we need to represent elements and operations over L
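
The arrow construction of Example 7 can be explored on a toy model. The sketch below is not the Herbrand domain of the paper: it assumes that a constraint is modelled by the non-empty set of its groundness models (each model being the set of variables it makes ground), that conjunction is the intersection of model sets, undefined when empty, and that $a \to b$ collects the constraints whose conjunction with any element of $a$ is either undefined or in $b$. All names (`conj`, `ground`, `arrow`) are hypothetical.

```python
from itertools import combinations

VARS = ("x", "y")


def powerset(items):
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


MODELS = powerset(VARS)                          # sets of ground variables
CONSTRAINTS = [h for h in powerset(MODELS) if h]  # non-empty sets of models


def conj(h1, h2):
    """Toy conjunction: common models, or None when it is undefined."""
    h = h1 & h2
    return h if h else None


def ground(v):
    """Property 'v is ground': constraints all of whose models contain v."""
    return {h for h in CONSTRAINTS if all(v in m for m in h)}


def arrow(a, b):
    """Linear refinement of a w.r.t. b over the toy domain."""
    return {h for h in CONSTRAINTS
            if all(conj(h, k) is None or conj(h, k) in b for k in a)}


x_to_y = arrow(ground("x"), ground("y"))

# "x ground if and only if y ground" propagates groundness from x to y
iff = frozenset({frozenset(), frozenset({"x", "y"})})
# "x ground and y not ground" does not
only_x = frozenset({frozenset({"x"})})

print(iff in x_to_y)     # True
print(only_x in x_to_y)  # False
```

In this toy model the arrow coincides with the set of constraints all of whose models containing `x` also contain `y`, which matches the reading of $x \to y$ as "if $x$ is ground then $y$ is ground".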






