Information and Computation
Volume 183, Issue 1, 25 May 2003, Pages 19-42
International Workshop on Implicit Computational Complexity (ICC'99)
 
doi:10.1016/S0890-5401(03)00014-2
Copyright © 2003 Elsevier Science (USA). All rights reserved.

Term rewriting for normalization by evaluation

Ulrich Berger, Matthias Eberl and Helmut Schwichtenberg (corresponding author)

Mathematisches Institut der Ludwig-Maximilians-Universität München, D-80333 München, Germany

Received 2 November 1999. 
Available online 15 March 2003.

Abstract

We extend normalization by evaluation (first presented in [5]) from the pure typed λ-calculus to general higher type term rewriting systems and prove its correctness w.r.t. a domain-theoretic model. We distinguish between computational rules and proper rewrite rules. The former form a rather restricted class of rules, which, however, allows for a more efficient implementation.

Article Outline

1. Introduction
2. A simply typed λ-calculus with constants
2.1. Types, terms, rewrite rules
2.2. Computation rules
2.3. Examples
2.4. Normalizable terms and their normal forms
2.5. Term families
3. Normalization by evaluation
3.1. Domain theoretic semantics of simply typed λ-calculi
3.2. Interpretation of the types
3.3. Reification and reflection
3.4. Predecessor functions
3.5. Interpretation of the constants
3.6. Correctness of normalization by evaluation
Acknowledgements
References

1. Introduction

It is well known that implementing normalization of λ-terms in the usual recursive fashion is quite inefficient. However, it is possible to compute the long normal form of a λ-term by evaluating it in an appropriate model (cf. [5]). When one uses for that purpose the built-in evaluation mechanism of, e.g., Scheme (a pure Lisp dialect), one obtains an amazingly fast algorithm called “normalization by evaluation”, or NbE for short. In the context of type-directed partial evaluation [8] it has been analyzed in what sense NbE is more efficient, and why: a point-by-point comparison between NbE and a naive, symbolic normalizer can be found in [4]. The essential idea is to find an inverse to evaluation, converting a semantic object into a syntactic term. This normalization procedure is used and tested in the proof system Minlog developed in Munich (cf. [2]). Notice, however, that once NbE is expressed in a functional programming language, the evaluation order of this language (call-by-value for Scheme) determines the reduction order of NbE (applicative order for a call-by-value language). It is thus easy to defeat NbE in Scheme by normalizing the application of a nonstrict function to an expression that is expensive to normalize. For such a term, a symbolic normalizer following a normal order reduction strategy can easily be more efficient.

Obviously, for applications pure typed λ-terms are not sufficient; one clearly needs constants as well. In [4] NbE has been extended to term systems with higher order term rewrite rules. The present paper adds a distinction between what we call computational rules and (proper) rewrite rules; NbE seems to be much more efficient for the former than for the latter. In our implementation (in the Minlog system) we therefore use computational rules whenever possible.

A related approach (using a glueing construction) is elaborated by Coquand and Dybjer in [6]. Another related paper is Altenkirch et al. [1]; there a cartesian closed category is defined which has the property that the interpretation of the simply typed lambda calculus in it yields the reduction-free normalization algorithm from [5], as well as its correctness. Moreover, Danvy (cf. e.g., [8]) has successfully used this algorithm (or more precisely its call-by-value counterpart) in the context of partial evaluation. Filinski [10] also treats NbE for an extension of the λ-calculus by constants, where nontermination is allowed. However, he does not consider constants whose meaning is only given operationally, i.e., by arbitrary rewrite rules. Therefore the normal proof technique employing the logical relation “the value of expression e in environment δ is a” is available in his case, whereas in ours it is more convenient to follow a different approach, via an appropriate inductive generation of the reducibility relation.

Why should one be interested in the correctness of NbE for general rewrite rules, where neither termination nor even confluence is assumed? One reason is that in an interactive proof development system (Minlog in our case) it is convenient not to have to deal explicitly with equality axioms, but rather to identify terms with the same normal form, modulo a given set of rewrite rules. Then an efficient normalization algorithm such as NbE to test for equality clearly is useful. However, one does not want to have the obligation to prove termination and confluence of the whole set of rewrite rules whenever a new one is added.

The aim of the present paper is to develop the theory of normalization by evaluation from scratch, up to and including (some generalizations of) Gödel’s system T of higher order primitive recursion. In fact, we will treat almost arbitrary rewrite systems.

Let us begin with a short explanation of the essence of the method for normalizing typed λ-terms by means of an evaluation procedure of some functional programming language such as Scheme. For simplicity we return to the simplest case, simply typed λ-calculus without constants.

Simple types are built from ground types τ by ρ → σ (later also products ρ×σ will be included). The set Λ of terms is given by $x^\sigma$, $(\lambda x^\rho M^\sigma)^{\rho\to\sigma}$, $(M^{\rho\to\sigma} N^\rho)^\sigma$; let $\Lambda_\rho$ denote the set of all terms of type ρ. The set LImage of terms in long normal form (i.e., normal w.r.t. β-reduction and η-expansion) is defined inductively by $(x M_1 \ldots M_n)^\tau$ and $\lambda x M$ (we abbreviate $x M_1 \ldots M_n$ by $x\vec M$ and similarly a list $M_1, \ldots, M_n$ by $\vec M$). By $\mathrm{nf}(M)$ we denote the long normal form of M, i.e., the unique term in long normal form βη-equal to M.

Now we have to choose our model. A simple solution is to take terms of ground type as ground type objects and all functions as possible function type objects:

\[ \llbracket\tau\rrbracket := \Lambda_\tau, \qquad \llbracket\rho\to\sigma\rrbracket := \llbracket\sigma\rrbracket^{\llbracket\rho\rrbracket} \quad \text{(the full function space)}. \tag{1} \]
It is crucial that all terms (of ground type) are present, not just the closed ones. Next we need an assignment ↑ lifting a variable to an object and a function ↓ giving us a normal term from an object. They should meet the following condition, which might be called correctness of normalization by evaluation

\[ \downarrow(\llbracket M \rrbracket_\uparrow) = \mathrm{nf}(M), \tag{2} \]
where $\llbracket M^\rho \rrbracket_\uparrow \in \llbracket\rho\rrbracket$ denotes the value of M under the assignment ↑ and $\mathrm{nf}(M)$ is the long normal form of M. Two such functions ↓ and ↑ can be defined simultaneously, by induction on the type. It is convenient to define ↑ on all terms (not just on variables). Hence for every type ρ we define $\downarrow_\rho : \llbracket\rho\rrbracket \to \Lambda_\rho$ and $\uparrow_\rho : \Lambda_\rho \to \llbracket\rho\rrbracket$ (called reify and reflect) by

\[ \downarrow_\tau(M) := M, \qquad \uparrow_\tau(M) := M, \]
\[ \downarrow_{\rho\to\sigma}(a) := \lambda x\, \downarrow_\sigma(a(\uparrow_\rho(x))) \quad (x \text{ new}), \qquad \uparrow_{\rho\to\sigma}(M)(a) := \uparrow_\sigma(M \downarrow_\rho(a)). \]
Here a little difficulty appears: what does it mean that x is new? This clearly is not a problem for an implementation, where we have an operational understanding and may use something like gensym, but it is for a mathematical model. We will solve this problem by slightly modifying the model and defining $\llbracket\tau\rrbracket$ to be the set of families of terms of type τ (instead of single terms) and setting $\downarrow_{\rho\to\sigma}(a)(k) := \lambda x_k (\downarrow_\sigma(a(\uparrow_\rho(x_k^\infty)))(k+1))$, where $x_k^\infty$ is the constant family $x_k$. The definition of $\uparrow_{\rho\to\sigma}$ has to be modified accordingly. This idea corresponds to a representation of terms in the style of de Bruijn [9]. An advantage of this approach is that the NbE program is purely functional and hence can be verified relatively easily. If side effects were involved the verification would be much more complicated.
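
The modified definition with term families can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the authors' Scheme implementation: the names `Arrow`, `Var`, `Lam`, `App`, `reify`, `reflect`, `evaluate` and `normalize` are chosen here for the example, products and constants are left out, and the input term is assumed to be closed and well typed, so that the generated names x0, x1, ... cannot clash with anything.

```python
from dataclasses import dataclass
from typing import Union

# Types: a ground type is a string; Arrow(dom, cod) stands for dom -> cod.
@dataclass(frozen=True)
class Arrow:
    dom: "Type"
    cod: "Type"

Type = Union[str, Arrow]

# Terms of the pure simply typed lambda calculus (type annotations omitted).
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    var: str
    body: "Term"

@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"

Term = Union[Var, Lam, App]

def reify(ty, a):
    """Down-arrow: turn a semantic object into a term family k -> term."""
    if isinstance(ty, Arrow):
        def family(k):
            x = f"x{k}"                               # the k-th variable
            arg = reflect(ty.dom, lambda _k: Var(x))  # constant family x_k
            return Lam(x, reify(ty.cod, a(arg))(k + 1))
        return family
    return a  # at ground type the object already is a term family

def reflect(ty, fam):
    """Up-arrow: turn a term family into a semantic object."""
    if isinstance(ty, Arrow):
        return lambda a: reflect(
            ty.cod, lambda k: App(fam(k), reify(ty.dom, a)(k)))
    return fam

def evaluate(term, env):
    """Value of a term; abstractions become Python functions."""
    if isinstance(term, Var):
        return env[term.name]
    if isinstance(term, Lam):
        return lambda a: evaluate(term.body, {**env, term.var: a})
    return evaluate(term.fun, env)(evaluate(term.arg, env))

def normalize(ty, term):
    """Long normal form of a closed term of type ty, read off at index 0."""
    return reify(ty, evaluate(term, {}))(0)
```

For instance, `normalize(Arrow(Arrow("o", "o"), Arrow("o", "o")), Lam("f", Var("f")))` η-expands the identity to the term λx0 λx1. x0 x1. Because the index k is threaded through the families, no global gensym counter and hence no side effect is needed.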

The proof of correctness is easy (ignoring the problem with the “new variable”). For the typed lambda calculus without constants we have preservation of values, i.e., $\llbracket M \rrbracket_\xi = \llbracket \mathrm{nf}(M) \rrbracket_\xi$ for all terms M and environments ξ, where $\mathrm{nf}(M)$ is the long normal form of M. Hence we only have to verify $\downarrow(\llbracket N \rrbracket) = N$ for terms N in long normal form, which is straightforward by induction on N:

Case $x^{\rho\to\tau} N^\rho$ (w.l.o.g.)

\[ \downarrow_\tau(\llbracket xN \rrbracket) = \uparrow_{\rho\to\tau}(x)(\llbracket N \rrbracket) = \uparrow_\tau(x \downarrow_\rho(\llbracket N \rrbracket)) = xN. \tag{3} \]

Case $\lambda y N$

\[
\begin{aligned}
\downarrow_{\rho\to\sigma}(\llbracket \lambda y N \rrbracket)
  &= \lambda x\, \downarrow_\sigma(\llbracket \lambda y N \rrbracket(\uparrow_\rho(x))) && x \text{ new} \\
  &= \lambda x\, \downarrow_\sigma(\llbracket N_y[x] \rrbracket) \\
  &= \lambda x\, N_y[x] && \text{by IH} \\
  &=_\alpha \lambda y N.
\end{aligned}
\]
Notice that this is a correctness proof in the style of [5]. The situation is different when we add constants together with rewrite rules, since then preservation of values (in our model) is false in general (cf. Examples 20 and 19 below). However, correctness of normalization by evaluation still holds, but needs to be proven by a different method. It might be worth noting that in the special case where no rewrite or computation rules are present our proof below boils down to the simple correctness proof sketched above.

The structure of the paper is as follows. In Section 2 we present the simply typed λ-calculus with constants and pairing and give some examples of higher order rewrite systems. We also introduce the distinction between computational and (proper) rewrite rules. Then we inductively define a relation $M \to Q$, with the intended meaning that M is normalizable with long normal form Q, and prove in Section 3.6 the correctness of normalization by evaluation by showing that $M \to Q$ (essentially) implies $\downarrow(\llbracket M \rrbracket) = Q$. Hence the mapping $M \mapsto \downarrow(\llbracket M \rrbracket)$ is a normalization function. In order to define the semantics $\llbracket M \rrbracket$ of a term M properly we use domain theory. This is described briefly in Section 3.1.

Note that we prove correctness of NbE w.r.t. a denotational semantics, but do not attempt to prove operational correctness, i.e., the fact that the functional program formalizing NbE, when called with a term M such that $M \to Q$, will terminate with Q as output. In order to obtain operational correctness from denotational correctness one needs a suitable adequacy result à la Plotkin [13] relating the denotational and the operational semantics. Plotkin’s result cannot be applied here because it refers to a call-by-name operational semantics, whereas we are interested in a call-by-value semantics in order to obtain a correctness result for our implementation of NbE in the call-by-value language Scheme. Furthermore, Plotkin only considers the integers and the booleans as base types, whereas we need complex recursively defined types as base types (see Section 3.2). We leave the problem of proving adequacy of our denotational semantics for a fragment of a call-by-value language suitable for formalizing our extension of NbE to future work.

2. A simply typed λ-calculus with constants

2.1. Types, terms, rewrite rules

We start from a given set of ground types. Types are inductively generated from ground types τ by ρ → σ and ρ×σ. Terms are


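$x^\rho$  typed variables,
$c^\rho$  constants,
$(\lambda x^\rho M^\sigma)^{\rho\to\sigma}$  abstractions,
$(M^{\rho\to\sigma} N^\rho)^\sigma$  applications,
$\langle M_0^\rho, M_1^\sigma \rangle^{\rho\times\sigma}$  pairing,
$\pi_0(M^{\rho\times\sigma})^\rho$, $\pi_1(M^{\rho\times\sigma})^\sigma$  projections.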
Type indices will be omitted whenever they are inessential or clear from the context. Also, λx binds tighter than application and pairing; however, a dot after λx means that the scope extends as far as allowed by the parentheses. So λxMN means (λxM)N, but λx.MN means λx(MN).

Ground types will always be denoted by τ. We sometimes write $M0$ for $\pi_0(M)$ and $M1$ for $\pi_1(M)$. Two terms M and N are called α-equal, written $M =_\alpha N$, if they are equal up to renaming of bound variables. $\Lambda_\rho$ denotes the set of all terms of type ρ (α-equal terms are not identified). $M\vec N$ denotes $(\ldots(M N_1) N_2 \ldots) N_n$, where some of the $N_i$ may be 0 or 1. By $\mathrm{FV}(M)$ we denote the list of variables occurring free in M. By $M_x[N]$ we mean substitution of every free occurrence of x in M by N, renaming bound variables if necessary. Similarly $M_{\vec x}[\vec N]$ denotes simultaneous substitution. $\lambda\vec x M$ abbreviates $\lambda x_1 \ldots \lambda x_n M$. If $M\vec N$ is of type σ and $N_i$ is of type $\rho_i$, then we call $\vec\rho \to \sigma$ a type information for M. Here $\vec\rho$ is a list of types, 0’s or 1’s indicating the left or right part of a product type. So, e.g., a term M of type $\rho = (\tau\to\tau\to\tau)\times(\tau\to(\tau\times\tau))$ has $(0,\tau)\to(\tau\to\tau)$ or $(1,\tau,0)\to\tau$ as a type information. If there are no product types, $\vec\rho\to\sigma$ simply abbreviates $(\rho_1\to(\rho_2\to\cdots\to(\rho_n\to\sigma)\cdots))$.
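
Substitution with renaming of bound variables can be illustrated by the following Python sketch. It is an assumption-laden illustration rather than part of the paper: terms are untyped here, pairing and constants are omitted, free variables are collected in a set rather than a list, and the names `free_vars` and `subst` as well as the scheme for choosing a fresh name (appending a number to the bound name) are hypothetical.

```python
from dataclasses import dataclass
from itertools import count

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    var: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

def free_vars(t):
    """Set of variables occurring free in t."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, App):
        return free_vars(t.fun) | free_vars(t.arg)
    return free_vars(t.body) - {t.var}

def subst(t, x, n):
    """Replace every free occurrence of x in t by n, avoiding capture."""
    if isinstance(t, Var):
        return n if t.name == x else t
    if isinstance(t, App):
        return App(subst(t.fun, x, n), subst(t.arg, x, n))
    if t.var == x:                 # x is bound here: nothing to replace
        return t
    if t.var in free_vars(n):      # the binder would capture a variable of n
        avoid = free_vars(n) | free_vars(t.body) | {x}
        fresh = next(f"{t.var}{i}" for i in count()
                     if f"{t.var}{i}" not in avoid)
        renamed = subst(t.body, t.var, Var(fresh))
        return Lam(fresh, subst(renamed, x, n))
    return Lam(t.var, subst(t.body, x, n))
```

Substituting y for x in λy. x y renames the bound variable first, so the result is λy0. y y0 and not the capturing term λy. y y.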

For the constants $c^\rho$ we assume that some rewrite rules of the form $c\vec K \mapsto N$ are given, where $\mathrm{FV}(N) \subseteq \mathrm{FV}(\vec K)$ and $c\vec K$, N have the same type (not necessarily a ground type). Moreover, for any type information $\rho_1,\ldots,\rho_n \to \tau$ for c (τ a ground type), we require that there is a fixed length $k \le n$ of arguments for the rewrite rules, i.e., $c\vec M \mapsto N$ implies that $\vec M$ has length k, provided the projection markers in $\vec M$ and in $\rho_1,\ldots,\rho_k$ coincide. If no rewrite rule of the form $c\vec M \mapsto N$ ($1 \le$ length of $\vec M \le n$) applies, then this fixed length is stipulated to be n. We write $c^{\vec\rho\to\sigma}$ to indicate that we only consider c with argument lists $\vec K$ with these projection markers; the notation $c\vec M\vec N$ is used to indicate that $\vec M$ are the fixed arguments for the rewrite rules of c. In particular, if there is no rewrite rule for c, then $\vec N$ is empty and $c\vec M$ is of ground type.

For example, if c is of type $(\tau\to\tau\to\tau)\times(\tau\to\tau)$, then the rules $c\,0\,x\,x \mapsto a$ and $c\,1 \mapsto b$ are admitted, and $c^{0,\tau,\tau\to\tau}$ indicates that we only consider argument lists of the form 0, x, y.

2.2. Computation rules

Given a set of rewrite rules, we want to treat some rules—which we call computation rules—in a different, more efficient way. The idea is that a computation rule can be understood as a description of a computation in a suitable semantical model, provided the syntactic constructors correspond to semantic ones in the model, whereas the other rules describe syntactic transformations.

A constant c is called a constructor if there is no rule of the form $c\vec K \mapsto N$. For instance in the examples of Section 2.3 the constants 0, Image , and $\exists^+$ are constructors. Constructor patterns are special terms defined inductively as follows.

• Every variable is a constructor pattern.

• If c is a constructor and $P_1,\ldots,P_n$ are constructor patterns or projection markers 0 or 1, such that $c\vec P$ is of ground type, then $c\vec P$ is a constructor pattern.

From the given set of rewrite rules we choose a subset CImage with the following properties.

• If $c\vec P \mapsto Q$ ∈ CImage , then $P_1,\ldots,P_n$ are constructor patterns or projection markers.

• The rules are left-linear, i.e., if $c\vec P \mapsto Q$ ∈ CImage , then every variable in $c\vec P$ occurs only once in $c\vec P$.

• The rules are nonoverlapping, i.e., for different rules $c\vec K \mapsto M$ and $c\vec L \mapsto N$ in CImage the left-hand sides $c\vec K$ and $c\vec L$ are nonunifiable.

We write Image to indicate that the rule is in CImage . The set of constructors appearing in the constructor patterns is denoted by CImage . All other rules will be called (proper) rewrite rules, written Image .
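
The left-linearity and nonoverlap conditions can be checked mechanically. The Python sketch below is a simplified illustration under several assumptions: patterns are first order, a variable is a string, an application is a pair of a head symbol and a list of argument patterns, projection markers are ignored, and the two addition rules used as an example are the usual recursive ones (the rules displayed in the source are not reproduced here). The function names are hypothetical. For left-linear patterns whose variables are renamed apart, two patterns are unifiable exactly when no two different head symbols meet at the same position, which is what `overlap` tests.

```python
def pattern_vars(p):
    """Variables of a pattern, listed with repetitions."""
    if isinstance(p, str):
        return [p]
    _head, args = p
    return [v for a in args for v in pattern_vars(a)]

def is_left_linear(lhs):
    """True if no variable occurs twice in the left-hand side."""
    vs = pattern_vars(lhs)
    return len(vs) == len(set(vs))

def overlap(p, q):
    """Unifiability of two linear patterns with disjoint variables."""
    if isinstance(p, str) or isinstance(q, str):
        return True                      # a variable matches anything
    (c, ps), (d, qs) = p, q
    return (c == d and len(ps) == len(qs)
            and all(overlap(a, b) for a, b in zip(ps, qs)))

# Assumed example: add x 0 -> x and add x (S y) -> S (add x y).
zero = ("0", [])
add_zero = ("add", ["x", zero])
add_succ = ("add", ["x", ("S", ["y"])])

print(is_left_linear(add_zero), is_left_linear(add_succ))
print(overlap(add_zero, add_succ))
```

A left-hand side such as `("eq", ["x", "x"])` fails the linearity test, so a rule with that left-hand side could only be kept as a proper rewrite rule.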

In our reduction strategy below computation rules will always be applied first, and since they are nonoverlapping, this part of the reduction is unique. However, since we allowed almost arbitrary rewrite rules, it may happen that in case no computation rule applies a term may be rewritten by different rules ∉ CImage . In order to obtain a deterministic procedure we assume that for every constant $c^{\vec\rho\to\sigma}$ we are given a function Image computing from $\vec M$ either a rule Image , in which case $\vec M$ is an instance of $\vec K$, i.e., $\vec M = \vec K_{\vec x}[\vec L]$, or else the message “Image ”, in which case $\vec M$ does not match any rewrite rule: i.e., there is no rule Image such that $\vec M$ is an instance of $\vec K$. Clearly Image should be compatible with α-equality and should satisfy an obvious uniformity property; i.e., whenever $\vec M$ and $\vec M'$ are variants (i.e., can be obtained from each other by an invertible substitution), then Image .

Often the rewrite rules will be left-linear (i.e., no variable occurs twice in the left-hand side of a rule); then it is reasonable to require that every select function Image is strongly uniform in the sense that for all instances (with not necessarily distinct variables z) we have Image .

2.3. Examples

(a) Usually we have the ground type ι of natural numbers available, with constructors Image and recursion operators $R_\rho^{\iota\to\rho\to(\iota\to\rho\to\rho)\to\rho}$. The rewrite rules for R are

$R\,0 \mapsto \lambda y z.\,y$,


Image
The reason for writing the rules in this way, and not in the more familiar form $R\,0\,y\,z \mapsto y$, Image , will become clear later (see Example 16 in Section 3.5). A simplified scheme of a similar form gives a cases construct.
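
To see how such rules act as a description of a computation, the recursion operator can be mimicked on Python integers. This is only a sketch: it assumes that the successor rule has the standard form of Gödel's T, R (S x) ↦ λy z. z x (R x y z), which is displayed as an image in the source, and the helper names `add` and `factorial` are chosen here for illustration. The curried shape mirrors R 0 ↦ λy z. y, where the rule consumes only the numeral argument.

```python
def R(n):
    """Curried recursor on natural numbers: R n y z."""
    if n == 0:
        return lambda y: lambda z: y                    # R 0 -> λy z. y
    # Assumed successor rule: R (S x) -> λy z. z x (R x y z), with x = n - 1.
    return lambda y: lambda z: z(n - 1)(R(n - 1)(y)(z))

def add(m, n):
    """m + n by recursion on n: start at m, take the successor n times."""
    return R(n)(m)(lambda _x: lambda acc: acc + 1)

def factorial(n):
    """n! by recursion on n: the step function multiplies by x + 1."""
    return R(n)(1)(lambda x: lambda acc: (x + 1) * acc)

print(add(2, 3), factorial(4))
```

Here the base value y and the step function z are ordinary Python values, so the rules are executed by the host language instead of being applied as syntactic transformations.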

Image


Image
Moreover we can write down rules according to the usual recursive definitions of addition and multiplication, e.g.,

Image
Simultaneous recursion may be treated as well, e.g.,

Image


Image
All these rules are possible computation rules, whereas the next two rules are not (since Image and Image are not constructors).

Image
(a rewrite rule due to McCarthy [12]) or

Image

(b) We can also deal with infinitely branching trees such as the Brouwer ordinals of type Image . There are constructors Image and Image and recursion constants

. The rewrite rules for RImage are

Image


Image

(c) It is well known that by the Curry–Howard correspondence natural deduction proofs can be written as λ-terms with formulas as types. To use normalization by evaluation for normalizing proofs we may also introduce a ground type Image with constructors and destructors

Image
these are called existential constants. The rewrite rule for $\exists^-$ is

$\exists^-(\exists^+ x_0 x_1) \mapsto \lambda y.\,y\,x_0\,x_1$.
The (constructive) existential quantifier can then be dealt with conveniently by means of axioms

$\exists^+ : \forall x(A \to \exists x A)$,


Image
If x has type $\rho_0$ and the formulas A and B are associated with the types $\rho_1$ and σ, respectively, the rewrite rule above is clear. It seems that the existential type Image could be replaced by $\rho_0\times\rho_1$ and the constants $\exists^+_{\rho_0,\rho_1}$ and $\exists^-_{\rho_0,\rho_1}$ by the terms $\lambda x_0\lambda x_1(x_0,x_1)$ and $\lambda z\lambda f(f\,\pi_0(z)\,\pi_1(z))$, respectively. However, the latter term does not correspond to a derivation in first order logic, since it is impossible to pass from an arbitrary derivation d (possibly with free assumptions) of $\exists x A$ to a term $\pi_0(d)$ and a derivation $\pi_1(d)$ of $A_x[\pi_0(d)]$.

One can easily formulate rules for permutative conversions, which permute an application of an ∃-elimination rule with other elimination rules, e.g.,

$\exists^-_{\rho_0,\rho_1,\sigma_0\to\sigma_1}\,p \mapsto \lambda z v.\,\exists^-_{\rho_0,\rho_1,\sigma_1}\,p\,(\lambda x y.(z x y v))$.

2.4. Normalizable terms and their normal forms

We inductively define a relation $M \to Q$ for terms M, Q. The intended meaning of $M \to Q$ is that M is normalizable with (long) normal form Q. However, it is necessary to split up → into two relations: a “weak” one $\to_w$ intended to unwrap the outer constructor form, followed by a “strong” one $\to_s$, where we assume that it is applied to terms M irreducible w.r.t. $\to_w$.

Looking at the form of a term we will embark on the following strategy:

• β-redexes $(\lambda x M)N$ and computation rules $c\vec M\vec N$ are reduced promptly; i.e., we use call-by-name here.

• If no rule applies to $c\vec M\vec N$ one first tries to find out whether $\vec M$ can be reduced to $\vec P$ such that $c\vec P$ matches a computation rule. This does not require reducing each $M_i$ to normal form; it suffices to find out the outer pattern of $M_i$ (let us call it for now “constructor normal form”). The reductions for doing so will be called “weak” and we write $\to_w$ for them.

• If in $c\vec M\vec N$ all $\vec M$ are already in constructor normal form and no computation rule applies, then in a second step one reduces all $\vec M$ and $\vec N$ to normal form (if it exists) and tries to apply a proper rewrite rule, i.e., we use call-by-value at this point.

Let $\vec M \to \vec M'$ abbreviate $M_1 \to M_1', \ldots, M_n \to M_n'$ and similarly for other relations, and let $\to_w^*$ be the reflexive and transitive closure of $\to_w$.

Definition 1.  SImage .

Image
  EImage .

Image
  VImage AImage .

Image
  BImage .

$(\lambda x M) N \vec P \to_w M_x[N]\vec P$, $\qquad \langle M_0, M_1 \rangle\, i\, \vec P \to_w M_i \vec P$ for $i \in \{0,1\}$.
  CImage .

$c\vec P_{\vec x}[\vec L]\vec N \to_w Q_{\vec x}[\vec L]\vec N$ if $c\vec P \mapsto_{\mathrm{comp}} Q$.
For the next three rules assume that $c\vec M$ is not an instance of a computation rule.  AImage .

Image
The final two rules have premises $\vec M \to_s \vec M'$. Note that by Lemma 2 below, $c\vec M'$ cannot be an instance of a computation rule, for then also $c\vec M$ would be one.  RImage .

Image
PImage AImage .

Image
In case the constant c in the rules AImage and PImage AImage is a constructor, $\vec N$ is required to be empty.

For readability we will often write RImage in the following form, assuming that Image is the selected rule.

RImage .

Image

For the definition above to make sense we prove the following.

Lemma 2.  If $M \to_s M'$ and $M'$ is an instance of a constructor pattern P, then also M is an instance of P.

Proof.  By induction on P. If P is a variable the claim is trivial, so let $P = c\vec P$. Then $M' = c\vec K'$ and $\vec K'$ is an instance of $\vec P$. Moreover, the only possibility to infer $M \to_s M' = c\vec K'$ is by PImage AImage . Thus $M = c\vec K$, $\vec K \to_s \vec K'$ and by induction hypothesis (IH) $\vec K$ is an instance of $\vec P$. Since P is linear we eventually get that $c\vec K$ is an instance of $c\vec P$. □

Definition 3.  The set LImage of terms in long normal form is defined as follows. $\lambda x M$, $\langle M, N \rangle$, $(x\vec M)^\tau$, and $(c\vec M\vec N)^\tau$ are in LImage if M, N, $\vec M$, $\vec N$ are, provided that $c\vec M$ is not an instance of any computation or rewrite rule.  For example, the η-expansion Image (x) of a variable x is in long normal form; it is defined using induction on types by (e.g., for pure →-types) Image $(x^\tau) = x^\tau$, Image Image .
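
The η-expansion of a variable by induction on types can be sketched in Python as follows. The clause for arrow types is displayed as an image in the source, so the sketch assumes the standard form, in which a variable x of type ρ → σ expands to λz applied to the expansion of x applied to the expansion of z. The names `Arrow`, `Var`, `Lam`, `App` and `eta_expand` are hypothetical, only pure →-types are handled, and the expanded term is assumed not to contain variables named z0, z1, ....

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Arrow:
    dom: object   # argument type
    cod: object   # result type

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    var: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

def eta_expand(term, ty, k=0):
    """Eta-expand `term` of type `ty`; z_k, z_{k+1}, ... serve as new names."""
    if isinstance(ty, Arrow):
        z = f"z{k}"
        arg = eta_expand(Var(z), ty.dom, k + 1)        # expand the argument
        return Lam(z, eta_expand(App(term, arg), ty.cod, k + 1))
    return term                                        # ground type: done

# x of type (o -> o) -> o expands to λz0. x (λz1. z0 z1).
print(eta_expand(Var("x"), Arrow(Arrow("o", "o"), "o")))
```

At a ground type the term is returned unchanged, which corresponds to the base clause of the definition.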

Lemma 4.  If $M \to Q$ or $M \to_s Q$, then Q is in long normal form.

Proof.  By simultaneous induction on $M \to Q$ and $M \to_s Q$. The only interesting case is PImage AImage , where we have to show that $c\vec M'$ is not an instance of a computation rule. But if $c\vec M'$ were such an instance, by the previous lemma $c\vec M$ would also be, contradicting the assumption. □

Furthermore it can be shown easily that if $M \to Q$, $M \to_w Q$, or $M \to_s Q$, then M reduces to Q in the usual sense w.r.t. β-reduction, η-expansion, and the computation and rewrite rules for the constants. However, the converse is not true in general. For a counterexample, consider the nonterminating rewrite rules Image and Image . Then 0 is a normal form of Image ⊥0, but we cannot have Image for any Q. To see this, note that we cannot have $\bot \to_s N$ for any N (since Image ); hence we also cannot have Image $\bot 0 \to_s Q$ for any Q. Since ⊥, 0 are $\to_w^*$-reducible only to themselves, the claim follows. But under the hypothesis that M is strongly normalizable the converse is true.

Lemma 5.  If M is strongly normalizable w.r.t. these reductions (i.e., every reduction sequence terminates), then $M \to Q$ for some Q.

Proof.  For simplicity we consider pure →-types only; the extension to product types is immediate. We will prove the claim by induction on $h_M$ and side induction on Image (M), where $h_M$ denotes the height of the reduction tree for M and Image (M) is the height of M. Note that if $M \to_w Q$ then M reduces to Q in at least one step; hence $h_M > h_Q$.  Case $\lambda y M$. We have $(\lambda y M)y \to_w M \to Q$ by BImage and the side induction hypothesis (SIH); hence $\lambda y M \to \lambda y Q$ by EImage .  Case M has a type ρ → σ, but is not an abstraction. Then M η-expands to $\lambda y.\,My$ where y is a new variable of type ρ; hence $h_M > h_{\lambda y.My} \ge h_{My}$. Therefore $My \to Q$ by IH. Hence $M \to \lambda y Q$ by EImage .  It remains to consider terms of ground type.  Case $x\vec M$. Obvious, using the SIH and rule VImage AImage .  Case $(\lambda x M)N\vec P$. Then $(\lambda x M)N\vec P \to_w M_x[N]\vec P \to Q$ by BImage and the IH.  Case $c\vec P_{\vec x}[\vec L]\vec N$ with Image . Then $c\vec P_{\vec x}[\vec L]\vec N \to_w Q_{\vec x}[\vec L]\vec N \to Q_1$ by CImage and the IH.  Case $c\vec M\vec N$ with $c\vec M$ not an instance of a computation rule. By SIH $\vec M \to \vec M'$. If at least one $M_i$ is $\to_w$-reduced, the claim follows from the IH and AImage . Otherwise we have $\vec M \to_s \vec M'$. Now if Image and $\vec M = \vec K_{\vec x}[\vec L]$, the claim follows from the IH for $Q_{\vec x}[\vec L]\vec N$. If, however, Image =Image , then proceed as in case $x\vec M$, using PImage AImage instead of VImage AImage . □

Moreover, the relation $M \to Q$ clearly is not closed under substitution. However, it is closed under substitution of variables, provided the result is a variant of M.

Lemma 6.  Let