Juggrnaut: using graph grammars for abstracting unbounded heap structures

Abstract

This paper presents a novel abstraction framework for heap data structures. It employs graph grammars, more precisely context-free hyperedge replacement grammars, and we show that these provide a natural and intuitive formalism for modelling dynamic data structures. Our approach aims at extending finite-state verification techniques to pointer-manipulating programs operating on complex dynamic data structures of potentially unbounded size. The theoretical foundations of our approach and its correctness are the main focus of this paper. In addition, we present a prototypical tool called Juggrnaut that realizes our approach, and we report encouraging experimental verification results for three case studies: a doubly-linked list reversal, the flattening of binary trees, and the Deutsch–Schorr–Waite tree traversal algorithm.

Notes

  1. Juggrnaut is derived from the Sanskrit term Jagann\(\bar{a}\)tha, which stands for “lord of the world” (i.e., the god Vishnu or Krishna). It is also used for any large, overpowering, destructive force or object.

  2. This operation resembles a materialisation step in shape analysis [39] which splits off a concrete heap object from a summary node. However, summary nodes are conceptually different from nonterminals as the former stand for sets of “similar” objects while the latter represent whole subheaps.

  3. This can always be established by an appropriate renaming of the vertices and edges of K or H.

  4. In the literature, this is often referred to as canonical abstraction.

  5. By Lemma 3, the transition relation \(\rhd \) from Definition 6 (on page 11) may be considered as a subset of \(( Stm \times \mathrm {AHC}_{{\varSigma _N}}) \times (( Stm \cup \{\varepsilon \}) \times \mathrm {HC}_{{\varSigma _N}})\).

  6. Just use graph grammars to Nicely Abstract Unbounded sTructures

  7. Bertrand Meyer writes in his blog of November 2011 that “The resulting traversal algorithm is a beauty—although it is fairly tricky, presents a challenge for verification tools, and raises new difficulties in a multi-threaded environment” (see http://bertrandmeyer.com/tag/deutsch-schorr-waite/).

  8. In fact, [27, p. 125] points out that “To overcome this limitation, we need to use a more general grammar, where the nonterminals can talk about shared cells”.

References

  1. Bals M, Jansen C, Noll T (2013) Incremental construction of Greibach normal form for context-free grammars. In: International symposium on theoretical aspects of software engineering (TASE 2013), IEEE CS Press, pp 165–168

  2. Berdine J, Calcagno C, O’Hearn PW (2004) A decidable fragment of separation logic. In: 24th International conference on foundations of software technology and theoretical computer science (FSTTCS), Springer, LNCS, vol 3328, pp 97–109

  3. Berdine J, Calcagno C, O’Hearn PW (2005) Smallfoot: modular automatic assertion checking with separation logic. In: Formal methods for components and objects, Springer, LNCS, vol 4111, pp 115–137

  4. Bhat G, Cleaveland R, Grumberg O (1995) Efficient on-the-fly model checking for CTL*. In: 10th Annual IEEE symposium on logic in computer science, pp 388–397

  5. Bogudlov I, Lev-Ami T, Reps TW, Sagiv M (2007) Revamping TVLA: making parametric shape analysis competitive. In: 19th International conference on computer aided verification (CAV), Springer, LNCS, vol 4590, pp 221–225

  6. Bouajjani A, Bozga M, Habermehl P, Iosif R, Moro P, Vojnar T (2006a) Programs with lists are counter automata. In: 18th international conference on computer-aided verification (CAV), Springer, LNCS, vol 4144, pp 517–531

  7. Bouajjani A, Habermehl P, Rogalewicz A, Vojnar T (2006b) Abstract regular tree model checking of complex dynamic data structures. In: Static analysis symposium (SAS), Springer, LNCS, vol 4134, pp 52–70

  8. Courcelle B (1990) The monadic second-order logic of graphs. I. Recognizable sets of finite graphs. Inf Comput 85(1):12–75

  9. Courcelle B (1997) The expression of graph properties and graph transformations in monadic second-order logic. In: Rozenberg G (ed) Handbook of graph grammars. World Scientific, Singapore, pp 313–400

  10. Distefano D, Katoen JP, Rensink A (2005) Safety and liveness in concurrent pointer programs. In: Formal methods for components and objects, Springer, LNCS, vol 4111, pp 280–312

  11. Dodds M, Plump D (2009) From hyperedge replacement to separation logic and back. ECEASST 16, http://journal.ub.tu-berlin.de/index.php/eceasst/article/view/237/236

  12. Drewes F, Kreowski HJ, Habel A (1997) Hyperedge replacement graph grammars. In: Rozenberg G (ed) Handbook of graph grammars. World Scientific, Singapore, pp 95–162

  13. Elgaard J, Møller A, Schwartzbach MI (2000) Compile-time debugging of C programs working on trees. In: Programming languages and systems, LNCS, vol 1782, Springer, pp 119–134

  14. Engelfriet J (1992) A Greibach normal form for context-free graph grammars. In: International conference on automata, languages and programming (ICALP), Springer, LNCS, vol 623, pp 138–149

  15. Ghamarian AH, de Mol MJ, Rensink A, Zambon E, Zimakova MV (2012) Modelling and analysis using GROOVE. Int J Softw Tools Technol Transf 14:15–40

  16. Halin R (1976) S-functions for graphs. J Geom 8(1–2):171–186

  17. Heinen J (2015) Verifying Java programs—a graph grammar approach. PhD thesis, RWTH Aachen University, Germany

  18. Heinen J, Noll T, Rieger S (2010) Juggrnaut: graph grammar abstraction for unbounded heap structures. In: Proceedings of the 3rd international workshop on harnessing theories for tool support in software (TTSS 2009), Elsevier, ENTCS, vol 266, pp 93–107

  19. Heinen J, Barthels H, Jansen C (2012) Juggrnaut—an abstract JVM. In: Formal verification of object-oriented software (FoVeOOS 2011), Springer, LNCS, vol 7421, pp 142–159

  20. Hinman P (2005) Fundamentals of mathematical logic. A.K. Peters Ltd, Wellesley

  21. Iosif R, Rogalewicz A, Simacek J (2013) The tree width of separation logic with recursive definitions. In: Automated deduction (CADE-24), Springer, LNCS, vol 7898, pp 21–38

  22. Jansen C, Noll T (2014) Generating abstract graph-based procedure summaries for pointer programs. In: Graph transformations (ICGT 2014), Springer, LNCS, vol 8571, pp 49–64

  23. Jansen C, Heinen J, Katoen JP, Noll T (2011) A local Greibach normal form for hyperedge replacement grammars. In: 5th international conference on language and automata theory and applications (LATA 2011), Springer, LNCS, vol 6638, pp 323–335

  24. Jansen C, Göbe F, Noll T (2014) Generating inductive predicates for symbolic execution of pointer-manipulating programs. In: Graph transformation (ICGT 2014), Springer, LNCS, vol 8571, pp 65–80

  25. Jensen JL, Jørgensen ME, Schwartzbach MI, Klarlund N (1997) Automatic verification of pointer programs using monadic second-order logic. In: ACM SIGPLAN 1997 conference on programming language design and implementation (PLDI ’97), ACM Press, pp 226–234

  26. Klarlund N, Møller A, Schwartzbach MI (2001) Mona implementation secrets. In: Implementation and application of automata, LNCS, vol 2088, Springer, pp 182–194

  27. Lee O, Yang H, Yi K (2005) Automatic verification of pointer programs using grammar-based shape analysis. In: Proceedings of 14th European symposium on programming (ESOP ’05), Springer, LNCS, vol 3444, pp 124–140

  28. Lindstrom G (1973) Scanning list structures without stacks or tag bits. Inf Process Lett 2(2):47–51

  29. Loginov A, Reps TW, Sagiv M (2006) Automated verification of the Deutsch-Schorr-Waite tree-traversal algorithm. In: 13th International static analysis symposium (SAS), Springer, LNCS, vol 4134, pp 261–279

  30. Madhusudan P, Qiu X (2011) Efficient decision procedures for heaps using STRAND. In: Static analysis, LNCS, vol 6887, Springer, pp 43–59

  31. Madhusudan P, Parlato G, Qiu X (2011) Decidable logics combining heap structures and data. In: POPL 2011, ACM Press, pp 611–622

  32. Mehta F, Nipkow T (2005) Proving pointer programs in higher-order logic. Inf Comput 199(1–2):200–227

  33. O’Hearn PW, Yang H, Reynolds JC (2004) Separation and information hiding. In: ACM symposium on principles of programming languages (POPL), ACM Press, pp 268–280

  34. Plump D (2010) Checking graph-transformation systems for confluence. ECEASST 26, http://journal.ub.tu-berlin.de/eceasst/article/view/367/347

  35. Pnueli A (1977) The temporal logic of programs. In: 18th annual symposium on foundations of computer science, IEEE CS Press, pp 46–57

  36. Poskitt C, Plump D (2012) Hoare-style verification of graph programs. Fundam Inf 114:1–43

  37. Reynolds JC (2002) Separation logic: a logic for shared mutable data structures. In: IEEE symposium on logic in computer science (LICS), IEEE CS Press, pp 55–74

  38. Rieger S, Noll T (2008) Abstracting complex data structures by hyperedge replacement. In: 4th international conference on graph transformations (ICGT 2008), Springer, LNCS, vol 5214, pp 69–83

  39. Sagiv S, Reps TW, Wilhelm R (2002) Parametric shape analysis via 3-valued logic. ACM TOPLAS 24(3):217–298

  40. Schorr H, Waite WM (1967) An efficient machine-independent procedure for garbage collection in various list structures. Commun ACM 10:501–506

  41. Yang H, Lee O, Berdine J, Calcagno C, Cook B, Distefano D, O’Hearn PW (2008) Scalable shape analysis for systems code. In: 20th international conference on computer aided verification (CAV), Springer, LNCS, vol 5123, pp 385–398

  42. Yuasa Y, Tanabe Y, Sekizawa T, Takahashi K (2008) Verification of the Deutsch-Schorr-Waite marking algorithm with modal logic. In: 2nd international conference on verified software: theories, tools, experiments (VSTTE), Springer, LNCS, vol 5295, pp 115–129

  43. Zambon E (2013) Abstract graph transformation—theory and practice. PhD thesis, University of Twente

  44. Zambon E, Rensink A (2012) Graph subsumption in abstract state space exploration. In: Graph inspection and traversal engineering (GRAPHite 2012), Electronic proceedings in theoretical computer science, vol 99, pp 35–49

Acknowledgments

This research has partially been funded by EU FP7 project CARP (Correct and Efficient Accelerator Programming), http://www.carpproject.eu.

Appendices

Appendix 1: Decidability of DSG property

Theorem 2 in Sect. 4 (p. 15) states that it is decidable whether a grammar \(G \in \mathrm {HRG}_{{\varSigma _N}}\) is a DSG. We will prove this result for productive grammars. This is not a restriction as each HRG G can easily be transformed into an equivalent productive one. The transformation is described in the following.

Recall that a nonterminal \(X \in N\) is productive iff \(\mathcal {L}_{}({{X}^\bullet }) \ne \emptyset \). We collect the productive nonterminals of G in the set \(P \subseteq N\) as follows. We initialise P with all \(X \in N\) for which at least one X-rule has a terminal rule graph, i.e., a rule \(X \rightarrow H\) where \({lab}(E_H) \cap N = \emptyset \). Then, \(X \in N\) is added to P whenever there exists a rule \(X \rightarrow H\) such that all nonterminals occurring in H are already in P, i.e., \({lab}(E_H) \cap N \subseteq P\); this step is repeated until P stabilises. The procedure terminates as both G and N are finite. We then remove each non-productive nonterminal \(Y \in N \setminus P\) together with every rule containing Y. It is evident that this does not affect the language described by G and that the resulting grammar is productive.
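
The computation of P is a standard least-fixpoint iteration. The following sketch makes it concrete; the grammar encoding (each X-rule represented only by the set of nonterminal labels on its right-hand side) is purely illustrative and not Juggrnaut's actual data structure.

```python
def productive_nonterminals(rules):
    """rules: dict mapping each nonterminal X to a list of sets, one per
    X-rule, containing the nonterminals occurring in that rule's
    right-hand side (the empty set encodes a terminal rule graph)."""
    P = set()
    changed = True
    while changed:
        changed = False
        for X, bodies in rules.items():
            if X in P:
                continue
            # X becomes productive if some rule body only uses
            # nonterminals already known to be productive.
            if any(body <= P for body in bodies):
                P.add(X)
                changed = True
    return P

def restrict_to_productive(rules):
    """Drop non-productive nonterminals and every rule mentioning them."""
    P = productive_nonterminals(rules)
    return {X: [b for b in bodies if b <= P]
            for X, bodies in rules.items() if X in P}
```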

To check whether a productive grammar \(G \in \mathrm {HRG}_{{\varSigma _N}}\) is a DSG, we check for each rule \(X \rightarrow H \in G\) and node \(v \in V_H\) whether v is connected to a variable, i.e.,

$$\begin{aligned} \exists e \in E_H: lab(e)\in Var _\varSigma \, \wedge \, att(e) = v, \end{aligned}$$
(10)

or whether an outgoing selector can be derived more than once at v, i.e.,

$$\begin{aligned} \begin{array}{c} \exists e_1, e_2 \in E_H: lab(e_1) = X \, \wedge \, lab(e_2) = Y \, \wedge \, \\ \exists i,j: att (e_1)(i) = att (e_2)(j) = v \, \wedge \, \textit{term}((X,i)) \cap \textit{term}((Y,j)) \ne \emptyset . \end{array} \end{aligned}$$
(11)

Here, \(\textit{term}({\cdot })\) denotes the set of derivable selectors for a tentacle \((X, i)\). It follows that G is a DSG iff neither property (10) nor property (11) holds for any rule and node. Condition (10) can be readily checked, whereas checking condition (11) amounts to determining the sets \(\textit{term}({\cdot })\). These can be computed recursively as follows:

$$\begin{aligned}\textit{term}((X,i)) = {\left\{ \begin{array}{ll} \{X\} &{} \text{ if } X \in \varSigma \text{ and } i = 1 \\ \emptyset &{} \text{ if } X \in \varSigma \text{ and } i = 2 \\ \bigcup _{(X,i) \rightarrow _p \, (Y,j)} \textit{term}((Y,j)) &{} \text{ otherwise } \end{array}\right. } \end{aligned}$$

Here, \((X,i) \rightarrow _p \, (Y,j)\) denotes that the tentacle \((X, i)\) can be replaced by the tentacle \((Y, j)\) by applying the rule \(p = X \rightarrow H \in G\), i.e., H contains an edge e with \(lab(e) = Y\) whose j-th attached node is the i-th external node of H. As the sets \(\textit{term}({\cdot })\) are finite, they can be calculated incrementally until a fixpoint is reached.

Excluding both conditions directly ensures that all derivable terminal graphs fulfil the properties of heap configurations and, thus, that G is a DSG.
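
To illustrate, the following sketch computes the sets \(\textit{term}({\cdot })\) by exactly this fixpoint iteration and then uses them to detect occurrences of condition (11). The encoding of tentacles and rules is again illustrative; condition (10) is assumed to be checked directly on the rule graphs.

```python
from collections import defaultdict

def term_sets(succ, nonterminals):
    """succ maps a nonterminal tentacle (X, i) to the set of tentacles
    (label, j) attached to the i-th external node in some X-rule; labels
    not in `nonterminals` are terminal selectors."""
    term = defaultdict(set)
    changed = True
    while changed:
        changed = False
        for tentacle, targets in succ.items():
            derived = set()
            for (label, j) in targets:
                if label in nonterminals:
                    derived |= term[(label, j)]   # case (X,i) ->_p (Y,j)
                elif j == 1:                      # outgoing terminal selector
                    derived.add(label)
            if not derived <= term[tentacle]:
                term[tentacle] |= derived
                changed = True
    return term

def condition_11_holds_somewhere(attached_tentacles, term):
    """attached_tentacles: one list per (rule graph, node v) pair, holding
    the nonterminal tentacles attached to v. Returns True if two of them
    can derive a common outgoing selector, i.e. condition (11) holds."""
    for tentacles in attached_tentacles:
        for a in range(len(tentacles)):
            for b in range(a + 1, len(tentacles)):
                if term.get(tentacles[a], set()) & term.get(tentacles[b], set()):
                    return True
    return False
```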

Appendix 2: Typedness for DSGs

We split Theorem 5 from Sect. 5 (p. 23) into two lemmas (5 and 6) and prove them separately. For simplicity we introduce the notion of typedness on hypergraphs and relate this to the notion of typedness of HRGs as introduced in Definition 19.

Definition 25

(Type of a hypergraph) Let \(H \in \mathrm {HG}_{{\varSigma _N}}\) and \(i \in [1, |ext_H|]\). The type sequence of \(H\), \(type(H) = type(H)(1) \dots type(H)(|ext_H|)\), is defined by

$$\begin{aligned} type(H)(i) = lab_H(out_H(ext_H(i))). \end{aligned}$$

The set of types of a nonterminal is given by \(types(X) = \{type(H) \mid H \in \mathcal {L}_{}({{X}^\bullet })\}\).

The next lemma directly follows from the definition of typedness.

Lemma 4

A grammar \(G \in \mathrm {DSG}_{{\varSigma _N}}\) is typed iff \(|types(X)| = 1\) for every \(X \in N\).

Using this result, we now prove the first part of Theorem 5.

Lemma 5

It is decidable whether an HRG is typed.

Proof

We can verify whether an HRG G is typed by checking whether, for each \(X \in {N}\) and \(i \in [1, rk (X)]\) and for every pair of rules \(X \rightarrow H_1 \mid H_2\), the sets of concretisable outgoing edges are equal:

$$\begin{aligned} \bigcup _{T \in Tent _{H_1}(ext_{H_1}(i))} term(T) = \bigcup _{T \in Tent _{H_2}(ext_{H_2}(i))} term(T) \end{aligned}$$
(12)

where \( Tent _H(v) = \{(Y,j) \mid Y \in N, j \in [1, rk (Y)], \exists e \in E_H: lab_H(e) = Y \wedge att_H(e)(j) = v\}\) is the set of nonterminal tentacles connected to node v. If property (12) holds for each \(X \in {N}\) and \(i \in [1, rk (X)]\) and for every pair of rules \(X \rightarrow H_1 \mid H_2\), then G is typed; otherwise it is not. \(\square \)
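
A minimal sketch of this check, reusing the \(\textit{term}({\cdot })\) sets from Appendix 1; the accessor `tentacles_at` (returning the pairs (Y, j) of \( Tent _H(ext_H(i))\)) and the dictionaries `grammar` and `ranks` are assumptions about the encoding, not Juggrnaut's API.

```python
def selectors_at(rule_graph, i, term):
    """Union of term(T) over all nonterminal tentacles T attached to the
    i-th external node of `rule_graph`."""
    derivable = set()
    for (Y, j) in rule_graph.tentacles_at(i):
        derivable |= term.get((Y, j), set())
    return derivable

def is_typed(grammar, ranks, term):
    """grammar: dict X -> list of rule graphs; ranks: dict X -> rk(X).
    Checks property (12) for every nonterminal X and external position i."""
    for X, rule_graphs in grammar.items():
        if not rule_graphs:
            continue
        for i in range(1, ranks[X] + 1):
            sels = [selectors_at(H, i, term) for H in rule_graphs]
            if any(s != sels[0] for s in sels):
                return False
    return True
```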

Lemma 6

For every untyped DSG \(G \in \mathrm {DSG}_{{\varSigma _N}}\) an equivalent typed DSG \(G' \in \mathrm {DSG}_{\varSigma _{N'}}\) exists, i.e., one satisfying

$$\begin{aligned} \bigcup _{X \in {N}} L_G({{X}^\bullet }) = \bigcup _{Y \in N'} L_{G'}({{Y}^\bullet }). \end{aligned}$$

Proof

We transform \(G \in \mathrm {DSG}_{{\varSigma _N}}\) into an equivalent typed grammar \(G' \in \mathrm {DSG}_{\varSigma _{N'}}\) with \(N' = \{(X, T) \in N \times \mathcal {P}(\varSigma )^{*} \mid T \in \textit{types}(X) \}\) and \( rk ((X, T)) = rk (X)\), by replacing every nonterminal with a corresponding set of typed ones. We first generate, for each nonterminal \(X \in {N}\), the set of right-hand side graphs with relabelled nonterminals: \(R^X = \bigcup \{ t(H) \mid X \rightarrow H \in G \}\), where t(H) is the set of graphs derived from H by replacing the old nonterminals from N by new ones from \(N'\):

$$\begin{aligned} t(H) = \{ H[{{X_1}^\bullet } / e_1] \dots [{{X_k}^\bullet } / e_k] \mid E^N = \{ e_1, \dots , e_k\},\ X_i = (lab(e_i), T_i) \in N',\ T_i \in types(lab(e_i)) \} \end{aligned}$$

where \(E^N = \{e \in E_H \mid lab(e) \in {N}\}\) is the set of edges labelled with nonterminals. The grammar \(G'\) is defined as follows:

$$\begin{aligned} G' = \{ (X, T) \rightarrow H \mid (X, T) \in N' \wedge H \in R^X \wedge type(H) = T \}. \end{aligned}$$

The definition of \(G'\) obviously implies typedness.
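
For concreteness, here is a minimal sketch of this relabelling construction, assuming that \(types(X)\) and \(type(H)\) have already been computed (e.g. from the \(\textit{term}({\cdot })\) sets) and that rule graphs offer an edge-relabelling operation; the accessors `edges`, `label` and `relabel` are illustrative placeholders rather than Juggrnaut's API.

```python
from itertools import product

def typed_grammar(grammar, types, type_of):
    """grammar: dict X -> list of rule graphs; types: dict X -> set of
    type sequences; type_of(H): type sequence of a (relabelled) graph H.
    Returns the typed grammar G' as a dict (X, T) -> list of rule graphs."""
    typed_rules = {}
    for X, rule_graphs in grammar.items():
        for H in rule_graphs:
            nt_edges = [e for e in H.edges if H.label(e) in grammar]
            # Enumerate every combination of types for the nonterminal
            # edges of H, as in the definition of t(H).
            for choice in product(*[list(types[H.label(e)]) for e in nt_edges]):
                relabelling = {e: (H.label(e), T)
                               for e, T in zip(nt_edges, choice)}
                H2 = H.relabel(relabelling)   # assumed: copy with new edge labels
                typed_rules.setdefault((X, type_of(H2)), []).append(H2)
    return typed_rules
```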

It remains to show that G and \(G'\) are equivalent, i.e., that

$$\begin{aligned}\forall H \in \mathrm {HG}_{\varSigma }. (\exists X \in N: {{X}^\bullet } \Rightarrow ^{*} H) \Longleftrightarrow (\exists Y \in N': {{Y}^\bullet } \Rightarrow ^{*} H) \end{aligned}$$
  • “\(\Longleftarrow \)”:  Let \({{(X,T)}^\bullet }[K_1/e_1]\dots [K_n / e_n] = H\) be a derivation for \(H \in \mathrm {HG}_{\varSigma }\) in \(G'\). We obtain the derivation \({{X}^\bullet }[K'_1 / e'_1]\dots [K'_n / e'_n] = H\) in G by simply replacing every edge labelled by a typed nonterminal \((Y, T) \in N'\) by the corresponding untyped nonterminal Y, i.e., \(K_i' = (V_{K_i}, E_{K_i}, {lab}_{K_i'}, att _{K_i}, ext _{K_i})\), where \(lab_{K_i'}(e) = lab_{K_i}(e)\) if \(lab_{K_i}(e) \notin N'\) and \(lab_{K_i'}(e) = Y\) if \(lab_{K_i}(e) = (Y,T) \in N'\). The resulting sequence indeed describes hyperedge replacements, since typed edges and hypergraphs are of the same rank as the corresponding untyped ones. Furthermore, the replacement sequence is a derivation in G, as each rule of \(G'\) is, up to nonterminal relabelling, a copy of a rule of G.

  • “\(\Longrightarrow \)”:  Let \({{X}^\bullet }[K_1 / e_1]\dots [K_n / e_n] = H\) be a derivation in G. We construct a corresponding derivation \({{(X,T)}^\bullet }[K'_1 / e'_1]\dots [K'_n / e'_n] = H\) in \(G'\). Analogously to the first part, we use graphs \(K_i' = (V_{K_i}, E_{K_i}, {lab}_{K_i'}, att _{K_i}, ext _{K_i})\), but here we have to determine a suitable type for each nonterminal edge. Note that within the derivation each introduced nonterminal edge \(e'_j \in E_{K'_i}^N\) will be replaced later on, as the resulting hypergraph H is a terminal graph, i.e., \(H \in \mathrm {HG}_{\varSigma }\). We give \(e'_j\) the label \((lab_{K_i}(e_j), type (K'_j))\) so that the jth replacement can be realised properly; thus all nonterminal edges of \(K'_i\) are labelled with corresponding typed nonterminals. The types of all \(K'_i\) are determined from right to left, starting with the terminal graph \(K'_n\). As each rule in G has a copy in \(G'\) for every combination of edge labels extended by types, and as \(lab(e_i) \rightarrow K_i \in G\), there is a rule \((lab(e_i), type(K'_i)) \rightarrow K'_i \in G'\) for each \(i \in [1, n]\). \(\square \)

Appendix 3: Local Greibach normal form

Theorem 6 in Sect. 5 (p. 25) states that for every DSG there exists an equivalent DSG in LGNF. We prove this result by giving an algorithm for constructing the LGNF of an arbitrary DSG and by showing its correctness. It is inspired by the corresponding construction for string grammars. First we recall the definition of LGNF (cf. Definition 21) of a grammar G. Remember that it ensures that for each nonterminal \(X \in N\) and for each non-reduction tentacle \((X, i)\) there exists a subgrammar \(G_{(X,i)}\) which makes it possible to expose all outgoing selector edges. Additionally, the combination of \(G_{(X,i)}\) and the non-X rules of G suffices to generate the full language of X.

The LGNF of a DSG G is established by merging sets \(G_{(X,i)}\), which are constructed for every non-reduction tentacle \((X, i)\) of G in four steps along the lines of the GNF construction for string grammars: Assume a total order on the non-reduction tentacles \(T_1, \dots , T_n\) over N. Starting at the lowest tentacle, (1) every rule p implying \(T_i \rightarrow _p T_j\) with \(j < i\) is eliminated, then (2) local recursion is removed. In a subsequent step (3) all rules are brought into LGNF using simple hyperedge replacements. Finally, (4) rules for nonterminals introduced during the construction are transformed. In the following we walk through the four construction steps and define them in detail.

For each non-reduction \((X, i)\)-tentacle we initialise the set \(G_{(X,i)} = G^X\) and define an order \(T_1, \dots , T_n\) on the non-reduction tentacles \(T_i\).

Step 1: Elimination of rules. We first eliminate the rules \(p = X \rightarrow H \in G_{(X,i)}\) with \((X,i) = T_k \rightarrow _p T_l\), \(l < k\). Let \(T_l = (Y,j)\) and \(e \in E_H\) with \(lab(e) = Y\) and \(att(e)(j) = ext(i)\). Then p is replaced by the set \(\{ X \rightarrow H[K/e] \mid Y \rightarrow K \in G_{(Y, j)} \}\).
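
To make the substitution \(H[K/e]\) used in this step concrete, the following minimal, self-contained sketch implements hyperedge replacement on a deliberately simple hypergraph encoding; it is an illustration of the operation underlying Step 1, not Juggrnaut's implementation.

```python
import itertools
from dataclasses import dataclass

_fresh = itertools.count()

@dataclass
class HyperGraph:
    nodes: set     # node identifiers
    att: dict      # edge id -> tuple of attached nodes
    lab: dict      # edge id -> label (terminal or nonterminal)
    ext: tuple     # sequence of external nodes

def replace(H, e, K):
    """Return H[K/e]: remove hyperedge e from H and glue in a fresh copy
    of K, identifying K's external nodes with the nodes attached to e."""
    assert len(K.ext) == len(H.att[e]), "rank mismatch"
    glue = dict(zip(K.ext, H.att[e]))          # external node of K -> attachment in H
    node_map = {v: glue.get(v, ("n", next(_fresh))) for v in K.nodes}
    nodes = H.nodes | set(node_map.values())
    att = {f: a for f, a in H.att.items() if f != e}
    lab = {f: l for f, l in H.lab.items() if f != e}
    for f in K.att:                            # copy K's edges under fresh ids
        f2 = ("e", next(_fresh))
        att[f2] = tuple(node_map[v] for v in K.att[f])
        lab[f2] = K.lab[f]
    return HyperGraph(nodes, att, lab, H.ext)

def eliminate_rule(H, e, bodies_of_Y):
    """Step 1: replace the rule X -> H (with offending edge e labelled Y)
    by the rule set { X -> H[K/e] | Y -> K in G_(Y,j) }."""
    return [replace(H, e, K) for K in bodies_of_Y]
```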

By Theorem 1, the elimination procedure is language preserving. This directly implies the following lemma.

Lemma 7

Let \(G \in HRG_{{\varSigma _N}}\). For a grammar \(G'\) originating from G by eliminating a production rule, it holds that \(L_G(H) = L_{G'}(H), \, \forall H \in HG_{\varSigma _{{N}}}\).

Note that \(G_{(Y,j)}\) does not contain any rule p with \(T_l \rightarrow _p T_m\), \(m < l\), as these have already been eliminated. Thus after finitely many steps all corresponding rules are eliminated. However, rules p with \(T_i \rightarrow _p T_i\) remain. They are handled in the next step.

Step 2: Elimination of local recursion.

Definition 26

(Local recursion) Let \(G \in \mathrm {HRG}_{\varSigma _{N}}, X \in {N}, i \in [1, rk(X)]\). G is locally recursive at tentacle \((X, i)\) if there exists a rule p with \((X, i) \rightarrow _p (X, i)\).

Let \(G_r^X \subseteq G_{(X,i)}\) be the set of all rules locally recursive at \((X, i)\). To remove local recursion in \(p = X \rightarrow H \in G_{(X,i)}\), we introduce a new nonterminal \(B'\), a recursive rule \(B' \rightarrow J_n\) and an exit rule \(B' \rightarrow J_t\). \(J_t\) corresponds to the graph H with the edge e causing the local recursion removed. We also remove all external nodes singly connected to e (\(V_R = \{ v \in [ext_H] \mid \forall e' \in E_H: v \in [att_H(e')] \Rightarrow e = e'\}\)). By removing the border edge e, its previously connected internal nodes move to the border and thus become external. Thus \(V_{ext} = ([att_{H}(e)] \cup [ext_{H}]) \cap V_{J_t}\), and \(ext_{J_t}\) is an arbitrary permutation of the set \(V_{ext}\).

\(J_n\) extends \(J_t\) by an additional edge \(e'\) labelled by \(B'\). As this edge models the structure from the other side, it is connected to the remaining external nodes of H that will not be external any longer. Note that the rank of \(B'\) is already given by \(J_t\); therefore gaps introduced in the external sequence are filled by new external nodes that are connected to edge \(e'\) (\( fill (k) = ext_{J_t}(k)\) if \(ext_{J_t}(k) \in att_{H}(e)\), a new node otherwise, \(\forall k \in [1, rk(e')]\)). If different \(J_t\) suggest different ranks for \(B'\), the highest rank is chosen and the rule graphs of lower rank are then integrated into rules of higher rank. This may entail the need to introduce a set of rules, generated in the same fashion as \(J_n\), to cover the different derivations, where the graph is extended only between a restricted set of external nodes. To build up the same structure as \((X, i)\) “from the other side”, the edge \(e'\) has to be plugged in correctly:

$$\begin{aligned}&\textit{plug}(g) = \left\{ \begin{array}{ll} ext_{H}(y) &{} \text{ if } ext_{J_n}(g) \in V_{J_t} \\ ext_{J_n}(g) &{} \text{ otherwise } \\ \end{array} \right. ,\quad \text{ with } att_{H}(e)(y) = ext_{J_t}(g).\\&\begin{array}{ll} B' \rightarrow J_t\hbox { with}:&{}\qquad \qquad B' \rightarrow J_n\hbox { with}:\\ \quad V_{J_t} = V_{H} \setminus V_{R}(e) &{}\quad \qquad \qquad V_{J_n} = V_{J_t} \cup [ext_{J_n}] \\ \quad E_{J_t} = E_{H} \setminus \{e\} &{}\quad \qquad \qquad E_{J_n} = E_{J_t} \cup \{e'\} \\ \quad {lab}_{J_t} = {lab}_H\!\!\upharpoonright \!\!E_{J_t} &{} \quad \qquad \qquad {lab}_{J_n} = {lab}_{J_t} \cup \{ e' \rightarrow B' \}\\ \quad att _{J_t} = att _{H}\!\!\upharpoonright \!\!E_{J_t} &{} \quad \qquad \qquad att _{J_n} = att _{J_t} \cup \{ e' \rightarrow \textit{plug}\}\\ \quad ext _{J_t} &{} \quad \qquad \qquad ext _{J_n} = \textit{fill} \\ \end{array} \end{aligned}$$

Newly introduced nonterminals are collected in a set \(N'\), i.e., \(G_{(X,i)}\) is now defined over the alphabet \(\varSigma _{N \cup N'}\). For mirrored derivations, each terminal rule in \(G_{(X,i)}\) can be the initial one; thus we add a copy of each such rule, extended by an additional \(B'\)-edge, to \(G_{(X,i)}\).

Example 19

The doubly-linked list HRG with production rules \(L \rightarrow H \mid J\) given in Fig. 12 (p. 13) is locally recursive at (L, 2). We introduce the nonterminal \(B'\) and the rules \(B'\rightarrow J_t \mid J_n\), cf. Fig. 31. The terminal rule graph \(J_t\) corresponds to J with the L-edge and its attached external node ext(2) removed. \(J_n\) is a copy of \(J_t\) with an additional \(B'\)-edge and with the external node ext(1) replaced. Intuitively, local recursion is eliminated by introducing new production rules which allow “mirrored” derivations.

Lemma 8

Let \(G \in DSG_{\varSigma _{N}}\). For the grammar \(G_{(X,i)}\) over \(\varSigma ' = \varSigma _{N \cup N'}\) originating from G by eliminating the \((X, i)\)-local recursion as described above, it holds that \(L_G(H) = L_{G_{(X,i)}}(H)\) for all \(H \in HG_{\varSigma _{N}}\).

Instead of proving Lemma 8 directly we show the following stronger result.

Lemma 9

Let \(G \in \mathrm {DSG}_{\varSigma _{N}}\). For the grammar \(G_{(X,i)}\) over \(\varSigma ' = \varSigma _{N \cup N'}\) originating from G by eliminating the \((X, i)\)-local recursion, it holds that \(\forall H \in \mathrm {HG}_{{\varSigma _N}}, K \in \mathrm {HG}_{\varSigma }: (H \mathop {\Rightarrow }\limits ^{G} K) \Leftrightarrow (H \mathop {\Rightarrow }\limits ^{G_{(X,i)}} K)\).

Fig. 31 Doubly-linked list: \(G_{(B',2)}\)

Proof

Note that, as HRGs are context-free, for every \(H \in \mathrm {HG}_{{\varSigma _N}}\) and \(e_1, e_2 \in E_H\) we have that \(H[K_1 / e_1][K_2 / e_2] = H[K_2 / e_2][K_1 / e_1]\). Therefore we can reorder replacement sequences as long as we ensure that edges are not replaced before they are introduced, i.e., \(e_j \in E_{K_i} \Longrightarrow j>i\). For each replacement [K / e] of a sequence we define the set of successors of [K / e] recursively by

$$\begin{aligned} successors ([K / e]) = \bigcup _{e' \in E_K^N} successors ([K' / e']) \cup E_K^N. \end{aligned}$$
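
Reading \(K'\) as the graph that replaces the edge \(e'\) later on in the sequence, the successor set can be computed as in the following small sketch, which reuses the hypergraph encoding assumed earlier (the map `repl` from nonterminal edges to their replacing graphs is likewise an assumption about the encoding):

```python
def successors(K, repl, nonterminals):
    """All nonterminal edges introduced, directly or transitively, by a
    replacement step [K/e]; `repl` maps each such edge to the graph that
    replaces it later in the sequence (if it is replaced at all)."""
    result = set()
    for e2, label in K.lab.items():
        if label in nonterminals:
            result.add(e2)
            if e2 in repl:
                result |= successors(repl[e2], repl, nonterminals)
    return result
```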

We split \(G_{(X,i)}\) into the set of old production rules \(P_{old}\), i.e., productions which are adopted from G without modification, and the set of new production rules \(P_{new}\), i.e., productions generated during the elimination of local \((X, i)\)-recursion. We know that all rules from \(P_{new}\) are of the form \(X \rightarrow H_B\) or \(B \rightarrow J_n \mid J_t\), where \(H_B\) and \(J_n\) contain an edge labelled with B.

  • “\(\Longleftarrow \)”: We reorder a given derivation \(H \Rightarrow ^{*} K : H[K_1 / e_1]\dots [K_n / e_n] = K\) such that each \([K_i / e_i]\) with \({lab}(e_i) \rightarrow K_i \in P_{new}\) is directly succeeded by the replacements \(\{ [K / e] \mid e \in successors([K_i / e_i]),\ {lab}(e) \rightarrow K \in P_{new} \}\), i.e., the successors contained in \(P_{new}\). This yields the following derivation:

    $$\begin{aligned} H\underbrace{[K_1 / e_1] \dots [K_{i-1} / e_{i-1}]}_{\in P_{old}} \underbrace{[K_i / e_i] \dots [K_j / e_j]}_{\in P_{new}}[K_{j+1} / e_{j+1}] \dots = K \end{aligned}$$

    Note that there may be several such subderivations \([K_i / e_i]\dots [K_j / e_j]\) from \(P_{new}\), or none at all. (*) The replacements before the first such subderivation do not employ any rules from \(P_{new}\), and thus no edge labelled with B is introduced. Therefore \(H[K_1/ e_1] \dots [K_{i-1} / e_{i-1}] \in \mathrm {HG}_{{\varSigma _N}}\) can be derived in G accordingly. Consider the grammar G given in Fig. 32. By construction we know that \(P_{new}\) is of the form given in Fig. 33 and the subderivation has the following form:

    $$\begin{aligned} H[K / e_i][J_{n_1} / e_{i+1}]\dots [J_{n_{j-2}} / e_{j-1}] [J_t / e_j] = K' \in \mathrm {HG}_{{\varSigma _N}}. \end{aligned}$$

    We show by induction over the number m of applications of \(B \rightarrow J_n\) rules that for every such sequence there exists a corresponding derivation \(H \Rightarrow ^{*} K'\) in G. (IB) \(m = 0\): We consider the derivation \(H_1 \mathop {\Rightarrow }\limits ^{X \rightarrow H_B} H_2 \mathop {\Rightarrow }\limits ^{B \rightarrow J_t} K'\) of \(G_{(X,i)}\), see Fig. 33 for the rules. The corresponding derivation in G is given by \(H_1 \mathop {\Rightarrow }\limits ^{X \rightarrow X_n} H_2' \mathop {\Rightarrow }\limits ^{X \rightarrow X_t} K'\), where \(X \rightarrow X_n\) generates the nodes and edges from \(B \rightarrow J_t\) while the remaining graph is generated by \(X \rightarrow X_t\). (IH) A corresponding derivation can be found for an arbitrary but fixed number m of rule applications \(B \rightarrow J_n, m \in \mathbb {N}_0\). (IS) \(m \rightarrow m + 1\): Consider the derivation

    $$\begin{aligned} H \mathop {\Rightarrow }\limits ^{X \rightarrow H_B} H_1 \mathop {\Rightarrow }\limits ^{B \rightarrow J_{n_1}} \dots \mathop {\Rightarrow }\limits ^{B \rightarrow J_{n_{m+1}}} H_{m+1} \mathop {\Rightarrow }\limits ^{B \rightarrow J_t} K' \end{aligned}$$

    By (IH) we know that

    $$\begin{aligned} H \mathop {\Rightarrow }\limits ^{X \rightarrow H_B} H_1 \mathop {\Rightarrow }\limits ^{B \rightarrow J_{n_1}} \dots \mathop {\Rightarrow }\limits ^{B \rightarrow J_{n_m}} H_{m} \mathop {\Rightarrow }\limits ^{B \rightarrow J_t} K' \end{aligned}$$

    can be simulated in G via

    $$\begin{aligned} H \mathop {\Rightarrow }\limits ^{X \rightarrow X_n} H'_1 \mathop {\Rightarrow }\limits ^{X \rightarrow X_{n_1}} \dots \mathop {\Rightarrow }\limits ^{X \rightarrow X_{n_m}} H'_{m} \mathop {\Rightarrow }\limits ^{X \rightarrow X_t} K'. \end{aligned}$$

    We extend this derivation by inserting the rule \(X \rightarrow X_{n_{m+1}}\) corresponding to \(B \rightarrow J_{n_{m+1}}\) at the beginning of the derivation. See Fig. 34 for the correspondence of the \(G_{(X,i)}\)- and the G-derivation. Note that graph parts derived during the jth step of \(G_{(X,i)}\) are derived in G by derivation step \((m + 1) - (j - 1) = m + 2 - j\), where \(m + 1\) is the overall number of steps.

    $$\begin{aligned} H \mathop {\Rightarrow }\limits ^{X \rightarrow X_n} H'_1 \mathop {\Rightarrow }\limits ^{X \rightarrow X_{n_{m+1}}} H'_2 \mathop {\Rightarrow }\limits ^{X \rightarrow X_{n_1}} \dots \mathop {\Rightarrow }\limits ^{X \rightarrow X_{n_m}} H'_{m+1} \mathop {\Rightarrow }\limits ^{X \rightarrow X_t} K'. \end{aligned}$$
  • “\(\Longrightarrow \)”:  Analogously, by considering subderivations of local \((X, i)\)-recursive rules.

Fig. 32 Grammar G, locally recursive at \((X, i)\)

Fig. 33 Grammar \(G_{(X,i)}\) after eliminating local recursion

Fig. 34 Correspondence of derivations

The remaining derivation \(H[K_{j+1} / e_{j+1}] \dots = K\) is handled in exactly the same way, starting from (*). Thus every \(K \in \mathrm {HG}_{\varSigma }\) derivable in \(G_{(X,i)}\) is derivable in G. \(\square \)

Step 3: Generation of Greibach rules. Starting at the highest-order tentacle, LGNF can be established for each \(G_{(X,i)}\) by eliminating every non-reduction \((Y, j)\)-tentacle connected to external vertex i. This is possible because \((Y, j)\) is of higher order and thus already in LGNF. The elimination of rules is done as described in step 1 of the LGNF construction. We already saw that this does not change the induced language.

Step 4: Transforming new nonterminals to LGNF. In the final step we apply steps 1 to 3 to the nonterminals newly added in step 2. In doing so, further nonterminals may be introduced. To avoid nontermination, we merge nonterminals whose corresponding production rules have equal right-hand sides.

Theorem 7

After finitely many steps a nonterminal can be merged; thus the construction of the LGNF terminates.

Proof

By construction we know that for every locally recursive rule exactly two production rules for the new nonterminal B are generated. The first is obtained from the respective original rule, where the locally recursive edge and corresponding external vertices are removed. The second is a copy of the first rule with an additional nonterminal edge labelled with B on the right-hand side. This means that the subgraph used as basis for the construction of production rules for new nonterminals remains the same. In addition exactly one edge (labelled with B) is used for construction. As there exist only finitely many possibilities to combine subgraph and edge, eventually a nonterminal will be produced whose associated rules coincide with those of an already existing nonterminal (up to renaming). Thus we can merge these nonterminals as any further constructed production rules would be redundant. \(\square \)

Note that step 2 is non-deterministic as the order on the external vertices can be chosen arbitrarily. Unnecessary steps introduced by inappropriate orders can be avoided by considering permutations of external vertices to be isomorphic. If we reach an isomorphic nonterminal after some number of steps, all nonterminals in between represent the same language and can thus be merged as long as their ranks permit this.

Example 20

Applying steps 1 to 3 to \(G_{(B',1)}\) results in a new nonterminal \(B''\) isomorphic to L. Thus we can merge L with \(B''\), and even with \(B'\), as the latter was introduced between the former two.

Since all steps executed while constructing the LGNF are language preserving, since the generated rules of each subgrammar \(G_{(X,i)}\) contain terminal edges at external node i for all X-rules, and since the construction terminates, the presented algorithm is correct. Note that the LGNF additionally ensures the increasingness property, as every production rule belongs to at least one \(G_{(X,i)}\), which is composed of rules with terminal edges at external vertex ext(i).

Cite this article

Heinen, J., Jansen, C., Katoen, JP. et al. Juggrnaut: using graph grammars for abstracting unbounded heap structures. Form Methods Syst Des 47, 159–203 (2015). https://doi.org/10.1007/s10703-015-0236-1
