1 Introduction

Contextual string grammars were introduced by Solomon Marcus [14] with motivations arising from descriptive linguistics. A contextual string grammar consists of a finite set of strings (axioms) and a finite set of productions, which are pairs (s, c) where s is a string, the selector, and c is the context, i.e., a pair of strings, \(c=(u,v),\) over the alphabet under consideration. Starting from an axiom, contexts are added iteratively as indicated by the productions, which yields new strings. In contrast to usual sequential string grammars in the Chomsky hierarchy (e.g., see [20]), these contextual string grammars are pure grammars where new strings are not obtained by rewriting, but by adjoining strings. Several classes of contextual grammars have been introduced and investigated, e.g., see [3, 17] for surveys on the area.

The idea of contextual productions then was also introduced for multi-dimensional array grammars, for instance, to carry over ideas from formal languages to the processing of digital images. In the area of two-dimensional picture languages, e.g., see [12, 16, 18, 19], different kinds of array grammars, both isometric and non-isometric ones, have been proposed, motivated by many applications such as character recognition (also confer [4]), cluster analysis of patterns, and so on. Isometric contextual array grammars were introduced in [11].

Regulated rewriting with different control mechanisms has been studied extensively especially for string grammars (e.g., see [2]), for example, grammars with control languages and matrix grammars, but then also for array grammars, e.g., see [9]. Non-isometric contextual array grammars (with regulation) were considered in [7, 8, 13].

In this paper we consider matrix contextual array grammars and contextual array grammars with regular control and examine their generative power. In the 1-dimensional case, we obtain special results: the family of 1-dimensional array languages generated by contextual array grammars with regular control languages coincides with the family of regular 1-dimensional array languages over unary alphabets and with array images of the linear languages over alphabets with more than one letter; already for binary alphabets, regular control is strictly more powerful than matrix control, a phenomenon rarely observed in regulated rewriting (confer [10]).

2 Definitions

For notions and notations as well as results related to formal language theory we refer to books like [2]. The family of \(\lambda \)-free (\(\lambda \) denotes the empty string) regular string languages (over a k-letter alphabet) is denoted by \(\mathcal {L}\left( REG\right) \) (\(\mathcal {L}\left( REG^{k}\right) \)). For the definitions and notations for arrays and sequential array grammars we refer to [9, 18, 22].

Let \(\mathbb {Z}\) be the set of integers and \(\mathbb {N}\) be the set of positive integers. Let \(d\in \mathbb {N}\). A \( d \) -dimensional array \(\mathcal {A}\) over the alphabet V is a mapping \(\mathcal {A}:\mathbb {Z}^{d}\rightarrow V\cup \left\{ \#\right\} \) where \(shape\left( \mathcal {A}\right) =\left\{ v\in \mathbb {Z}^{d}\mid \mathcal {A}\left( v\right) \ne \#\right\} \) is finite and \(\#\notin V\) is called the blank symbol. We usually write \(\mathcal {A}=\left\{ \left( v,\mathcal {A}\left( v\right) \right) \mid v\in shape\left( \mathcal {A} \right) \right\} \). The set of all d-dimensional arrays over V is denoted by \(V^{*d}.\) The empty array \(\varLambda _{d}\) in \(V^{*d}\) satisfies \(shape(\varLambda _{d})=\emptyset \). Moreover, we define \(V^{+d}=V^{*d}\setminus \left\{ \varLambda _{d}\right\} .\)

Let \(v\in \mathbb {Z}^{d}\). Then the (linear) translation \(\tau _{v}:\mathbb {Z}^{d}\rightarrow \mathbb {Z}^{d}\) is defined by \(\tau _{v}\left( w\right) =w+v \) for all \(w\in \mathbb {Z}^{d}\), and for any array \(\mathcal {A}\in V^{*d} \) we define \(\tau _{v}\left( \mathcal {A}\right) \), the corresponding d-dimensional array translated by v, by \(\left( \tau _{v}(\mathcal {A})\right) \left( w\right) =\mathcal {A}\left( w-v\right) \) for all \(w\in \mathbb {Z}^{d}.\) The vector \(\left( 0,...,0\right) \in \mathbb { Z}^{d}\) is denoted by \(\varOmega _{d}\).

Usually (see [18]) arrays are regarded as equivalence classes of arrays with respect to linear translations. The equivalence class \(\left[ \mathcal {A}\right] \) of an array \(\mathcal {A}\in V^{*d}\) satisfies \(\left[ \mathcal {A}\right] =\left\{ \mathcal {B}\in V^{*d}\mid \mathcal {B}=\tau _{v}\left( \mathcal {A}\right) \text { for some }v\in \mathbb { Z}^{d}\right\} \). The set of all equivalence classes of d -dimensional arrays over V with respect to linear translations is denoted by \(\left[ V^{*d}\right] \), and this bracket notation carries over to classes of array languages, as well.
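The definitions of arrays, translations, and equivalence classes can be illustrated by a minimal Python sketch. It assumes that an array is stored as a dictionary from positions to symbols (blank positions are simply absent); the names `translate`, `canonical`, and `equivalent` are hypothetical helpers chosen for this illustration and do not appear in the paper.

```python
from typing import Dict, Tuple

Pos = Tuple[int, ...]
Array = Dict[Pos, str]  # assumption: only the non-blank positions (the shape) are stored


def translate(arr: Array, v: Pos) -> Array:
    """tau_v(A): (tau_v(A))(w) = A(w - v), i.e. every cell at u moves to u + v."""
    return {tuple(x + y for x, y in zip(u, v)): s for u, s in arr.items()}


def canonical(arr: Array) -> Array:
    """One representative of [A]: shift so that every coordinate has minimum 0."""
    if not arr:
        return {}
    d = len(next(iter(arr)))
    mins = tuple(min(u[i] for u in arr) for i in range(d))
    return translate(arr, tuple(-m for m in mins))


def equivalent(a: Array, b: Array) -> bool:
    """True iff b is a translation of a, i.e. both lie in the same equivalence class."""
    return canonical(a) == canonical(b)


A = {(0, 0): "a", (0, 1): "a", (1, 0): "a"}
B = translate(A, (5, -2))
```

Choosing the representative with minimal coordinates 0 is only one possible convention; any fixed choice of representative would serve the same purpose.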

As many results for d-dimensional arrays for a specific d can be taken over immediately for higher dimensions, we introduce special notions:

Let \(n,m\in \mathbb {N}\) with \(n\le m.\) For \(n<m,\) the natural embedding \(i_{n,m}:\mathbb {Z}^{n}\rightarrow \mathbb {Z}^{m}\) is defined by \(i_{n,m}\left( v\right) =\left( v,\varOmega _{m-n}\right) \) for all \(v\in \mathbb {Z}^{n};\) for \(n=m\) we define \(i_{n,n}:\mathbb {Z}^{n}\rightarrow \mathbb {Z}^{n}\) by \(i_{n,n}\left( v\right) =v\) for all \(v\in \mathbb {Z}^{n}.\) To an n-dimensional array \(\mathcal {A}\in V^{+n}\) with \(\mathcal {A} =\left\{ \left( v,\mathcal {A}\left( v\right) \right) \mid v\in shape\left( \mathcal {A}\right) \right\} \) we assign the m-dimensional array \( i_{n,m}\left( \mathcal {A}\right) =\left\{ \left( i_{n,m}\left( v\right) ,\mathcal {A}\left( v\right) \right) \mid v\in shape\left( \mathcal {A}\right) \right\} .\)

We can use the well-known graph-theoretic notion of a connected graph to define connected arrays. Let W be a non-empty finite subset of \(\mathbb {Z}^{d}\). We associate a graph g(W) to W with vertex set W and an edge between \(v,w\in W\) if and only if \(\left\| v-w\right\| =1\), where the norm \(\left\| u\right\| \) of a vector \(u\in \mathbb {Z}^{d}\), \(u=\left( u\left( 1\right) ,...,u\left( d\right) \right) \), is defined by \(\left\| u\right\| =\max \left\{ \left| u\left( i\right) \right| \mid 1\le i\le d\right\} .\) Then W is said to be connected if g(W) is connected. There is a natural bijection between the (equivalence classes of) 1-dimensional connected arrays and strings: for any equivalence class of 1-dimensional arrays \(\mathcal {A}=\left[ \left\{ ((i-1),a_{i}) \mid 1\le i\le n\right\} \right] \) we define its string image as \(str(\mathcal {A})=a_{1}\ldots a_{n}\); the string \(w=a_{1}\ldots a_{n}\) can be interpreted as the array \(arr\left( w\right) =\left\{ ((i-1),a_{i}) \mid 1\le i\le n\right\} \). In the standard way, these notions are extended from strings and arrays to sets of strings and arrays.
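The connectedness test and the bijection between strings and connected 1-dimensional arrays can be sketched as follows. This is a minimal illustration under the assumption that a shape is a set of integer tuples and an array is a dictionary from positions to symbols; the function names are hypothetical.

```python
from collections import deque


def is_connected(shape):
    """Check whether g(W) is connected, with an edge iff the max-norm distance is 1."""
    shape = set(shape)
    if not shape:
        return False  # W is required to be non-empty
    start = next(iter(shape))
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in shape:
            if w not in seen and max(abs(a - b) for a, b in zip(v, w)) == 1:
                seen.add(w)
                queue.append(w)
    return seen == shape


def arr(word):
    """arr(w): place the i-th letter of w (counting from 1) at position (i - 1)."""
    return {(i,): c for i, c in enumerate(word)}


def str_image(array):
    """str(A) for a connected 1-dimensional array; independent of the translation."""
    assert is_connected(array.keys())
    return "".join(array[p] for p in sorted(array))
```

Note that with the maximum norm, diagonal neighbours count as adjacent, so `{(0, 0), (1, 1)}` is connected in this sense.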

Example 1

Consider the language \(L_{1}\) of connected 2-dimensional arrays

$$\begin{aligned} L_{1}=\bigg \{ \big \{ \left( \left( 0,i\right) ,a\right) \mid 0\le i\le n \big \} \cup \big \{ \left( \left( j,0\right) ,a\right) \mid 1\le j\le m \big \}\,\, \biggr |\,\, n,m\in \mathbb {N}\bigg \}. \end{aligned}$$
$$\begin{aligned}{}\begin{array}[b]{lllll} a &{} &{} &{} &{} \\ a &{} &{} &{} &{} \\ a &{} &{} &{} &{} \\ a &{} a &{} a &{} a &{} a \end{array} \end{aligned}$$

An example of these L-shaped arrays (for \(n=3\) and \(m=4\)) from \(\left[ L_{1}\right] \) is depicted above. Observe that both arms of these arrays can have arbitrary lengths. \(\square \)

Definition 1

A regular d-dimensional array grammar is specified as \(G=\left( d,N,T,\#,P,\left\{ \left( v_S,S\right) \right\} \right) \) where N is the alphabet of non-terminal symbols, T is the alphabet of terminal symbols, \(N\cap T=\emptyset ,\) \(\#\notin N\cup T\); P is a finite non-empty set of regular d-dimensional array productions over \(N\cup T;\) moreover, \(v_S\in \mathbb {Z}^d\), and \(S\in N\) is the start symbol. A regular d-dimensional array production either is of the form \(A\rightarrow b\), \(A\in N\), \(b\in T\), or \(Av\#\rightarrow bC\), \(A,C\in N\), \(b\in T\), \(v\in \mathbb {Z}^{d}\) with \(\left\| v\right\| =1\). The application of \(A\rightarrow b\) means replacing A by b in a given array. \(Av\#\rightarrow bC\) can be applied if in the underlying array we find a position u occupied by A and a blank symbol at position \(u+v\); A then is replaced by b, and \(\#\) by C. The array language generated by G is the set of all d-dimensional arrays derivable from the initial array \(\left\{ \left( v _{S},S\right) \right\} \). The family of \(\varLambda \)-free d-dimensional array languages (of equivalence classes) of arrays over a k-letter alphabet generated by regular d-dimensional array grammars is denoted by \(\mathcal {L}\left( d\text {-}REGA^{k}\right) \) (\(\left[ \mathcal {L}\left( d\text {-}REGA^{k}\right) \right] \)). For arbitrary alphabets, we omit the superscript k.

The following results for 1-dimensional array languages are folklore:

Theorem 1

For all \(k\ge 1\), \(\left[ \mathcal {L}\left( 1\text {-}REGA^{k}\right) \right] = \left[ arr\left( \mathcal {L}\left( REG^{k}\right) \right) \right] \) and \(str\left( \left[ \mathcal {L}\left( 1\text {-}REGA^{k}\right) \right] \right) = \mathcal {L}\left( REG^{k}\right) \).

Let us mention the close similarities between the way 1-dimensional regular array grammars work and Lindenmayer systems with apical growth [21]. Another similar development can be found within Watson-Crick systems [15].

3 Contextual Array Grammars

We now turn our attention to the main variants of contextual array grammars considered in this paper.

Definition 2

A d-dimensional contextual array grammar (\(d\in \mathbb {N}\)) is a construct \(G=\left( d,V,\#,P,A\right) \) where V is an alphabet not containing the blank symbol \(\#,\) A is a finite set of axioms, i.e., of d-dimensional arrays in \(V^{+d},\) and P is a finite set of rules of the form \(\left( U_{\alpha },\alpha ,U_{\beta },\beta \right) \) where

  • (i) \(U_{\alpha },U_{\beta }\subseteq \mathbb {Z}^{d},\) \(U_{\alpha }\cap U_{\beta }=\emptyset ,\) and \(U_{\alpha },U_{\beta }\) are finite and non-empty;

  • (ii) \(\alpha :U_{\alpha }\rightarrow V\ \)and \(\beta :U_{\beta }\rightarrow V.\)

\(\left( U_{\alpha },\alpha \right) \) corresponds with the selector and \(\left( U_{\beta },\beta \right) \) with the context of the production \(\left( U_{\alpha },\alpha ,U_{\beta },\beta \right) ;\) \( U_{\alpha }\) is called the selector area, and \(U_{\beta }\) is the context area. As the sets \(U_{\alpha }\) and \(U_{\beta }\) are uniquely determined by \(\alpha \) and \(\beta \), we will also represent \(\left( U_{\alpha },\alpha ,U_{\beta },\beta \right) \) by \(\left( \alpha ,\beta \right) \) only.

For \(\mathcal {C}_{1},\mathcal {C}_{2}\in V^{+d}\) we say that \(\mathcal {C}_{2}\) is directly derivable from \(\mathcal {C}_{1}\) by the contextual array production \(p\in P\), \(p=\left( U_{\alpha },\alpha ,U_{\beta },\beta \right) \) (we write \(\mathcal {C}_{1}\Longrightarrow _{p}\mathcal {C}_{2}\)), if there exists a vector \(v\in \mathbb {Z}^{d}\) such that

  • \(\mathcal {C}_{1}\left( w\right) =\mathcal {C}_{2}\left( w\right) =\alpha \left( \tau _{-v}\left( w\right) \right) \) for all \(w\in \tau _{v}\left( U_{\alpha }\right) ,\)

  • \(\mathcal {C}_{1}\left( w\right) =\#\) for all \(w\in \tau _{v}\left( U_{\beta }\right) ,\)

  • \(\mathcal {C}_{2}\left( w\right) =\beta \left( \tau _{-v}\left( w\right) \right) \) for all \(w\in \tau _{v}\left( U_{\beta }\right) ,\)

  • \(\mathcal {C}_{1}\left( w\right) =\mathcal {C}_{2}\left( w\right) \) for all \(w\in \mathbb {Z}^{d}\setminus \tau _{v}\left( U_{\alpha }\cup U_{\beta }\right) .\)

Hence, if in \(\mathcal {C}_1\) we find a subpattern that corresponds with the selector \(\alpha \) and only blank symbols at the places corresponding with \(\beta ,\) we can add the context \(\beta \) thus obtaining \(\mathcal {C}_2\). For every \(\mathcal {B}_{1},\mathcal {B}_{2} \in \left[ V^{+d}\right] \) we say that \(\mathcal {B}_{2}\) is directly derivable from \(\mathcal {B}_{1}\) by the contextual array production \(p\in P\), \(p=\left( U_{\alpha },\alpha ,U_{\beta },\beta \right) \), denoted \(\mathcal {B}_{1}\Longrightarrow _{p}\mathcal {B}_{2},\) if and only if \(\mathcal {C}_{1}\Longrightarrow _{p}\mathcal {C}_{2}\) for some \(\mathcal {C}_{1}\in \mathcal {B}_{1}\) and \(\mathcal {C}_{2}\in \mathcal {B}_{2}\). \(\mathcal {C}_{1}\Longrightarrow _{G}\mathcal {C}_{2}\) (\(\mathcal {B}_{1}\Longrightarrow _{G}\mathcal {B}_{2}\)) means that \(\mathcal {C}_{1}\Longrightarrow _{p}\mathcal {C}_{2}\) (\(\mathcal {B}_{1}\Longrightarrow _{p}\mathcal {B}_{2}\)) for some \(p\in P\).
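The four conditions for a direct derivation step can be checked mechanically. The following Python sketch is one possible reading of the definition, assuming arrays and the mappings \(\alpha \), \(\beta \) are dictionaries from positions to symbols; `apply_at` and `successors` are hypothetical names introduced only for this illustration.

```python
def apply_at(c1, alpha, beta, v):
    """Return C2 with C1 ==>_p C2 for the translation vector v, or None if p is not applicable."""
    def shift(u):
        return tuple(x + y for x, y in zip(u, v))

    # The selector must occur in C1 at the translated selector area.
    if any(c1.get(shift(u)) != s for u, s in alpha.items()):
        return None
    # The translated context area must be completely blank in C1.
    if any(shift(u) in c1 for u in beta):
        return None
    # C2 agrees with C1 everywhere else and carries beta on the context area.
    c2 = dict(c1)
    c2.update({shift(u): s for u, s in beta.items()})
    return c2


def successors(c1, alpha, beta):
    """All arrays directly derivable from c1 by the rule (alpha, beta)."""
    u0 = next(iter(alpha))  # some selector position; it must land on a non-blank cell
    result = []
    for w in c1:
        v = tuple(x - y for x, y in zip(w, u0))
        c2 = apply_at(c1, alpha, beta, v)
        if c2 is not None:
            result.append(c2)
    return result


# Illustration: a selector of two stacked a's with a context cell on top.
alpha = {(0, 0): "a", (0, 1): "a"}
beta = {(0, 2): "a"}
c1 = {(0, 0): "a", (0, 1): "a", (1, 0): "a"}
```

Since the selector area is non-empty, it suffices to try those vectors v that map one fixed selector position onto a non-blank cell of the current array, which keeps the search finite.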

The array language generated by G is defined as

$$\begin{aligned} \begin{array}{lcl} L\left( G\right)= & {} \left\{ \mathcal {C}\in V^{+d}\mid \mathcal {A} \Longrightarrow _{G}^{*}\mathcal {C}\text { for some }\mathcal {A}\in A\right\} . \end{array} \end{aligned}$$

The special type of d-dimensional contextual array grammars where axioms are connected and rule applications preserve connectedness is denoted by d-ContA, the corresponding family of d-dimensional array languages by \(\mathcal {L}\left( d\text {-}ContA\right) \); by \(\mathcal {L} \left( d\text {-}ContA^{k}\right) \) we denote the corresponding family of d -dimensional array languages over a k-letter alphabet.

Remark 1

As we mostly are interested in (families of) equivalence classes of arrays, a d-dimensional contextual array grammar \(\left[ G\right] \) for generating \(\left[ L\right] \) for \(L\in \mathcal {L}\left( d\text {-}ContA\right) \) being generated by a d -dimensional contextual array grammar \(G=\left( d,V,\#,P,A\right) \) with \(A=\left\{ \mathcal {A}_{i}\mid 1\le i\le n\right\} \) will be specified by writing \(\left[ G\right] =\left( d,V,\#,P,A^{\prime }\right) \) where \(A^{\prime }=\left\{ \mathcal {A}_{i}^{\prime }\mid 1\le i\le n\right\} \) such that \(\mathcal {A}_{i}^{\prime }\in \left[ \mathcal {A}_{i} \right] ,\) \(1\le i\le n\), which means specifying an axiom \(\mathcal {A}_{i}\) by one array from \(\left[ \mathcal {A}_{i}\right] \).

Example 2

Any finite d-dimensional array language of connected arrays \(L\subset T^{+d}\) is in \(\mathcal {L} \left( d\text {-}ContA\right) \) as \(L=L\left( G_{L}\right) \) where \( G_{L}=\left( d,T,\#,\emptyset ,L\right) \).    \(\square \)

Example 3

We now show how the language \(L_{1}\) from Example 1 can be generated by the contextual array grammar \(G_{1}\), i.e., \(L_{1}\in \mathcal {L}\left( 2\text {-} ContA^{1}\right) \): \(G_{1}=\left( 2,\left\{ a\right\} ,\#,P_{1}, \left\{ \mathcal {A}_{1}\right\} \right) \) where \(\mathcal {A}_{1}=\left\{ \left( \left( 0,0\right) ,a\right) ,\left( \left( 0,1\right) , a\right) ,\left( \left( 1,0\right) ,a\right) \right\} \) is the only axiom and \(P_{1}\) consists of the two productions \(p_{u}\) and \(p_{r}\):

$$\begin{aligned} \begin{array}{ccl} p_{u} &{} = &{} \left( \left\{ \left( 0,0\right) , \left( 0,1\right) \right\} , \left\{ \left( \left( 0,0\right) ,a\right) ,\left( \left( 0,1\right) ,a\right) \right\} , \left\{ \left( 0,2\right) \right\} ,\left\{ \left( \left( 0,2\right) ,a\right) \right\} \right) , \\ p_{r} &{} = &{} \left( \left\{ \left( 0,0\right) ,\left( 1,0\right) \right\} , \left\{ \left( \left( 0,0\right) ,a\right) , \left( \left( 1,0\right) ,a\right) \right\} , \left\{ \left( 2,0\right) \right\} ,\left\{ \left( \left( 2,0\right) ,a\right) \right\} \right) . \end{array} \end{aligned}$$

As the selector area \(U_{\alpha }\) and the context area \(U_{\beta }\) in a contextual array production of the form \(\left( U_{\alpha },\alpha ,U_{\beta },\beta \right) \) are disjoint, both \(\alpha \) and \(\beta \) can be represented within only one pattern, i.e., \(p_{u}\) and \(p_{r}\) can be represented in a more depictive way by the patterns shown below (the symbols of the selector are enclosed in boxes).

$$\begin{aligned} p_{u}=\begin{array}{c} a \\ \fbox {{ a}}\\ \fbox {{ a}}\end{array},\quad p_{r}=\begin{array}{ccc} \fbox {{ a}}&\fbox {{ a}}&a.\end{array}\end{aligned}$$

The example of the L-shaped array for \(n=3\) and \(m=4\) then is generated by twice applying rule \(p_{u}\) and three times applying rule \(p_{r}\), in any order. We also observe that every intermediate array obtained by applying these rules is in \(L_{1}\), too. Obviously, by the definition of equivalence classes of arrays, we also have \(\left[ L\left( G_{1}\right) \right] =\left[ L_{1}\right] \in \left[ \mathcal {L}\left( 2\text {-}ContA^{1}\right) \right] \).

\(\left[ \mathcal {A}_{1}\right] \) can be described in a more depictive way by \( \begin{array}{cc} a &{} \\ a &{} a \end{array} \), i.e., the contextual array grammar \(\left[ G_{1}\right] \) for \(\left[ L\left( G_{1}\right) \right] \) can also be written as \(\left[ G_{1}\right] =\left( 2,\left\{ a\right\} ,\#,P_{1},{\left\{ \begin{array}{cc} a &{} \\ a &{} a \end{array} \right\} } \right) \) (see Remark 1). In the following, the axiom(s) often will just be given in such a pictorial variant.    \(\square \)
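As a sanity check of Example 3, the following Python sketch enumerates all arrays derivable from the axiom within a bounded number of steps and compares them with the definition of \(L_{1}\). It assumes the dictionary representation of arrays used for illustration here; `step`, `derive`, and `l_shape` are hypothetical helper names, and the bound of five steps is arbitrary.

```python
def step(c, rule):
    """All arrays directly derivable from c by rule = (alpha, beta)."""
    alpha, beta = rule
    u0 = next(iter(alpha))
    out = []
    for w in c:
        v = (w[0] - u0[0], w[1] - u0[1])

        def shift(u):
            return (u[0] + v[0], u[1] + v[1])

        selector_ok = all(c.get(shift(u)) == s for u, s in alpha.items())
        context_blank = all(shift(u) not in c for u in beta)
        if selector_ok and context_blank:
            c2 = dict(c)
            c2.update({shift(u): s for u, s in beta.items()})
            out.append(c2)
    return out


def derive(axiom, rules, max_steps):
    """Set of arrays (as frozensets of cells) reachable in at most max_steps steps."""
    seen = {frozenset(axiom.items())}
    frontier = [axiom]
    for _ in range(max_steps):
        nxt = []
        for c in frontier:
            for rule in rules:
                for c2 in step(c, rule):
                    key = frozenset(c2.items())
                    if key not in seen:
                        seen.add(key)
                        nxt.append(c2)
        frontier = nxt
    return seen


def l_shape(n, m):
    """The array of L_1 with vertical arm up to (0, n) and horizontal arm up to (m, 0)."""
    cells = {(0, i): "a" for i in range(n + 1)}
    cells.update({(j, 0): "a" for j in range(1, m + 1)})
    return cells


P_U = ({(0, 0): "a", (0, 1): "a"}, {(0, 2): "a"})
P_R = ({(0, 0): "a", (1, 0): "a"}, {(2, 0): "a"})
A_1 = {(0, 0): "a", (0, 1): "a", (1, 0): "a"}

derived = derive(A_1, [P_U, P_R], 5)
expected = {frozenset(l_shape(n, m).items())
            for n in range(1, 7) for m in range(1, 7) if (n - 1) + (m - 1) <= 5}
```

Within this bound, `derived` and `expected` coincide; in particular, the array for \(n=3\) and \(m=4\) is reached after exactly five rule applications.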

Example 4

For the singleton language \(L_{\bot }= { \left\{ \begin{array}{lllll} &{} &{} a &{} &{} \\ &{} &{} a &{} &{} \\ a &{} a &{} a &{} a &{} a \end{array} \right\} } \subset \left[ \left\{ a\right\} ^{+2}\right] \), we have \(L_{\bot }\in \left[ \mathcal {L}\left( 2\text {-}ContA\right) \right] \setminus \left[ \mathcal {L}\left( 2\text {-}REGA\right) \right] \). As we can take \(L_{\bot }\) (as any finite language) as a set of axioms, containment in \(\left[ \mathcal {L}\left( 2\text {-}ContA\right) \right] \) is clear. Conversely, any regular array grammar has to scan the non-blank symbols of the array A, which is impossible, as the underlying graph g(shape(A)) is not Hamiltonian.    \(\square \)

Theorem 2

\(\left[ \mathcal {L}\left( 1\text {-} REGA^{1}\right) \right] \subseteq \left[ \mathcal {L}\left( 1\text {-} ContA^{1}\right) \right] \).

Proof

Due to the results from Theorem 1, it only remains to show that \(\left[ arr\left( \mathcal {L}\left( REG^{1}\right) \right) \right] \subseteq \left[ \mathcal {L}\left( 1\text {-}ContA^{1}\right) \right] \).

From [1, Theorem 4.4], we deduce that any infinite language \(L\subseteq \left\{ a\right\} ^{+}\) in \(\mathcal {L}\left( REG^{1}\right) \) can be written in the form \(L=\{a^{s_1},a^{s_2},\dots , a^{s_t}\}\cup \bigcup _{i=1}^m \{a^{k\cdot n+ d_i}\mid n\ge 0\}\) for some numbers \(k, k\le d_1<d_2<...<d_m<2k, 0\le s_1<s_2< ...<s_t<k\). The 1-dimensional contextual array grammar now is constructed using a context of length k and putting the words \(a^{s_j},\ 1\le j\le t\), and \(a^{d_i},\ 1\le i\le m\), into the set of axioms, i.e., we define the 1-dimensional contextual array grammar \(G\left( L\right) =\left( 1,\left\{ a\right\} ,\#,P,A\right) \) with \(A=\left\{ arr\left( a^{s_{j}}\right) \mid 1\le j\le t\right\} \cup \left\{ arr\left( a^{d_{i}}\right) \mid 1\le i\le m\right\} \) and \(P=\left\{ \fbox {a}^{k}a^{k}\right\} \). Obviously, \(\left[ L\left( G\left( L\right) \right) \right] =\left[ arr\left( L\right) \right] \). The 1-dimensional contextual array grammar \(\left[ G\left( L\right) \right] \) for \(\left[ L\left( G\left( L\right) \right) \right] \) can also be written as \(\left[ G\left( L\right) \right] =\left( 1,\left\{ a\right\} ,\#,P,A^{\prime }\right) \) with \(A^{\prime }=\left\{ a^{s_{j}}\mid 1\le j\le t\right\} \cup \left\{ a^{d_{i}}\mid 1\le i\le m\right\} \) (compare with Remark 1).

For the sake of completeness we mention that every finite array language \(A=\left\{ arr\left( a^{s_{j}}\right) \mid 1\le j\le t\right\} \) is generated by the 1-dimensional contextual array grammar \(G\left( L\right) =\left( 1,\left\{ a\right\} ,\#,P,A\right) \) with \(P=\emptyset \).    \(\square \)
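The construction in the proof can be checked on a small instance. The sketch below assumes that a unary connected 1-dimensional array is identified (up to translation) with its length, so that the single rule with selector \(a^{k}\) and context \(a^{k}\) maps a length \(\ell \ge k\) to \(\ell +k\). The concrete values \(k=3\), \(s_{1}=1\), \(s_{2}=2\), \(d_{1}=3\), \(d_{2}=5\) are made up for illustration only.

```python
def language_lengths(finite, k, offsets, bound):
    """Lengths n <= bound with a^n in L = {a^s} u U_i {a^(k*n + d_i) | n >= 0}."""
    result = {s for s in finite if s <= bound}
    for d in offsets:
        n = 0
        while k * n + d <= bound:
            result.add(k * n + d)
            n += 1
    return result


def generated_lengths(axioms, k, bound):
    """Lengths <= bound generated from the axioms by the rule (selector a^k, context a^k)."""
    seen = set()
    todo = [x for x in axioms if x <= bound]
    while todo:
        length = todo.pop()
        if length in seen:
            continue
        seen.add(length)
        # The selector a^k must fit into the array, and the k cells next to it must be blank.
        if length >= k and length + k <= bound:
            todo.append(length + k)
    return seen


# Hypothetical instance: k = 3, finite part {a, aa}, periodic part with d in {3, 5}.
K = 3
FINITE = {1, 2}
OFFSETS = {3, 5}
BOUND = 50
```

The rule cannot be applied to the short axioms \(a^{s_{j}}\) because \(s_{j}<k\), which is exactly why the finite part is not pumped.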

Remark 2

Following the definition already given in [11], our d-dimensional extension of (external) contextual grammars only appends at one location, while external contextual string grammars as originally defined by Solomon Marcus, see [14], append to both ends of a string at the same time. This design decision has two main reasons. First, it is not quite clear what the d-dimensional counterpart of external contextual grammars would really mean: for instance, for \(d=2\), should we allow appending on both ends of a row or column at the same time, as we did in [8] for the case of non-isometric contextual array grammars? Or, should we rather append on ‘all ends’? Obviously, this situation becomes even more intricate for higher dimensions. Yet second and even more important, appending at both sides of a string, i.e., a 1-dimensional array, in parallel can easily be simulated sequentially by a matrix with two components. It is therefore easy to see that in the 1-dimensional case, the string images of the arrays generated by contextual array grammars with matrix control exactly correspond with the string languages generated by external contextual string grammars. This means that for the regulated variants discussed in the following, any variant that can be conceivably defined for the d-dimensional analogue of external contextual grammars, in the 1-dimensional case should lead to the same results as the original variant of contextual array grammars defined in [11] and taken as the basis in this paper, too.

3.1 Matrix Contextual Array Grammars

Definition 3

A d-dimensional matrix contextual array grammar is a pair \(G_{M}=\left( G,M\right) \) where \(G=\left( d,V,\#,P,A\right) \) is a d-dimensional contextual array grammar and M is a finite set of sequences, called matrices, of rules from P, i.e., each element of M is of the form \(\left\langle p_{1},\cdots ,p_{n}\right\rangle ,\,n\ge 1,\) where \(p_{i}\in P\) for \(1\le i\le n\). Derivations in a matrix contextual array grammar are defined as in a contextual array grammar except that a single derivation step now consists of the sequential application of the rules of one of the matrices in M, in the order in which the rules are given in the matrix. The array language generated by \(G_{M}\) is the set of all d-dimensional arrays which can be derived from any of the axioms in A. The family of d-dimensional array languages of arrays generated by d-dimensional matrix contextual array grammars (over a k-letter alphabet) is denoted by \(\mathcal {L}\left( d\text {-} MContA\right) \) (\(\mathcal {L}\left( d\text {-}MContA^{k}\right) \)).

Example 5

Consider the language \(L_{2}\) of connected arrays given by

$$\begin{aligned} L_{2}=\bigg \{\big \{\left( \left( 0,0\right) ,a\right) \big \}\cup \big \{ \left( \left( 0,i\right) ,a\right) ,\left( \left( i,0\right) ,a\right) \mid 1\le i\le n\big \}\,\,\biggr |\,\,n\in \mathbb {N}\bigg \}, \end{aligned}$$

which contains L-shaped arrays as \(L_{1}\) from Example 1, but now with both arms having the same length. \(L_{2}\in \mathcal {L}\left( 2\text {-}MContA^{1}\right) \), as it can be generated by the 2-dimensional matrix contextual array grammar \(G_{M}=\left( G_{1},M\right) \) where \(G_{1}\) is the 2-dimensional contextual array grammar from Example 3 and \(M=\left\{ \left\langle p_{u},p_{r}\right\rangle \right\} \). The only derivations possible in \(\left[ G_{M}\right] \) for \(\left[ L_{2}\right] \in \left[ \mathcal {L}\left( 2\text {-}MContA^{1}\right) \right] \) (see Remark 1) are:

figure a

The single matrix \(\left\langle p_{u},p_{r}\right\rangle \) guarantees that both arms of the array grow in a synchronized way.    \(\square \)
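The synchronized growth in Example 5 can be reproduced with a short Python sketch. It assumes arrays are dictionaries from positions to symbols and treats one matrix application as a single derivation step; `step`, `apply_matrix`, and `equal_arms` are hypothetical names used only here.

```python
def step(c, rule):
    """All arrays directly derivable from c by rule = (alpha, beta)."""
    alpha, beta = rule
    u0 = next(iter(alpha))
    out = []
    for w in c:
        v = (w[0] - u0[0], w[1] - u0[1])

        def shift(u):
            return (u[0] + v[0], u[1] + v[1])

        if all(c.get(shift(u)) == s for u, s in alpha.items()) and \
                all(shift(u) not in c for u in beta):
            c2 = dict(c)
            c2.update({shift(u): s for u, s in beta.items()})
            out.append(c2)
    return out


def apply_matrix(c, matrix):
    """One derivation step of the matrix grammar: apply all rules of the matrix in order."""
    current = [c]
    for rule in matrix:
        current = [c2 for c1 in current for c2 in step(c1, rule)]
    return current


def equal_arms(n):
    """The array of L_2 whose two arms both have length n."""
    cells = {(0, 0): "a"}
    for i in range(1, n + 1):
        cells[(0, i)] = "a"
        cells[(i, 0)] = "a"
    return cells


P_U = ({(0, 0): "a", (0, 1): "a"}, {(0, 2): "a"})
P_R = ({(0, 0): "a", (1, 0): "a"}, {(2, 0): "a"})
AXIOM = equal_arms(1)

levels = [[AXIOM]]
for _ in range(4):
    levels.append([c2 for c in levels[-1] for c2 in apply_matrix(c, [P_U, P_R])])
```

Each level contains exactly one array, namely the one with arm length increased by one; the intermediate array obtained after applying only \(p_{u}\) has arms of different lengths and is never output.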

Theorem 3

For any \(d\ge 2\) and any \(k\ge 1\), we have \(\mathcal {L}\left( d\text {-}ContA^{k}\right) \subsetneqq \mathcal {L} \left( d\text {-}MContA^{k}\right) \) and \(\left[ \mathcal {L}\left( d\text {-}ContA^{k}\right) \right] \subsetneqq \left[ \mathcal {L}\left( d\text {-}MContA^{k}\right) \right] \).

Proof

The inclusion \(\mathcal {L}\left( d\text {-}ContA^{k}\right) \subseteq \mathcal {L}\left( d\text {-}MContA^{k}\right) \) and therefore also \(\left[ \mathcal {L}\left( d\text {-}ContA^{k}\right) \right] \subseteq \left[ \mathcal {L}\left( d\text {-}MContA^{k}\right) \right] \) is obvious from general results for grammars working on various kinds of objects and with specific regulating mechanisms, see [10].

For showing the strictness of the inclusion, we prove that the array language \(L_{2}\) from Example 5 cannot be generated by a 2-dimensional contextual array grammar; for dimensions \(d>2\), we just take \(\left[ i_{2,d}\left( L_{2}\right) \right] .\)

Now assume we could find a 2-dimensional contextual array grammar \(\left[ G=\left( 2,\left\{ a\right\} ,\#,P,A\right) \right] \) that generates \(\left[ L_{2}\right] \). As contextual grammars are pure grammars, \(\left[ A\right] \) is a finite subset of \(\left[ L\left( G\right) \right] \). As \(\left[ L\left( G\right) \right] \) is infinite, we would need an infinite number of rules to get \(\left[ L_{2}\right] \) which resembles the case of external contextual string grammars; in fact, as soon as the arms get long enough, we have to apply a rule which only grows the arm going up or only grows the arm going to the right, resulting in an array which contradicts the definition of \(\left[ L_{2}\right] \). It is obvious that we also have \(\left[ i_{2,d}\left( L_{2}\right) \right] \in \left[ \mathcal {L} \left( d\text {-}MContA^{k}\right) \right] \setminus \left[ \mathcal {L}\left( d\text {-}ContA^{k}\right) \right] \); this observation completes the proof.    \(\square \)

In the 1-dimensional case, the situation is different: as we shall prove later, see Theorem 6, \(\left[ \mathcal {L}\left( 1\text {-}ContA^{1}\right) \right] = \left[ \mathcal {L}\left( 1\text {-}MContA^{1}\right) \right] \), but for \(k\ge 2\), we still have \(\left[ \mathcal {L}\left( 1\text {-}ContA^{k}\right) \right] \subsetneqq \left[ \mathcal {L}\left( 1\text {-}MContA^{k}\right) \right] ,\) as the following example shows.

Example 6

Consider the non-regular language \(L_{n}=\left\{ a^{n}ba^{n}\mid n\ge 1\right\} \). By Theorem 1, there cannot exist an array grammar G of type 1-\(REGA^{2}\) such that \(\left[ L\left( G\right) \right] =\left[ arr\left( L_{n}\right) \right] \). Even more, there is no 1-dimensional contextual array grammar for \(L_n\). Namely, if this were the case, then first observe that there must be rules that append something to the right, as well as to the left of the array, and this should be possible infinitely often. Otherwise, the sequence of context additions would happen (finally) only on one side, which means that this behavior can again be simulated by some regular array grammar, contradicting our previous reasoning. Hence, there must be a rule that contains a sequence of a’s as its selector, say, \(arr(a^{r_s})\), and also a sequence of a’s, say, \(arr(a^{r_c})\) as its context in order to append \(a^{r_c}\) to the right of the current array, and likewise, there must be a rule that contains a sequence of a’s as its selector, say, \(arr(a^{\ell _s})\), and also a sequence of a’s, say, \(arr(a^{\ell _c})\) as its context in order to append \(a^{\ell _c}\) to the left of the current array. For sufficiently long arrays \(arr(a^nba^n)\), both rules can be applied, and arrays like \(arr(a^nba^{n+r_c})\) can be generated that do not belong to \(L_n\). Hence, \(L_n\notin \mathcal {L}(1\text {-}ContA)\).

Yet for the 1-dimensional matrix contextual array grammar \(\left[ G_{M}\right] =\left( G_{n},M_{n}\right) \) with \(\left[ G_{n}\right] =\left( 1,\left\{ a,b\right\} ,\#,P,\left\{ aba\right\} \right) \) where \(P=\left\{ p_{l},p_{r}\right\} \), \(p_{l}= \begin{array}{cc} a&\fbox {a} \end{array} ,\) \(p_{r}= \begin{array}{cc} \fbox {a}&a \end{array} \), and \(M_{n}=\left\{ \left\langle p_{l},p_{r}\right\rangle \right\} \), we have \(\left[ L\left( G_{M}\right) \right] =\left[ arr\left( L_{n}\right) \right] \). The single matrix \(\left\langle p_{l},p_{r}\right\rangle \) guarantees that the number of symbols a grows to the left and to the right in a synchronized way.    \(\square \)
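The behaviour of the matrix \(\left\langle p_{l},p_{r}\right\rangle \) from Example 6 can be traced with the following Python sketch. It assumes 1-dimensional arrays are dictionaries from integer positions to symbols; the helper names are hypothetical and chosen only for this illustration.

```python
def step(c, rule):
    """All 1-dimensional arrays directly derivable from c by rule = (alpha, beta)."""
    alpha, beta = rule
    u0 = next(iter(alpha))
    out = []
    for w in c:
        v = w - u0
        if all(c.get(u + v) == s for u, s in alpha.items()) and \
                all((u + v) not in c for u in beta):
            c2 = dict(c)
            c2.update({u + v: s for u, s in beta.items()})
            out.append(c2)
    return out


def apply_matrix(c, matrix):
    """Apply the rules of one matrix in the given order; this is one derivation step."""
    current = [c]
    for rule in matrix:
        current = [c2 for c1 in current for c2 in step(c1, rule)]
    return current


def str_image(c):
    """String image of a connected 1-dimensional array."""
    return "".join(c[i] for i in sorted(c))


P_L = ({1: "a"}, {0: "a"})  # context a to the left of the selector a
P_R = ({0: "a"}, {1: "a"})  # context a to the right of the selector a
AXIOM = {0: "a", 1: "b", 2: "a"}


def matrix_language(steps):
    """String images generated with at most `steps` applications of <p_l, p_r>."""
    words = []
    current = [AXIOM]
    for _ in range(steps + 1):
        words.extend(str_image(c) for c in current)
        current = [c2 for c in current for c2 in apply_matrix(c, [P_L, P_R])]
    return words
```

Applying \(p_{r}\) alone to the axiom yields the string image `abaa`, which lies outside \(L_{n}\); this is the effect the matrix control rules out.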

In addition, the following example even yields that for any \(k\ge 2\), \(\left[ \mathcal {L}\left( 1\text {-}MContA^{k}\right) \right] \) is incomparable with \(\left[ \mathcal {L}\left( 1\text {-}REGA^{k}\right) \right] .\)

Example 7

Consider the regular string language \(L_{r}=\left\{ ba^{n}b\mid n\ge 1\right\} \). Due to Theorem 1, there exists an array grammar of type 1-\(REGA^{2}\) \(G_{r}\) such that \(\left[ L\left( G_{r}\right) \right] =\left[ arr\left( L_{r}\right) \right] \). Yet on the other hand, there cannot exist an array grammar of type 1-\(MContA^{2}\) \(\left[ G\right] \) such that \(L\left( \left[ G\right] \right) =\left[ arr\left( L_{r}\right) \right] \), which can be proved by a simple pumping argument: The number of symbols a between the two symbols b can become arbitrarily large, but we only have a finite set of axioms A; as \(\left[ G\right] \) is a pure grammar, \(\left[ A\right] \subset \left[ L \right] \); yet \(\left[ G\right] \) can only grow these arrays in an external way, i.e., by adding symbols on the left or on the right, but in this way we are not able to grow the number of symbols a in the middle.    \(\square \)

3.2 Contextual Array Grammars with Regular Control

Definition 4

A d-dimensional contextual array grammar with regular control is a pair \(G_{C}=\left( G,L\right) \) where \(G=\left( d,V,\#,P,A\right) \) is a d-dimensional contextual array grammar and L is a regular string language over P. Derivations in a d-dimensional contextual array grammar with regular control are defined as in the contextual array grammar G except that in a successful derivation the sequence of applied rules has to be a word from L. The array language generated by \(G_{C}\) is the set of all d-dimensional arrays which can be derived from any of the axioms in A following a control word from L. The family of d-dimensional array languages of arrays generated by d-dimensional contextual array grammars over a k-letter alphabet with regular control is denoted by \(\mathcal {L}\left( \left( d\text {-}ContA^{k},REG\right) \right) \). The corresponding family of array languages of equivalence classes of arrays is denoted by using brackets in the notations.

As a general result (following [10]) we can state:

Theorem 4

For any \(d\ge 1\) and any \(k\ge 1\),

$$\begin{aligned} \left[ \mathcal {L}\left( d\text {-}ContA^{k}\right) \right] \subseteq \left[ \mathcal {L}\left( d\text {-}MContA^{k}\right) \right] \subseteq \left[ \mathcal {L}\left( \left( d\text {-}ContA^{k},REG\right) \right) \right] . \end{aligned}$$

Example 8

Consider the regular string language \(L_{r}=\left\{ ba^{n}b\mid n\ge 1\right\} \) from Example 7. We have shown that \(\left[ arr\left( L_{r}\right) \right] \in \left[ \mathcal {L} \left( 1\text {-}REGA^{2}\right) \right] \setminus \left[ \mathcal {L}\left( 1 \text {-}MContA^{2}\right) \right] \). Moreover, \(\left[ arr\left( L_{r} \right) \right] \in \left[ \mathcal {L}\left( \left( 1\text {-}ContA^{2},REG\right) \right) \right] \setminus \left[ \mathcal {L}\left( 1\text {-}MContA^{2}\right) \right] \): Consider \(G_{r}^{\prime }=\left( G_{r},C_{r}\right) \) with \(G_{r} =\left( 1,\left\{ a,b\right\} ,\#,P,\left\{ arr\left( ba\right) \right\} \right) \) and \(P=\left\{ p_{aa},p_{ab}\right\} \) with \(p_{aa}= \begin{array}{cc} \fbox {a}&a \end{array} \), and \(p_{ab}= \begin{array}{cc} \fbox {a}&b \end{array} \), as well as \(C_{r}=\left\{ p_{aa}\right\} ^{*}\left\{ p_{ab}\right\} \). It is easy to see that \(\left[ L\left( G_{r}^{\prime }\right) \right] = \left[ arr\left( L_{r}\right) \right] \).    \(\square \)
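The effect of the control language \(C_{r}\) in Example 8 can be illustrated by a small Python sketch. It assumes 1-dimensional arrays are dictionaries from integer positions to symbols and encodes \(C_{r}\) by a hand-written two-state automaton; all function names are hypothetical.

```python
def step(c, rule):
    """All 1-dimensional arrays directly derivable from c by rule = (alpha, beta)."""
    alpha, beta = rule
    u0 = next(iter(alpha))
    out = []
    for w in c:
        v = w - u0
        if all(c.get(u + v) == s for u, s in alpha.items()) and \
                all((u + v) not in c for u in beta):
            c2 = dict(c)
            c2.update({u + v: s for u, s in beta.items()})
            out.append(c2)
    return out


def in_control(word):
    """Deterministic automaton for the control language {p_aa}* {p_ab}."""
    state = 0
    for name in word:
        if state == 0 and name == "p_aa":
            state = 0
        elif state == 0 and name == "p_ab":
            state = 1
        else:
            return False
    return state == 1


def derive(axiom, rules, word):
    """String images of all arrays obtained by applying the rules named in word, in order."""
    current = [axiom]
    for name in word:
        current = [c2 for c in current for c2 in step(c, rules[name])]
    return {"".join(c[i] for i in sorted(c)) for c in current}


RULES = {
    "p_aa": ({0: "a"}, {1: "a"}),  # selector a, context a to its right
    "p_ab": ({0: "a"}, {1: "b"}),  # selector a, context b to its right
}
AXIOM = {0: "b", 1: "a"}


def generated(max_n):
    """String images for the control words p_aa^n p_ab with 0 <= n <= max_n."""
    result = set()
    for n in range(max_n + 1):
        word = ["p_aa"] * n + ["p_ab"]
        assert in_control(word)
        result |= derive(AXIOM, RULES, word)
    return result
```

Only complete control words contribute to the language, so the intermediate arrays with string images \(ba^{n}\) are not output.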

Theorem 5

For any \(d\ge 1\) and any \(k\ge 2\), we have:

$$\begin{aligned} \left[ \mathcal {L}\left( d\text {-}ContA^{k}\right) \right] \varsubsetneqq \left[ \mathcal {L}\left( d\text {-}MContA^{k}\right) \right] \varsubsetneqq \left[ \mathcal {L}\left( \left( d\text {-}ContA^{k},REG\right) \right) \right] . \end{aligned}$$

Proof

The inclusions directly follow from Theorem 4. The strictness of the first inclusion follows from Example 6 by taking the non-regular string language \(L_{n}=\left\{ a^{n}ba^{n}\mid n\ge 1\right\} \). Then \(\left[ i_{1,d}\left( arr\left( L_{n}\right) \right) \right] \in \left[ \mathcal {L}\left( d\text {-}MContA^{2}\right) \right] \setminus \left[ \mathcal {L}\left( d\text {-}ContA^{k}\right) \right] .\) The strictness of the second inclusion follows from Example 8 by taking \(\left[ i_{1,d}\left( arr\left( L_{r} \right) \right) \right] \).    \(\square \)

On the other hand, in the 1-dimensional case, the following theorem says that even with the regulating mechanisms of matrix control or regular control languages, with 1-dimensional contextual array grammars over a one-letter alphabet we cannot go beyond regularity, i.e., beyond \(\left[ \mathcal {L}\left( 1\text {-}REGA^{1}\right) \right] \).

Theorem 6

\(\left[ \mathcal {L}\left( 1\text {-}REGA^{1}\right) \right] = \left[ \mathcal {L}\left( \left( 1\text {-}ContA^{1},REG\right) \right) \right] = \left[ \mathcal {L}\left( 1\text {-}MContA^{1}\right) \right] = \left[ \mathcal {L}\left( 1\text {-}ContA^{1}\right) \right] .\)

Proof

(Sketch) According to Theorems 4 and 2, we only have to show that \(\left[ \mathcal {L}\left( 1\text {-}REGA^{1}\right) \right] \supseteq \left[ \mathcal {L}\left( \left( 1\text {-}ContA^{1},REG\right) \right) \right] \). The main ideas of the corresponding technically non-trivial proof can be described as follows:

  • Without loss of generality, right-hand sides of rules have the form \(\fbox {a}^ma^n\).

  • Context information is irrelevant for the unary 1-dimensional case, assuming that the set of axioms collects all arrays of sufficient size.

  • The state information of the regular control is then encoded in the nonterminals of the regular array grammar.    \(\square \)

Allowing for more than one symbol, 1-dimensional contextual array grammars with regular control generate exactly the array images of the linear languages. The proof is based on the following normal form:

Lemma 1

For any 1-dimensional contextual array grammar with regular control \(G_{C}=(G,L)\), where \(G=\left( 1,V,\#,P,A\right) \), \(L\subseteq P^{*}\), we can construct an equivalent 1-dimensional contextual array grammar with regular control \(G_{C}^{\prime }=(G^{\prime },L^{\prime })\) with \(G^{\prime }=\left( 1,V,\#,P^{\prime },A^{\prime }\right) \), \(L^{\prime }\subseteq P^{\prime *}\), such that for \(P^{\prime }\) we have:

  • All rules in \(P^{\prime }\) are of the form \( \begin{array}{cc} \fbox {a}&b \end{array} \) or \( \begin{array}{cc} b&\fbox {a} \end{array} \) for some \(a,b\in V\), i.e., we only have the minimal non-empty size of selectors and minimal contexts of size 1.

  • If there is a rule of the form \( \begin{array}{cc} \fbox {a}&b \end{array} \) / \( \begin{array}{cc} b&\fbox {a} \end{array} \) in \(P^{\prime }\), then also all rules of the form \( \begin{array}{cc} \fbox {c}&b \end{array} \) or \( \begin{array}{cc} b&\fbox {c} \end{array} \), respectively, are in \(P^{\prime }\), for any \(c\in V\), i.e., the content of the selector is irrelevant; only the direction of growth of the array is important.

The rules in this normal form correspond directly to the operations of left and right insertion for strings; together with regular control languages, these operations also characterize the family of linear languages.

Theorem 7

\(\left[ \mathcal {L}\left( 1\text {-}ContA,REG\right) \right] = arr\left( \mathcal {L}\left( LIN\right) \right) \).

Proof

(Sketch) The main ideas of the proof can be described as follows:

  • Adding strings in a controlled way “on both ends” corresponds to applying linear rules, but in reverse order.

  • The information about the finitely many selectors possible can be stored in the nonterminal; on the other hand, the nonterminal can be stored in the state of the finite automaton of the control language.    \(\square \)
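
The first idea can be illustrated on the language \(L_{n}=\left\{ a^{n}ba^{n}\mid n\ge 1\right\} \) used in the proof of Theorem 5. The sketch below is an illustration under assumptions, not the construction of the proof: arrays are modelled as strings, rules in normal form are reduced to left and right insertions of one symbol, and the linear grammar with rules S → aSa and S → aba as well as the control language \(\left( l_{a}r_{a}\right) ^{+}\) with axiom b are our own choices. The linear grammar produces the outermost symbols first, whereas the controlled insertions grow the array from the inside out.

```python
def controlled_insertions(axiom, control_word):
    """Apply left ("L", x) and right ("R", x) insertions in the given order."""
    word = axiom
    for side, symbol in control_word:
        word = symbol + word if side == "L" else word + symbol
    return word


def linear_derivation(n):
    """Derive a^n b a^n with the (assumed) linear rules S -> aSa and S -> aba."""
    form = "S"
    for _ in range(n - 1):
        form = form.replace("S", "aSa")  # outermost symbols are produced first
    return form.replace("S", "aba")


def contextual_derivation(n):
    """Axiom b with control word (l_a r_a)^n: innermost symbols come first."""
    return controlled_insertions("b", [("L", "a"), ("R", "a")] * n)


for n in range(1, 4):
    print(linear_derivation(n), contextual_derivation(n))
```

Both functions return the same word for each n, although the individual symbols are produced in opposite orders.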

For \(d\ge 2\), i.e., in the case of at least two dimensions, the families of array languages generated by contextual array grammars, with or without control mechanisms, are incomparable with the family of regular array languages:

Theorem 8

For any \(d\ge 2\) and any \(k\ge 1\), all three families

\(\left[ \mathcal {L}\left( d\text {-}ContA^{k}\right) \right] \), \(\left[ \mathcal {L}\left( d\text {-}MContA^{k}\right) \right] \), and \(\left[ \mathcal {L}\left( \left( d\text {-}ContA^{k},REG\right) \right) \right] \)

are incomparable with \(\left[ \mathcal {L}\left( d\text {-} REGA^{k}\right) \right] \).

Proof

For the singleton language \(L_{\bot }\) from Example 4, we have \(i_{2,d}\left( L_{\bot }\right) \in \)

\(\left( \left[ \mathcal {L}\left( d\text {-}ContA^{1}\right) \right] \cap \left[ \mathcal {L}\left( d\text {-}MContA^{1}\right) \right] \cap \right. \left. \left[ \mathcal {L}\left( \left( d\text {-}ContA^{1},REG\right) \right) \right] \right) \setminus \left[ \mathcal {L}\left( d\text {-}REGA^{1}\right) \right] . \)

On the other hand, for \(L_{r}\) from Example 7 we have \(i_{2,d}\left( \left[ arr\left( L_{r}\right) \right] \right) \in \)

\(\left[ \mathcal {L}\left( 1\text {-}REGA^{2}\right) \right] \setminus \left( \left[ \mathcal {L}\left( d\text {-}ContA^{1}\right) \right] \cup \right. \left. \left[ \mathcal {L}\left( d\text {-}MContA^{1}\right) \right] \cup \left[ \mathcal {L}\left( \left( d\text {-}ContA^{1},REG\right) \right) \right] \right) .\) Yet even for one-letter alphabets we can find a language of 2-dimensional arrays in \(\left[ \mathcal {L}\left( 2\text {-}REGA^{1}\right) \right] \setminus \left[ \mathcal {L}\left( \left( 2\text {-}ContA^{1},REG\right) \right) \right] \): we consider \(\bigsqcup \)-shaped arrays in which the length of the left vertical line is a multiple of 3 and the length of the right vertical line is a multiple of 5. These arrays can easily be generated by a regular array grammar that first generates the left vertical line from top to bottom, then the horizontal line, and finally the right vertical line upwards. On the other hand, this set of 2-dimensional arrays cannot be generated by a contextual array grammar, even with regular control: as soon as the vertical lines have become long enough, we can no longer distinguish between the left and the right one, so either the lengths will no longer necessarily be multiples of 3 and 5, respectively, or, even worse, the lines might be prolonged below the horizontal line, yielding arrays of the shape of an \({\mathsf {H}}\).    \(\square \)

4 Decidability Questions

As the size of the arrays generated by contextual array grammars (even with any control mechanism) increases with every derivation step, the generated array languages are computable (i.e., recursive).
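
The argument behind this observation can be sketched as an exhaustive search: since every step adds at least one cell, a target array of size n can only be reached by derivations of fewer than n steps, so finitely many candidates have to be checked. The code below is a minimal sketch under strong simplifying assumptions (1-dimensional arrays modelled as gap-free strings, only rules that adjoin one symbol on the right, and the control language given by a deterministic finite automaton); the function `member` and the encoding of the automaton are hypothetical and serve only as an illustration.

```python
def member(target, axioms, rules, delta, start, finals):
    """Decide membership by exploring all derivations that stay within len(target).

    rules:  name -> (selector symbol, symbol adjoined on the right)
    delta:  (state, rule name) -> state, the transitions of the control automaton
    """
    frontier = {(ax, start) for ax in axioms if len(ax) <= len(target)}
    seen = set()
    while frontier:
        word, state = frontier.pop()
        if (word, state) in seen:
            continue
        seen.add((word, state))
        if word == target and state in finals:
            return True
        for name, (selector, context) in rules.items():
            next_state = delta.get((state, name))
            if next_state is None or not word.endswith(selector):
                continue  # rule forbidden by the control or not applicable
            longer = word + context
            if len(longer) <= len(target):  # arrays never shrink again
                frontier.add((longer, next_state))
    return False


# The grammar of Example 8 with an automaton for {p_aa}*{p_ab}.
RULES = {"p_aa": ("a", "a"), "p_ab": ("a", "b")}
DELTA = {(0, "p_aa"): 0, (0, "p_ab"): 1}

print(member("baab", ["ba"], RULES, DELTA, 0, {1}))  # True
print(member("baa", ["ba"], RULES, DELTA, 0, {1}))   # False
```

The search space is finite because only pairs of an array no larger than the target and a state of the automaton are ever stored.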

As an immediate consequence of Theorem 7, we obtain:

Corollary 1

Emptiness is decidable for \(\mathcal {L}\left( 1\text {-}ContA,REG\right) \).

Yet for higher dimensions, we obtain a completely different situation:

Theorem 9

Emptiness is not decidable for \(\mathcal {L}\left( d\text {-}ContA^{k},REG\right) \) for \(d\ge 2\), even for \(k=1\).

Proof

(Sketch) As, for example, described in [5], the derivation carpet of a Turing machine can be described using 2-dimensional contextual array productions in the t-mode of derivation, i.e., a derivation only stops if no rule can be applied any more. The goal of only halting with specific conditions being fulfilled can also be obtained using suitable regular control languages, as we can require specific final rules to be applied. Hence, we will obtain a non-empty array language if and only if there is a derivation simulating the acceptance of a string by the given Turing machine. The proof given in [5] does not bound the number of symbols used. Yet m symbols can be encoded by \(2 \times m\) rectangles with the k-th of these m symbols being encoded by leaving the k-th position in the second vertical line free, which then can be checked by the selector in the contextual array productions. Hence, simulating successful computations of the given Turing machine will result in the generation of \(2k\times mn\) rectangles for accepting computations.    \(\square \)

5 Picture Generation

Another interesting topic is to consider the generation of geometric objects such as solid rectangles and squares, which has been used to exhibit the generative power of various array grammar variants. Both of them, i.e., the 2-dimensional array language \(L_{rect}\) of all solid rectangles of size \(m\times n,\) \(m,n\ge 2,\) made of a single symbol a and the 2-dimensional array language \(L_{square}\) of all solid squares of side length n, \(n\ge 2,\) made of a single symbol a are well-known to be in \(\left[ \mathcal {L}\left( 2\text {-}REGA^{1}\right) \right] \), see [23], but as we are able to show they can also be generated by 2-dimensional contextual array grammars with regular control, i.e., \(\left\{ L_{rect},L_{square}\right\} \subset \left[ \mathcal {L}\left( \left( 2\text {-}ContA^{1},REG\right) \right) \right] \). We now only exhibit the contextual array grammar with regular control for the squares.

Example 9

\(L_{square}\) is generated by the 2-dimensional contextual array grammar with regular control \(G_{squareRC}=\left( G_{square},C_{square}\right) \) with \(G_{square}=\left( 2,\{a\},\#,P_{square},A_{square}\right) ,\) where \(A_{square}\) consists of the \(2\times 2\) and \(3\times 3\) squares,

$$\begin{aligned} P_{square}= & {} \left\{ s_{ul},s_{dr},s_{ur},s_{dl},r_{ul},r_{uu},r_{dr},r_{dd}\right\} , \text { and} \\ C_{square}= & {} (\left\{ s_{ul}s_{dr}\right\} \left\{ r_{ul}r_{dr}\right\} ^{*}\left\{ r_{uu}r_{dd}\right\} ^{*}\left\{ s_{ur}s_{dl}\right\} )^{+}. \end{aligned}$$

The rules are listed in the following:

[Figure b: the eight contextual array rules of \(P_{square}\)]

How to derive a \(4\times 4\) square is shown below:

$$ \begin{array}{cc} a &{} a \\ a &{} a \end{array}\Rightarrow _{s_{ul}} \begin{array}{ccc} a &{} a &{} \\ a &{} a &{} a \\ &{} a &{} a \\ \end{array}\Rightarrow _{s_{dr}} \begin{array}{cccc} a &{} a &{} &{} \\ a &{} a &{} a &{} \\ &{} a &{} a &{} a \\ &{} &{} a &{} a \end{array}\Rightarrow _{s_{ur}} \begin{array}{cccc} a &{} a &{} a &{} a \\ a &{} a &{} a &{} a \\ &{} a &{} a &{} a \\ &{} &{} a &{} a \end{array}\Rightarrow _{s_{dl}} \begin{array}{cccc} a &{} a &{} a &{} a \\ a &{} a &{} a &{} a \\ a &{} a &{} a &{} a \\ a &{} a &{} a &{} a \end{array} $$

Notice that the rules \(s_{ur}\) and \(s_{dl}\) check whether a complete new border layer has actually been generated, so they act as “keystones” as used in architecture; this check in effect replaces the t-mode of derivation, e.g., see [6].    \(\square \)
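
Whether a given sequence of rule applications is admitted by \(C_{square}\) can be checked mechanically. The following sketch encodes each rule name by a single letter (this encoding and the helper `in_control_language` are assumptions made for illustration only) and translates \(C_{square}\) into a regular expression; the rules themselves are not simulated.

```python
import re

# Hypothetical one-letter codes for the eight rule names of P_square.
CODE = {"s_ul": "A", "s_dr": "B", "s_ur": "C", "s_dl": "D",
        "r_ul": "E", "r_dr": "F", "r_uu": "G", "r_dd": "H"}

# C_square = ({s_ul s_dr}{r_ul r_dr}*{r_uu r_dd}*{s_ur s_dl})+
C_SQUARE = re.compile(r"(?:AB(?:EF)*(?:GH)*CD)+")


def in_control_language(rule_names):
    """Return True if the sequence of rule names is a word of C_square."""
    word = "".join(CODE[name] for name in rule_names)
    return C_SQUARE.fullmatch(word) is not None


# The control word of the 4x4 derivation shown in Example 9.
print(in_control_language(["s_ul", "s_dr", "s_ur", "s_dl"]))  # True
# Closing the layer before both corners are started is not allowed.
print(in_control_language(["s_ul", "s_ur", "s_dr", "s_dl"]))  # False
```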

As with the t-mode of derivation, e.g., see [6], only eight contextual array rules were needed in Example 9 to generate the squares. This shows that the ability of contextual array grammars to insert new parts at different positions in the current array allows for a significantly smaller number of rules when using specific control mechanisms such as the t-mode of derivation or regular control languages, in comparison with the construction of an extended regular array grammar as described in [23], where the construction has to be carried out along a Hamiltonian path. The inserted pieces used in [23] could in fact also be used as arrays inserted by a contextual array grammar with regular control, yet even for the subset of squares of side lengths \(5k+16, \, k \ge 0\), as exhibited in [23], 27 rules (arrays) were used. As these are in fact a kind of macro-rules, a complete list of regular array rules based on [23] would correspond to about one thousand rules. This example shows that contextual array grammars may allow for a succinct description of specific picture languages with rather small descriptional complexity.