1 Introduction

In recent years, the Semantic Web area has developed rapidly and attracted much attention. A central idea of the Semantic Web is that ontologies are a proper bridge between users and search engines, ensuring more accurate search results. Therefore, the Web Ontology Language (OWL), built on top of XML and RDF, serves as an important tool for specifying ontologies and reasoning about them. Together with rule languages, it serves as a main knowledge representation formalism for the Semantic Web.

The main semantic and logical foundation of OWL is formed by description logics (DLs). Such logics represent the domain of interest in terms of concepts, individuals, and roles. A concept is interpreted as a set of individuals, while a role is interpreted as a binary relation between individuals. A knowledge base in a DL consists of an RBox of role axioms, a TBox of terminological axioms and an ABox of facts about individuals.

OWL 2, the second version of OWL, recommended by the W3C in 2009, is based on the DL \(\mathcal {SROIQ}\). This logic is highly expressive but has intractable combined complexity (N2ExpTime-complete) and data complexity (NP-hard) for basic reasoning problems. Thus, the W3C also recommended the profiles OWL 2 EL, OWL 2 QL and OWL 2 RL, which are restricted sublanguages of OWL 2 Full and enjoy PTime data complexity. These profiles are based on the families of description logics \(\mathcal {EL}\) [3, 4], DL-Lite [5] and Description Logic Programs (DLP) [13], respectively. There are also more sophisticated fragments of DLs with PTime data complexity: Horn-\(\mathcal {SHIQ}\) [15], Horn-\(\mathcal {SROIQ}\) [21] and Horn-DL [20].

Rule languages provide very useful knowledge representation formalisms applicable to the Semantic Web. Some fragments of DLs like DLP [13] can be translated into rule languages. But most importantly, rule languages can be combined with DLs to develop more expressive formalisms. An early attempt to achieve such a combination was SWRL [14], a rule language using only concept names, role names and the equality predicate. However, without restrictions its combination with OWL DL is undecidable.

A knowledge base in other combined languages is usually specified as a pair \(\langle \mathcal {O},\mathcal {P}\rangle \), where \(\mathcal {O}\) is an ontology in some DL and \(\mathcal {P}\) is a set of rules, e.g., specified in Datalog or its suitable extension, which can use concept names and role names. Interaction between \(\mathcal {O}\) and \(\mathcal {P}\) is either one-way (\(\mathcal {O}\) affects \(\mathcal {P}\)) or two-way (where \(\mathcal {P}\) may also affect \(\mathcal {O}\)). The approach of defining a knowledge base as a pair \(\langle \mathcal {O},\mathcal {P}\rangle \) is adopted in a considerable number of works, including [8] (on \(\mathcal {AL}\)-log), [17] (on CARIN), [19] (on DL-safe rules), [24] (on \(\mathcal {DL}\)+log), [16, 18] (on hybrid MKNF), [9] (on hybrid programs), [23] (on OntoDLV), [10] (on dl-programs). In these works, if negation is allowed in \(\mathcal {P}\) then \(\mathcal {P}\) and its interaction with \(\mathcal {O}\) are interpreted using some nonmonotonic semantics (e.g., the stable model semantics, the MKNF semantics or the well-founded semantics). However, \(\mathcal {O}\) is always interpreted using the usual (monotonic) semantics.

In the current paper we treat such a pair \(\langle \mathcal {O},\mathcal {P}\rangle \) just as a layer and study the case when \(\mathcal {O}\) can be translated to an eDatalog\(^\lnot \) program and \(\mathcal {P}\) is an eDatalog\(^\lnot \) program. eDatalog\(^\lnot \) extends Datalog\(^\lnot \) by allowing two basic types (for individuals and data constants), external checkable predicates and the equality predicate (between individuals). Concept names and role names are allowed both in heads and bodies of program clauses. Our approach is novel in the following aspects:

  • Negation in \(\mathcal {O}\) is interpreted using a nonmonotonic semantics (the well-founded semantics, the stable model semantics, or the standard semantics for stratified knowledge bases); this differs from all the above-mentioned works [8–10, 16–19, 23, 24].

  • We combine \(\mathcal {O}\) and \(\mathcal {P}\) into one set (called a layer, which is divided into a TBox consisting of concept inclusion axioms/program clauses and an ABox consisting of facts). This allows for a tighter integration between DLs and rules. It may seem similar to the approach of SWRL, but we also allow ordinary predicates, use a nonmonotonic semantics for negation, and design the language appropriately to get decidability and PTime data complexity (w.r.t. the well-founded semantics, and the standard semantics for stratified knowledge bases).

  • To reflect modularity of ontologies (e.g., the import feature of ontologies), we define a knowledge base to be a hierarchy of layers (a tree or a rooted directed acyclic graph of layers). Each layer in turn may be stratifiable and divided further into strata. The granulation is not substantial for the well-founded semantics, as the whole knowledge base will be flattened to a set of program clauses and facts. However, it is substantial for the stable model semantics (see Example 8). Furthermore, when each layer of the considered knowledge base is stratifiable and the standard semantics is used for it, layers not only emphasize modularity but also affect the semantics (flattening the knowledge base may result in an unstratifiable layer).

The Web ontology rule language we define in this paper, WORL, combines a variant of OWL 2 RL with eDatalog\(^\lnot \). Similarly to our previous work on OWL 2 eRL\(^+\) [6], we:

  • disallow those features of OWL 2 RL that play the role of constraints (i.e., the ones that are translated to negative clauses of the form \(\varphi \rightarrow \bot \));

  • allow unary external checkable predicates;

  • allow additional features like negation and the constructor \(\ge \!n\,R.C\) to occur at the left-hand side of \(\sqsubseteq \) in concept inclusion axioms.

Some restrictions are adopted for the additional features to guarantee a translation of WORL programs into eDatalog\(^\lnot \). We also define the rule language SWORL (stratified WORL) and develop the well-founded semantics and the stable model semantics for WORL as well as the standard semantics for SWORL via translation into eDatalog\(^\lnot \). Both WORL with respect to the well-founded semantics and SWORL with respect to the standard semantics have PTime data complexity.

This paper is a revised and extended version of our conference paper [7]. Compared to [7], in the current paper we additionally provide the stable model semantics for WORL, a direct method for checking stratifiability of TBoxes, all the proofs and a number of illustrative examples. The three semantics for eDatalog\(^\lnot \) which we consider are now presented in a uniform manner.

The rest of this paper is structured as follows. In Sect. 2 we introduce eDatalog\(^\lnot \), stratified eDatalog\(^\lnot \), and their semantics. In Sect. 3 we present WORL, a translation of WORL into eDatalog\(^\lnot \), and its well-founded semantics and stable model semantics. Section 4 is devoted to SWORL and its standard semantics. Section 5 concludes this work. In the Appendix, we present a direct method for checking stratifiability of TBoxes.

2 Preliminaries

We denote the set of concept names by \(\mathrm {CNames}\), and the set of role names by \(\mathrm {RNames}\).

From the point of view of OWL, there are two basic types: individual (i.e. object) and literal [22] (i.e. data constant). We denote the individual type by \( IType \), and the literal type by \( LType \). Thus, a concept name is a unary predicate of type \(P( IType )\), a data type is a unary predicate of type \(P( LType )\), an object role name is a binary predicate of type \(P( IType \times IType )\), and a data role name is a binary predicate of type \(P( IType \times LType )\). For simplicity, we do not provide specific data types like integer, real or string. Apart from concept names and role names, we will also use a set \(\mathrm {OPreds}\) of ordinary predicates (including data types) and a set \(\mathrm {ECPreds}\) of external checkable predicates. We assume that the sets \(\mathrm {CNames}\), \(\mathrm {RNames}\), \(\mathrm {OPreds}\) and \(\mathrm {ECPreds}\) are finite and pairwise disjoint. By a set of defined predicates we mean:

$$\begin{aligned} \mathrm {DPreds}= \mathrm {CNames}\cup \mathrm {RNames}\cup \mathrm {OPreds}. \end{aligned}$$

With each \(k\)-ary predicate from \(\mathrm {OPreds}\) we associate its type \(P(T_1 \times \cdots \times T_k)\), where each \(T_i\) is either \( IType \) or \( LType \). A \(k\)-ary predicate from \(\mathrm {ECPreds}\) has the type \(P( LType ^k)\). We assume that each predicate from \(\mathrm {ECPreds}\) has a fixed meaning which is checkable in the following sense:

if \(p\) is a \(k\)-ary predicate from \(\mathrm {ECPreds}\) and \(d_1,\ldots ,d_k\) are constants of \( LType \), then the truth value of \(p(d_1,\ldots , d_k)\) is fixed and computable in polynomial time (in the number of bits used for \(d_1,\ldots ,d_k\)).

For example, one may want to use the binary predicates \(>\), \(\ge \), \(<\), \(\le \) on real numbers with the usual semantics.
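The checkability requirement can be pictured with a small Python sketch. The registry `EC_PREDS` and the function `check` are hypothetical names introduced only for this illustration, and data constants are assumed to be plain Python numbers; the point is merely that each external predicate has a fixed meaning that can be evaluated cheaply on ground arguments.

```python
from typing import Callable, Dict, Tuple

# Hypothetical registry of external checkable predicates: name -> (arity, test).
# Every test is a fixed function on data constants (here: Python numbers).
EC_PREDS: Dict[str, Tuple[int, Callable[..., bool]]] = {
    "<": (2, lambda x, y: x < y),
    "<=": (2, lambda x, y: x <= y),
    ">": (2, lambda x, y: x > y),
    ">=": (2, lambda x, y: x >= y),
}


def check(pred: str, *args) -> bool:
    """Return the fixed truth value of the ground atom pred(args)."""
    arity, test = EC_PREDS[pred]
    if len(args) != arity:
        raise ValueError(f"{pred} expects {arity} arguments, got {len(args)}")
    return test(*args)
```

With this registry, `check("<", 100, 120)` evaluates the comparison used later in Example 1.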

We assume there is only one equality predicate ‘\(=\)’, which belongs to \(\mathrm {OPreds}\) and has the type \(P( IType \times IType )\). For data constants, we assume the Unique Names Assumption instead.

A term is either an individual (of type \( IType \)) or a literal (of type \( LType \)) or a variable (of type \( IType \) or \( LType \)). If \(p\) is a predicate of type \(P(T_1 \times \cdots \times T_k)\), and for \(1 \le i \le k\), \(t_i\) is a term of type \(T_i\), then \(p(t_1,\ldots ,t_k)\) is an atomic formula (also called an atom). An atom is ground if it contains no variables.

An interpretation \(\mathcal {I}= \langle \Delta _o^\mathcal {I}, \Delta _d^\mathcal {I}, \cdot ^\mathcal {I}\rangle \) consists of a non-empty set \(\Delta _o^\mathcal {I}\) called the object domain of \(\mathcal {I}\), a non-empty set \(\Delta _d^\mathcal {I}\) disjoint from \(\Delta _o^\mathcal {I}\) called the data domain of \(\mathcal {I}\), and a function \(\cdot ^\mathcal {I}\) which maps:

  • every individual \(a\) to an element \(a^\mathcal {I}\in \Delta _o^\mathcal {I}\),

  • every literal \(d\) to a unique\(^{1}\) element \(d^\mathcal {I}\in \Delta _d^\mathcal {I}\),

  • every concept name \(A\) to a subset \(A^\mathcal {I}\) of \(\Delta _o^\mathcal {I}\),

  • every data type \( DT \) to a subset \( DT ^\mathcal {I}\) of \(\Delta _d^\mathcal {I}\),

  • every predicate of type \(P(T_1 \times \cdots \times T_k)\) in \(\mathrm {DPreds}\) different from ‘\(=\)’ to a subset of \(\Delta _1 \times \cdots \times \Delta _k\), where \(\Delta _i = \Delta _o^\mathcal {I}\) if \(T_i = IType \), and \(\Delta _i = \Delta _d^\mathcal {I}\) if \(T_i = LType \),

  • predicate ‘\(=\)’ to a congruence of \(\mathcal {I}\).\(^{2}\)

A Herbrand interpretation is a set of ground atoms of predicates from \(\mathrm {DPreds}\). An ABox is a finite Herbrand interpretation.

The size of a ground atom is the number of bits used for its representation. The size of an ABox is the sum of the sizes of its atoms.

By \( EqAxioms \) we denote the following set of axioms:

$$\begin{aligned} \begin{array}{l} x = x \\ x = y \rightarrow y = x \\ x = y \wedge y = z \rightarrow x = z \\ x_i = x'_i \wedge p(x_1,\ldots ,x_i,\ldots ,x_k) \rightarrow p(x_1,\ldots ,x'_i,\ldots ,x_k), \end{array} \end{aligned}$$

where \(p\) is any \(k\)-ary predicate of \(\mathrm {DPreds}\) different from ‘\(=\)’ and \(i\) is any natural number between 1 and \(k\) such that the \(i\)th argument of \(p\) is of type \( IType \).

A Herbrand interpretation \(\mathcal {H}\) is closed w.r.t. \( EqAxioms \) if for every ground instance \(\varphi _1 \wedge \cdots \wedge \varphi _k \rightarrow \psi \) (with \(k \ge 0\)) of an axiom in \( EqAxioms \) using the individuals and data constants occurring in \(\mathcal {H}\), if \(\{\varphi _1, \ldots , \varphi _k\} \subseteq \mathcal {H}\) then \(\psi \in \mathcal {H}\).
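As a rough illustration of this closure condition, the following Python sketch saturates a finite set of ground atoms under \( EqAxioms \). The representation is an assumption of the sketch, not part of the formalism: atoms are tuples `(pred, arg1, ..., argk)`, equality is the predicate named `"="`, and `signature` maps each predicate to a tuple of `"I"`/`"L"` markers for its argument types.

```python
def eq_closure(atoms, signature):
    """Close a finite set of ground atoms under reflexivity, symmetry,
    transitivity and replacement of equals in IType argument positions."""
    closed = set(atoms)
    # Individuals are the constants occurring in some IType position.
    individuals = {a[i + 1] for a in closed
                   for i, t in enumerate(signature[a[0]]) if t == "I"}
    closed |= {("=", a, a) for a in individuals}            # x = x
    changed = True
    while changed:
        changed = False
        eqs = [(a[1], a[2]) for a in closed if a[0] == "="]
        new = set()
        for x, y in eqs:
            new.add(("=", y, x))                             # symmetry
            for y2, z in eqs:
                if y == y2:
                    new.add(("=", x, z))                     # transitivity
            for atom in closed:
                if atom[0] == "=":
                    continue
                for i, t in enumerate(signature[atom[0]]):
                    if t == "I" and atom[i + 1] == x:        # replace x by y
                        new.add(atom[:i + 1] + (y,) + atom[i + 2:])
        if not new <= closed:
            closed |= new
            changed = True
    return closed


signature = {"=": ("I", "I"), "acceptable": ("I",), "hasPrice": ("I", "L")}
H = {("=", "a", "b"), ("acceptable", "a"), ("hasPrice", "a", 100)}
closed_H = eq_closure(H, signature)
```

A set is closed w.r.t. the axioms exactly when applying `eq_closure` to it adds nothing, which is how the sketch can also serve as a closure test.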

Given a Herbrand interpretation \(\mathcal {H}\) that is closed w.r.t. \( EqAxioms \), let \(\mathcal {I}\) be the interpretation specified as follows:

  • \(\Delta _o^\mathcal {I}\) is the set of all individuals occurring in \(\mathcal {H}\),

  • \(\Delta _d^\mathcal {I}\) is the set of all data constants occurring in \(\mathcal {H}\),

  • for every \(k\)-ary predicate \(p \in \mathrm {DPreds}\),

    $$\begin{aligned} p^\mathcal {I}= \{\langle t_1,\ldots ,t_k\rangle \mid p(t_1,\ldots ,t_k) \in \mathcal {H}\}. \end{aligned}$$

Observe that \(=^\mathcal {I}\) is a congruence of \(\mathcal {I}\). We call the quotient \(\mathcal {I}/_=\) of \(\mathcal {I}\) by the congruence \(=^\mathcal {I}\) the traditional interpretation corresponding to \(\mathcal {H}\).
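The quotient construction can be sketched in Python as follows. This is an illustrative reading under assumptions: atoms are tuples `(pred, arg1, ..., argk)`, the input set is already closed w.r.t. \( EqAxioms \), `signature` gives `"I"`/`"L"` argument types, and the function name `traditional_interpretation` is hypothetical. Individuals are replaced by their equivalence classes; data constants are left untouched.

```python
def traditional_interpretation(closed, signature):
    """Quotient a closed Herbrand interpretation by its equality relation."""
    eq = {(a[1], a[2]) for a in closed if a[0] == "="}
    individuals = {x for x, _ in eq}       # reflexivity lists every individual
    cls = {x: frozenset(y for y in individuals if (x, y) in eq)
           for x in individuals}
    domain = set(cls.values())             # object domain of the quotient
    relations = {}
    for atom in closed:
        pred, args = atom[0], atom[1:]
        if pred == "=":
            continue
        image = tuple(cls[a] if t == "I" else a
                      for a, t in zip(args, signature[pred]))
        relations.setdefault(pred, set()).add(image)
    return domain, relations


signature = {"=": ("I", "I"), "acceptable": ("I",), "hasPrice": ("I", "L")}
closed_H = {
    ("=", "a", "a"), ("=", "b", "b"), ("=", "a", "b"), ("=", "b", "a"),
    ("acceptable", "a"), ("acceptable", "b"),
    ("hasPrice", "a", 100), ("hasPrice", "b", 100),
}
domain, relations = traditional_interpretation(closed_H, signature)
```

In this toy input the two names `a` and `b` collapse into a single element of the object domain.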

2.1 The rule language eDatalog\(^\lnot \)

In [6], we defined eDatalog as an extension of Datalog with the equality predicate, external checkable predicates, and a relaxed range-restrictedness condition. In this subsection, we define the rule language eDatalog\(^\lnot \) similarly as an extension of Datalog\(^\lnot \), but using the full range-restrictedness condition.

An eDatalog\(^\lnot \) program clause is a formula of the form

$$\begin{aligned}&(\varphi _1 \wedge \cdots \wedge \varphi _h \wedge \lnot \psi _1 \wedge \cdots \wedge \lnot \psi _k \\&\quad {}\wedge \xi _1 \wedge \cdots \wedge \xi _l \wedge \lnot \zeta _1\wedge \cdots \wedge \lnot \zeta _m) \rightarrow \alpha \end{aligned} \tag{1}$$

where \(h,k,l,m \ge 0, \varphi _1, \ldots , \varphi _h, \psi _1, \ldots , \psi _k, \alpha \) are atoms of predicates from \(\mathrm {DPreds}\), and \(\xi _1, \ldots , \xi _l, \zeta _1, \ldots , \zeta _m\) are atoms of predicates from \(\mathrm {ECPreds}\), with the property that every variable occurring in \(\alpha \) or some \(\psi _i\), \(\xi _i\) or \(\zeta _i\) occurs also in some atom \(\varphi _j\) (this is the range-restrictedness condition).

The atom \(\alpha \) in (1) is called the head of the program clause. If \(p\) is the predicate of \(\alpha \) then the clause is called a program clause defining \(p\). The formula at the left-hand side of \(\rightarrow \) in (1) is called the body of the program clause.

An eDatalog\(^\lnot \) program is a finite set of eDatalog\(^\lnot \) program clauses. An eDatalog\(^\lnot \) knowledge base is a pair \(\langle \mathcal {P},\mathcal {A}\rangle \) consisting of an eDatalog\(^\lnot \) program \(\mathcal {P}\) and an ABox \(\mathcal {A}\). A query is defined to be a formula that can be the body of an eDatalog\(^\lnot \) program clause.
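The range-restrictedness condition on program clauses lends itself to a mechanical check. The Python sketch below assumes a representation that is not part of the paper: atoms are tuples `(pred, arg1, ..., argk)` and a term counts as a variable when it is a string starting with an uppercase letter. The names `is_var`, `variables` and `is_range_restricted` are hypothetical.

```python
def is_var(term):
    """Convention of this sketch: variables are capitalised strings."""
    return isinstance(term, str) and term[:1].isupper()


def variables(atom):
    return {t for t in atom[1:] if is_var(t)}


def is_range_restricted(head, pos, neg, ec_pos, ec_neg):
    """Every variable of the head, of a negated atom or of an external atom
    must also occur in some positive atom of a defined predicate."""
    bound = set().union(*(variables(a) for a in pos))
    others = [head, *neg, *ec_pos, *ec_neg]
    return all(variables(a) <= bound for a in others)


# A clause comparing the prices of two acceptable items.
ok = is_range_restricted(
    ("excluded", "X'"),
    [("acceptable", "X"), ("hasPrice", "X", "Y"),
     ("acceptable", "X'"), ("hasPrice", "X'", "Y'")],
    [], [("<", "Y", "Y'")], [])

# p(X) with body "not q(X)" alone leaves X unbound.
bad = is_range_restricted(("p", "X"), [], [("q", "X")], [], [])
```

The second call shows the typical violation: a variable that occurs only under negation.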

Example 1

Let \(\mathcal {P}\) be the following eDatalog\(^\lnot \) program:

$$\begin{aligned} \begin{array}{l} [\textit{acceptable}(X) \wedge \textit{hasPrice}(X,Y)\\ \;\,\wedge \ \textit{acceptable}(X') \wedge \textit{hasPrice}(X',Y') \wedge Y < Y']\\ \quad \rightarrow \textit{excluded}(X') \\ \textit{acceptable}(X) \wedge \lnot \textit{excluded}(X) \rightarrow \textit{preferable}(X) \end{array} \end{aligned}$$

and let \(\mathcal {A}= \{\textit{acceptable}(a), \textit{acceptable}(b), \textit{hasPrice}(a,100), \textit{hasPrice}(b,120)\}\). Then \( KB = \langle \mathcal {P},\mathcal {A}\rangle \) is an eDatalog\(^\lnot \) knowledge base. Here, ‘\(<\)’ is an external checkable predicate with the usual semantics; \(X\) and \(X'\) are variables of type \( IType \); \(Y\) and \(Y'\) are variables of type \( LType \); \(a\) and \(b\) are objects (of type \( IType \)); \(100\) and \(120\) are data constants (of type \( LType \)).

2.2 Stratified eDatalog\(^\lnot \)

A stratification of an eDatalog\(^\lnot \) program \(\mathcal {P}\) is a sequence of eDatalog\(^\lnot \) programs \(\mathcal {P}_1, \ldots , \mathcal {P}_n\) such that:

  • \(\{\mathcal {P}_1,\ldots ,\mathcal {P}_n\}\) is a partition of \(\mathcal {P}\cup EqAxioms \),

  • for some mapping \(f : \mathrm {DPreds}\rightarrow \{1,\ldots ,n\}\), every predicate \(p \in \mathrm {DPreds}\) satisfies the following conditions:

    • the program clauses in \(\mathcal {P}\cup EqAxioms \) defining \(p\) are in \(\mathcal {P}_{f(p)}\),

    • if \(\mathcal {P}\cup EqAxioms \) contains a program clause defining \(p\) in the form

      $$\begin{aligned}&(\varphi _1 \wedge \cdots \wedge \varphi _h \wedge \lnot \psi _1 \wedge \cdots \wedge \lnot \psi _k \wedge \xi _1 \wedge \cdots \wedge \xi _l\\&\quad \wedge \ \lnot \zeta _1 \wedge \cdots \wedge \lnot \zeta _m) \rightarrow \alpha \end{aligned}$$

      then for every \(1 \le i \le h\) and \(1 \le j \le k\,\):

      • if \(p'_i\) is the predicate of \(\varphi _i\) then \(f(p'_i) \le f(p)\),

      • if \(p''_j\) is the predicate of \(\psi _j\) then \(f(p''_j) < f(p)\).

Given a stratification \(\mathcal {P}_1, \ldots , \mathcal {P}_n\) of \(\mathcal {P}\), each \(\mathcal {P}_i\) is called a stratum of the stratification, and \(f\) is called the stratification mapping. Let us emphasize that \(f({=}) \le f(p)\) for all \(p \in \mathrm {DPreds}\).
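The two conditions on the stratification mapping can be checked directly once clauses are abstracted to the predicates they mention. The following Python sketch makes that abstraction an explicit assumption: each clause is a triple `(head_pred, positive_body_preds, negated_body_preds)` over defined predicates only, and the function name `is_stratification_mapping` is hypothetical. The sample data is a predicate-level view of the program of Example 1 together with the equality axioms for its predicates.

```python
def is_stratification_mapping(clauses, f):
    """Check f(p') <= f(p) for positive and f(p'') < f(p) for negated
    body predicates of every clause defining p."""
    for head, pos, neg in clauses:
        if any(f[p] > f[head] for p in pos):
            return False
        if any(f[q] >= f[head] for q in neg):
            return False
    return True


# Predicate-level view of the two clauses of Example 1.
example1 = [
    ("excluded", ["acceptable", "hasPrice", "acceptable", "hasPrice"], []),
    ("preferable", ["acceptable"], ["excluded"]),
]
# Predicate-level view of EqAxioms for the predicates involved.
eq_axioms = [("=", [], []), ("=", ["="], []), ("=", ["=", "="], [])] + [
    (p, ["=", p], [])
    for p in ("acceptable", "hasPrice", "excluded", "preferable")
]
two_strata = {"=": 1, "acceptable": 1, "hasPrice": 1,
              "excluded": 1, "preferable": 2}
one_stratum = {p: 1 for p in two_strata}
```

Placing everything in one stratum fails because `preferable` depends negatively on `excluded`, so the two predicates cannot share a stratum.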

An eDatalog\(^\lnot \) program \(\mathcal {P}\) is called a stratified eDatalog\(^\lnot \) program if it has a stratification. It is called a semipositive eDatalog\(^\lnot \) program if it has a stratification with only one stratum.\(^{3}\)

A pair \(\langle \mathcal {P},\mathcal {A}\rangle \) is called a stratified eDatalog\(^\lnot \) knowledge base if it is an eDatalog\(^\lnot \) knowledge base with \(\mathcal {P}\) being a stratified eDatalog\(^\lnot \) program.

Example 2

The program \(\mathcal {P}\) given in Example 1 is a stratified eDatalog\(^\lnot \) program with two strata. Each program clause of \(\mathcal {P}\) forms a stratum.

2.3 Semantics of eDatalog\(^\lnot \)

Let \(\langle \mathcal {P},\mathcal {A}\rangle \) be an eDatalog\(^\lnot \) knowledge base. By \(\mathcal {P}^{ gr }_\mathcal {A}\) we denote the set of all ground instances of the program clauses of \(\mathcal {P}\cup EqAxioms \) that use only individuals and data constants occurring in \(\mathcal {P}\) or \(\mathcal {A}\).

By \(\mathcal {P}_\mathcal {A}\) we denote the set of all clauses

$$\begin{aligned} (\varphi _1 \wedge \cdots \wedge \varphi _h \wedge \lnot \psi _1 \wedge \cdots \wedge \lnot \psi _k) \rightarrow \alpha \end{aligned}$$

such that \(\mathcal {P}^{ gr }_\mathcal {A}\) contains a program clause

$$\begin{aligned}&(\varphi _1 \wedge \cdots \wedge \varphi _h \wedge \lnot \psi _1 \wedge \cdots \wedge \lnot \psi _k \\&\quad \wedge \ \xi _1 \wedge \cdots \wedge \xi _l \wedge \lnot \zeta _1\wedge \cdots \wedge \lnot \zeta _m) \rightarrow \alpha \end{aligned} \tag{2}$$

where all \(\xi _1, \ldots , \xi _l\) are true and all \(\zeta _1, \ldots , \zeta _m\) are false (by the fixed meaning of external checkable predicates).

Example 3

Consider the eDatalog\(^\lnot \) knowledge base \(\langle \mathcal {P},\mathcal {A}\rangle \) given in Example 1. Then \(\mathcal {P}_\mathcal {A}\) consists of a number of ground instances of clauses of \( EqAxioms \) and the following clauses:

$$\begin{aligned} \begin{array}{l} {}[\textit{acceptable}(a) \wedge \textit{hasPrice}(a,100)\, \wedge \\ \;\,\textit{acceptable}(a) \wedge \textit{hasPrice}(a,120)] \rightarrow \textit{excluded}(a)\\ {}[\textit{acceptable}(a) \wedge \textit{hasPrice}(a,100)\, \wedge \\ \;\,\textit{acceptable}(b) \wedge \textit{hasPrice}(b,120)] \rightarrow \textit{excluded}(b)\\ {}[\textit{acceptable}(b) \wedge \textit{hasPrice}(b,100)\, \wedge \\ \;\,\textit{acceptable}(a) \wedge \textit{hasPrice}(a,120)] \rightarrow \textit{excluded}(a)\\ {}[\textit{acceptable}(b) \wedge \textit{hasPrice}(b,100)\, \wedge \\ \;\,\textit{acceptable}(b) \wedge \textit{hasPrice}(b,120)] \rightarrow \textit{excluded}(b)\\ \textit{acceptable}(a) \wedge \lnot \textit{excluded}(a) \rightarrow \textit{preferable}(a)\\ \textit{acceptable}(b) \wedge \lnot \textit{excluded}(b) \rightarrow \textit{preferable}(b). \end{array} \end{aligned}$$

Note that the predicate ‘\(<\)’ no longer occurs in \(\mathcal {P}_\mathcal {A}\).
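The construction of \(\mathcal {P}_\mathcal {A}\) for this example can be replayed with a small grounder. The Python sketch below is only an illustration under assumptions: clauses are tuples `(head, pos, neg, ec_pos, ec_neg)` of atoms, `var_types` assigns `"I"` or `"L"` to each variable, and the function name `ground` is hypothetical. Instances of \( EqAxioms \) are left out for brevity.

```python
from itertools import product


def ground(clause, var_types, individuals, literals, ec_preds):
    """Ground one clause, evaluate its external atoms and drop them.
    Returns (head, positive_body, negated_body) triples."""
    head, pos, neg, ec_pos, ec_neg = clause
    names = sorted(var_types)
    domains = [individuals if var_types[v] == "I" else literals for v in names]
    result = []
    for values in product(*domains):
        theta = dict(zip(names, values))

        def inst(atom):
            return (atom[0],) + tuple(theta.get(t, t) for t in atom[1:])

        def holds(atom):
            g = inst(atom)
            return ec_preds[g[0]](*g[1:])

        # Keep the instance only if all xi are true and all zeta are false.
        if all(holds(a) for a in ec_pos) and not any(holds(a) for a in ec_neg):
            result.append((inst(head), [inst(a) for a in pos],
                           [inst(a) for a in neg]))
    return result


ec = {"<": lambda x, y: x < y}
rule1 = (("excluded", "X'"),
         [("acceptable", "X"), ("hasPrice", "X", "Y"),
          ("acceptable", "X'"), ("hasPrice", "X'", "Y'")],
         [], [("<", "Y", "Y'")], [])
rule2 = (("preferable", "X"), [("acceptable", "X")], [("excluded", "X")], [], [])
P_A = (ground(rule1, {"X": "I", "X'": "I", "Y": "L", "Y'": "L"},
              ["a", "b"], [100, 120], ec)
       + ground(rule2, {"X": "I"}, ["a", "b"], [100, 120], ec))
```

The resulting list contains six clauses, four defining `excluded` and two defining `preferable`, in the same order as listed in Example 3.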

Note that \(\mathcal {P}_\mathcal {A}\cup \mathcal {A}\) is a ground Datalog\(^\lnot \) program. Furthermore, if \(\langle \mathcal {P},\mathcal {A}\rangle \) is a stratified eDatalog\(^\lnot \) knowledge base then \(\mathcal {P}_\mathcal {A}\cup \mathcal {A}\) is a ground stratified Datalog\(^\lnot \) program. We define:

  • the well-founded model of an eDatalog\(^\lnot \) knowledge base \(\langle \mathcal {P},\mathcal {A}\rangle \) to be the well-founded model of the ground Datalog\(^\lnot \) program \(\mathcal {P}_\mathcal {A}\cup \mathcal {A}\) [11],

  • a stable model of an eDatalog\(^\lnot \) knowledge base \(\langle \mathcal {P},\mathcal {A}\rangle \) to be a stable model of the ground Datalog\(^\lnot \) program \(\mathcal {P}_\mathcal {A}\cup \mathcal {A}\) [12],

  • the standard model of a stratified eDatalog\(^\lnot \) knowledge base \(\langle \mathcal {P},\mathcal {A}\rangle \) to be the standard model of the stratified Datalog\(^\lnot \) program \(\mathcal {P}_\mathcal {A}\cup \mathcal {A}\) [1].
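For a ground program, the well-founded model can be computed by the alternating-fixpoint characterization, which is one standard way to obtain it; the Python sketch below follows that route and is not taken from this paper. Clauses are assumed to be triples `(head, positive_body, negated_body)` with facts having empty bodies, and the names `least_model`, `gamma` and `well_founded` are hypothetical. The three-clause program at the end is an invented toy input.

```python
def least_model(definite):
    """Least model of a ground negation-free program [(head, body), ...]."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in definite:
            if head not in model and all(a in model for a in pos):
                model.add(head)
                changed = True
    return model


def gamma(program, assumed):
    """Least model of the reduct: drop clauses with a negated atom in
    `assumed`, then delete the remaining negative literals."""
    reduct = [(h, pos) for h, pos, neg in program
              if not any(q in assumed for q in neg)]
    return least_model(reduct)


def well_founded(program):
    """Return (true atoms, undefined atoms); all other atoms are false."""
    true = set()
    while True:
        possible = gamma(program, true)      # overestimate of the true atoms
        new_true = gamma(program, possible)  # underestimate of the true atoms
        if new_true == true:
            return true, possible - true
        true = new_true


# Toy program: r <- not s;  p <- not q;  q <- not p.
toy = [("r", [], ["s"]), ("p", [], ["q"]), ("q", [], ["p"])]
true_atoms, undefined_atoms = well_founded(toy)
```

On the toy program, `r` comes out true, `s` false, and the mutually dependent atoms `p` and `q` remain undefined, which illustrates the three-valued nature of the well-founded model.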

Let \(\varphi \) be a query and \(\theta \) be a ground substitution for all the variables of \(\varphi \). We say that \(\theta \) is an answer to \(\varphi \) w.r.t. \(\langle \mathcal {P},\mathcal {A}\rangle \) and the well-founded semantics if \(\varphi \theta \) holds in the well-founded model of \(\langle \mathcal {P},\mathcal {A}\rangle \).\(^{4}\) Similarly, \(\theta \) is called an answer to \(\varphi \) w.r.t. \(\langle \mathcal {P},\mathcal {A}\rangle \) and the stable model semantics if \(\varphi \theta \) holds in a stable model of \(\langle \mathcal {P},\mathcal {A}\rangle \). If \(\langle \mathcal {P},\mathcal {A}\rangle \) is stratifiable then \(\theta \) is called an answer to \(\varphi \) w.r.t. \(\langle \mathcal {P},\mathcal {A}\rangle \) and the standard semantics if \(\varphi \theta \) holds in the standard model of \(\langle \mathcal {P},\mathcal {A}\rangle \).

As a Datalog\(^\lnot \) program may have zero or more than one stable model, an eDatalog\(^\lnot \) knowledge base may also have zero or more than one stable model. Note that we adopt the answer set programming approach to deal with the case when an eDatalog\(^\lnot \) knowledge base has more than one stable model.
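That a ground program may have no stable model or several can be seen on tiny inputs. The Python sketch below enumerates stable models by brute force using the usual reduct-based test (a candidate set is stable when it equals the least model of its reduct); it is exponential and meant only as an illustration. Atoms are plain strings here, clauses are triples `(head, positive_body, negated_body)`, and the function names are hypothetical.

```python
from itertools import combinations


def least_model(definite):
    """Least model of a ground negation-free program [(head, body), ...]."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in definite:
            if head not in model and all(a in model for a in pos):
                model.add(head)
                changed = True
    return model


def stable_models(program):
    """Enumerate all stable models of a small ground program."""
    atoms = sorted({h for h, _, _ in program}
                   | {a for _, pos, neg in program for a in pos + neg})
    found = []
    for size in range(len(atoms) + 1):
        for candidate in combinations(atoms, size):
            m = set(candidate)
            reduct = [(h, pos) for h, pos, neg in program
                      if not any(q in m for q in neg)]
            if least_model(reduct) == m:
                found.append(m)
    return found


two = stable_models([("p", [], ["q"]), ("q", [], ["p"])])   # p <- not q; q <- not p
none = stable_models([("p", [], ["p"])])                     # p <- not p
```

The first program has the two stable models `{p}` and `{q}`; the second has none.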

Proposition 1

The data complexity of eDatalog\(^\lnot \) with respect to the well-founded semantics is in PTime.

Proof

Let \(\langle \mathcal {P},\mathcal {A}\rangle \) be an eDatalog\(^\lnot \) knowledge base. The set \(\mathcal {P}^{ gr }_\mathcal {A}\) can be constructed in polynomial time and has polynomial size in the size of \(\mathcal {A}\). As the truth values of the atoms of external checkable predicates that occur in \(\mathcal {P}^{ gr }_\mathcal {A}\) can be computed in polynomial time, \(\mathcal {P}_\mathcal {A}\) can also be constructed in polynomial time and has polynomial size in the size of \(\mathcal {A}\). It is well known that the well-founded model of the Datalog\(^\lnot \) program \(\mathcal {P}_\mathcal {A}\cup \mathcal {A}\) can be constructed in polynomial time and has polynomial size in the size of \(\mathcal {P}_\mathcal {A}\cup \mathcal {A}\) (see, e.g., [1]). Thus, the well-founded model of \(\langle \mathcal {P},\mathcal {A}\rangle \) can be constructed in polynomial time and has polynomial size in the size of \(\mathcal {A}\). Consequently, answering queries to \(\langle \mathcal {P},\mathcal {A}\rangle \) w.r.t. the well-founded semantics can be done in polynomial time in the size of \(\mathcal {A}\).\(\square \)

Lemma 1

Given an eDatalog\(^\lnot \) knowledge base \( KB = \langle \mathcal {P},\mathcal {A}\rangle \) with \(\mathcal {P}\) being a semipositive eDatalog\(^\lnot \) program, the standard Herbrand model of \( KB \) can be computed in polynomial time and has polynomial size in the size of \(\mathcal {A}\).

Proof

Recall that \(\mathcal {P}^{ gr }_\mathcal {A}\) has polynomial size in the size of \(\mathcal {A}\) (when \(\mathcal {P}\) is fixed).  Let \(\mathcal {P}_\mathcal {A}'\) be the set of all the program clauses

$$\begin{aligned} \varphi _1 \wedge \cdots \wedge \varphi _h \rightarrow \alpha \end{aligned}$$

such that \(\mathcal {P}^{ gr }_\mathcal {A}\) contains a program clause

$$\begin{aligned}&(\varphi _1 \wedge \cdots \wedge \varphi _h \wedge \lnot \psi _1 \wedge \cdots \wedge \lnot \psi _k \\&\wedge \ \xi _1 \wedge \cdots \wedge \xi _l \wedge \lnot \zeta _1\wedge \cdots \wedge \lnot \zeta _m) \rightarrow \alpha \end{aligned}$$

where \(\{\psi _1, \ldots , \psi _k\} \cap \mathcal {A}= \emptyset \), all \(\xi _1, \ldots , \xi _l\) are true and all \(\zeta _1, \ldots , \zeta _m\) are false (by the fixed meaning of external checkable predicates). The set \(\mathcal {P}_\mathcal {A}'\) is a Datalog program, which can be computed in polynomial time and has polynomial size in the size of \(\mathcal {A}\). The least Herbrand model of \(\mathcal {P}_\mathcal {A}'\) can be computed in polynomial time and has polynomial size in the size of \(\mathcal {P}_\mathcal {A}'\) (see, e.g., [1]). Thus, it can be computed in polynomial time and has polynomial size in the size of \(\mathcal {A}\). That model is the same as the standard Herbrand model of \( KB \). \(\square \)

Corollary 1

Given a stratified eDatalog\(^\lnot \) knowledge base \( KB = \langle \mathcal {P},\mathcal {A}\rangle \), the standard Herbrand model of \( KB \) can be computed in polynomial time and has polynomial size in the size of \(\mathcal {A}\). As a consequence, the data complexity of stratified eDatalog\(^\lnot \) with respect to the standard semantics is in PTime.
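The polynomial bound behind this corollary reflects the usual stratum-by-stratum evaluation of stratified programs. The Python sketch below illustrates that evaluation on the ground clauses of Examples 1 and 3; it assumes clauses are triples `(head, positive_body, negated_body)`, that external atoms have already been evaluated away, and that equality axioms are omitted. The function name `standard_model` is hypothetical.

```python
def standard_model(strata, abox):
    """Evaluate ground strata bottom-up. Negated atoms are looked up in the
    model built from the lower strata, where their predicates are complete."""
    model = set(abox)
    for stratum in strata:
        lower = frozenset(model)
        changed = True
        while changed:
            changed = False
            for head, pos, neg in stratum:
                if (head not in model
                        and all(a in model for a in pos)
                        and not any(q in lower for q in neg)):
                    model.add(head)
                    changed = True
    return model


abox = {("acceptable", "a"), ("acceptable", "b"),
        ("hasPrice", "a", 100), ("hasPrice", "b", 120)}
# Ground clauses of Example 3, one stratum per original program clause.
stratum1 = [(("excluded", y),
             [("acceptable", x), ("hasPrice", x, 100),
              ("acceptable", y), ("hasPrice", y, 120)], [])
            for x in "ab" for y in "ab"]
stratum2 = [(("preferable", x), [("acceptable", x)], [("excluded", x)])
            for x in "ab"]
model = standard_model([stratum1, stratum2], abox)
```

The computed model extends the ABox with `excluded(b)` and `preferable(a)`: the cheaper item is preferable and the more expensive one is excluded.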

3 The web ontology rule language WORL

3.1 Syntax and notation of WORL

We use:

  • the truth symbol \(\top \) to denote \(\textit{owl:Thing}\) [22],

  • \(a\) and \(b\) to denote individuals (i.e. objects),

  • \(d\) to denote a literal (i.e. a data constant),

  • \(A\) and \(B\) to denote concept names (i.e. \(\textit{Class}\) elements [22]),

  • \(C\) and \(D\) to denote concepts (i.e. \(\textit{ClassExpression}\) elements [22]),

  • \(lC_\pm \) and \(lC\) to denote concepts like a \(\textit{subClassExpression}\) of [22],

  • \(rC\) to denote a concept like a \(\textit{superClassExpression}\) of [22],

  • \(eC\) to denote a concept like an \(\textit{equivClassExpression}\) of [22],

  • \( DT \) to denote a data type (i.e. a \(\textit{Datatype}\) of [22]),

  • \( DR \) to denote a data range (i.e. a \(\textit{DataRange}\) of [22]),

  • \(p_{uec}\) to denote a unary predicate from \(\mathrm {ECPreds}\),

  • \(r\) and \(s\) to denote object role names (i.e. \(\textit{ObjectProperty}\) elements [22]),

  • \(R\) and \(S\) to denote object roles (i.e. \(\textit{ObjectPropertyExpr.}\) elements [22]),

  • \(\sigma \) and \(\varrho \) to denote data role names (i.e. \(\textit{DataProperty}\) elements [22]).

The families of \(R\), \( DR \), \(lC_\pm \), \(lC\), \(rC\), \(eC\) are defined by the following BNF grammar, where \(n \ge 2\,\):

$$\begin{aligned} R&:= r \mid r^- \\ DR&:= DT \mid DT \sqcap DR \\ lC_\pm&:= A \mid \lnot A \mid \{a\} \mid lC_\pm \sqcap lC_\pm \mid lC_\pm \sqcup lC_\pm \mid \exists R.lC_\pm \mid \\&\exists R.\top \mid \ \ge \! n\,R.lC_\pm \mid \exists \sigma . DR \mid \exists \sigma .p_{uec}\mid \exists \sigma .\{d\} \\ lC&:=A \mid \{a\} \mid lC \sqcap lC_\pm \mid lC_\pm \sqcap lC \mid lC \sqcup lC \mid \exists R.lC_\pm \!\mid \\&\exists R.\top \mid \ \ge \! n\,R.lC_\pm \mid \exists \sigma . DR \mid \exists \sigma .p_{uec}\mid \exists \sigma .\{d\} \\ rC&:= A \mid rC \sqcap rC \mid \forall R.rC \mid \exists R.\{a\} \mid \forall \sigma . DR \mid \exists \sigma .\{d\} \mid \\&\le \!1\,R.lC_\pm \mid \ \le \!1\,R.\top \\ eC&:= A \mid eC \sqcap eC \mid \exists R.\{a\} \mid \exists \sigma .\{d\} \end{aligned}$$

Here, by \(r^-\) we denote the inverse of an object role \(r\). Notice the occurrences of \(lC_\pm \) in the definition of \(lC\). They are accompanied by \(lC\) or \(R\) to guarantee the so called safeness (range-restrictedness) condition.
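As a worked reading of the grammar (an illustration, not an additional definition), the following membership statements show how negation is admitted in \(lC\) only when guarded by a role or accompanied by a conjunct from \(lC\):

$$\begin{aligned} \exists r.\lnot A&\in lC&\text {(by } \exists R.lC_\pm \text { with } lC_\pm = \lnot A\text {)} \\ A \sqcap \lnot B&\in lC&\text {(by } lC \sqcap lC_\pm \text {)} \\ \lnot A&\notin lC&\text {(} \lnot A \text { belongs to } lC_\pm \text { only)} \\ A \sqcup \lnot B&\notin lC&\text {(both disjuncts of } lC \sqcup lC \text { must be in } lC\text {)} \end{aligned}$$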

Comparing with [6], it can be seen that \(\lnot A\), \(\,\ge \! n\,R.lC_\pm \) and \(\exists \sigma .p_{uec}\) for \(lC_\pm \) are additional features w.r.t. OWL 2 RL.

The class constructor \(\textit{ObjectOneOf}\) [22] can be written as \(\{a_1,\ldots ,a_k\}\) and expressed as \(\{a_1\} \sqcup \cdots \sqcup \{a_k\}\). We will use the following abbreviations: \(\mathsf {Func}\) (Functional), \(\mathsf {InvFunc}\) (InverseFunctional), \(\mathsf {Sym}\) (Symmetric), \(\mathsf {Trans}\) (Transitive), \(\mathsf {Key}\) (HasKey).

A DL TBox axiom, like a \(\textit{ClassAxiom}\) or a \(\textit{Datatype}\textit{Definition}\) or a \(\textit{HasKey}\) axiom of OWL 2 RL  [22], is an expression of one of the following forms, where \(h, k \ge 0\) and \(h+k \ge 1\):

$$\begin{aligned} \begin{array}{c} lC \sqsubseteq rC,\; eC = eC',\\ DT = DR ,\; \mathsf {Key}(lC_\pm ,R_1,\ldots ,R_h, \sigma _1,\ldots ,\sigma _k). \end{array} \end{aligned} \tag{3}$$

An RBox axiom, like an \(\textit{ObjectPropertyAxiom}\) or a \(\textit{Data}\textit{PropertyAxiom}\) of OWL 2 RL  [22], is an expression of one of the following forms:

$$\begin{aligned}&R_1 \circ \cdots \circ R_k \sqsubseteq S,\; R = S,\; R = S^-,\; \exists R.\top \sqsubseteq rC, \\&\quad \top \sqsubseteq \forall R.rC,\; \mathsf {Func}(R),\; \mathsf {InvFunc}(R),\; \mathsf {Sym}(R),\; \mathsf {Trans}(R),\\&\quad \sigma \sqsubseteq \varrho ,\; \sigma = \varrho ,\; \exists \sigma \sqsubseteq rC,\; \top \sqsubseteq \forall \sigma . DR . \end{aligned} \tag{4}$$

Note that axioms of the form \(R = S\), \(R = S^-\), \(\mathsf {Sym}(R)\) or \(\mathsf {Trans}(R)\) are expressible by axioms of the form \(R_1 \circ \cdots \circ R_k \sqsubseteq S\), and hence can be deleted from the above list.
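One way to spell out these reductions, using the usual description logic encodings and assuming that \(R^-\) for \(R = r^-\) is read as \(r\), is the following:

$$\begin{aligned} R = S&:\quad R \sqsubseteq S,\;\; S \sqsubseteq R \\ R = S^-&:\quad R \sqsubseteq S^-,\;\; S^- \sqsubseteq R \\ \mathsf {Sym}(R)&:\quad R^- \sqsubseteq R \\ \mathsf {Trans}(R)&:\quad R \circ R \sqsubseteq R \end{aligned}$$

Each right-hand side is an instance of \(R_1 \circ \cdots \circ R_k \sqsubseteq S\) with \(k = 1\) or \(k = 2\).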

An RBox axiom of the form \(\exists R.\top \sqsubseteq rC\) (resp. \(\top \sqsubseteq \forall R.rC\), \(\exists \sigma \sqsubseteq rC\), \(\top \sqsubseteq \forall \sigma . DR \)) stands for an \(\textit{ObjectPropertyDomain}\) (resp. \(\textit{ObjectPropertyRange}\), \(\textit{Data}\textit{PropertyDomain}\), \(\textit{DataPropertyRange}\)) axiom as in [22].

One can classify these latter axioms as DL TBox axioms instead of RBox axioms. Similarly, \(\mathsf {Key}(\ldots )\) axioms can be classified as RBox axioms instead.

We accept the following definitions:

  • A (WORL) TBox axiom is either a DL TBox axiom (as defined by (3)) or an RBox axiom (as defined by (4)) or an eDatalog\(^\lnot \) program clause.

  • A (WORL) TBox is a finite set of TBox axioms.

  • A WORL knowledge layer is a pair \(\mathcal {L}= \langle \mathcal {T},\mathcal {A}\rangle \) consisting of a TBox \(\mathcal {T}\) and an ABox \(\mathcal {A}\).

Note that we defined an ABox to be a finite set of ground atoms of predicates from \(\mathrm {DPreds}\). To add an assertion of the form \(C(a)\) to a WORL knowledge layer \(\langle \mathcal {T},\mathcal {A}\rangle \), where \(C\) is a complex concept belonging to the \(rC\) family, one can add the assertion \(A(a)\) to \(\mathcal {A}\) and the axiom \(A \sqsubseteq C\) to \(\mathcal {T}\), where \(A\) is a fresh concept name.

WORL knowledge bases are defined inductively as follows:

  • a WORL knowledge layer is a WORL knowledge base,

  • if \(\mathcal {L}\) is a WORL knowledge layer and \( KB _1, \ldots , KB _k\) are WORL knowledge bases then \( KB = \langle \mathcal {L}, \{ KB _1, \ldots , KB _k\}\rangle \) is a WORL knowledge base.

A WORL knowledge base \(\langle \mathcal {L}, \{ KB _1, \ldots , KB _k\}\rangle \) can be thought of as an ontology with \(\mathcal {L}\) being a set of direct statements, and \( KB _1, \ldots , KB _k\) being subontologies.
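The inductive definition maps naturally onto a recursive data structure. The Python sketch below is a hypothetical encoding: TBox axioms are kept as opaque strings, ABox assertions as tuples, and the class names `Layer` and `KB` as well as the function `flatten` are introduced only for this illustration. The function `flatten` shows one plausible reading of collapsing a hierarchy into a single layer by taking unions, which is the view relevant to the well-founded semantics.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Layer:
    tbox: List[str]     # TBox axioms, kept as opaque strings in this sketch
    abox: Set[Tuple]    # ground atoms such as ("A", "a")


@dataclass
class KB:
    layer: Layer                                      # direct statements
    subs: List["KB"] = field(default_factory=list)    # subontologies


def flatten(kb: KB) -> Layer:
    """Collect all axioms and assertions of a hierarchy into one layer."""
    tbox, abox = list(kb.layer.tbox), set(kb.layer.abox)
    for sub in kb.subs:
        flat = flatten(sub)
        tbox += [axiom for axiom in flat.tbox if axiom not in tbox]
        abox |= flat.abox
    return Layer(tbox, abox)


inner = KB(Layer(["B [= C"], {("B", "b")}))
outer = KB(Layer(["A [= B"], {("A", "a")}), [inner])
flat = flatten(outer)
```

A knowledge base consisting of a single layer is represented here as a `KB` with an empty list of subontologies.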

Example 4

This example is based on those of [1, 11, 12]. It concerns a two-player game with states \(a,b,c,d,e,f,g\). A player wins if the opponent has no moves. The allowed moves are illustrated below:

figure a

We use a concept name \( winning \) and a role name \( move \). Let \(\mathcal {T}\) be the TBox consisting of only the axiom

$$\begin{aligned} \exists move .\lnot winning \sqsubseteq winning \end{aligned}$$

and let \(\mathcal {A}\) be the ABox consisting of the assertions \( move (a,b), \ldots , move (f,g)\) that correspond to the edges in the above graph. Then \( KB = \langle \mathcal {T},\mathcal {A}\rangle \) is a WORL knowledge base.

3.2 Translating WORL into eDatalog\(^\lnot \)

We first define a translation \(\pi \) that translates a TBox axiom to a set of formulas of classical first-order logic. After that we will refine \(\pi \) to get a translation that converts a TBox to an eDatalog\(^\lnot \) program.

For an eDatalog\(^\lnot \) program clause \(\varphi \), let \(\pi (\varphi ) = \{\varphi \}\).

For a DL TBox axiom or an RBox axiom \(\varphi \), let \(\pi (\varphi )\) be defined as in Fig. 1, where \(\pi _{(x)}\) is an auxiliary translation that translates each concept or data range to a formula, where \(x\) denotes a variable.

Fig. 1
figure 1

The translation \(\pi \) for DL TBox axioms and RBox axioms. All variables for \(\pi (.)\) like \(x\), \(y\), \(z\), \(u\), \(v\) are fresh (new) variables. Variables \(y\) and \(z\) used for \(\pi _{(x)}(.)\) are also fresh variables. For \(\pi (\mathsf {Key}(\ldots ))\), note that no new objects will be “created” and \(x\), \(y\) will only be instantiated by named individuals

For \(\pi _{(x)}(\varphi )\) in the cases when \(\varphi \) is \(\exists R.C\), \(\exists R.\top \), \(\ge \!n\,R.C\), \(\exists \sigma . DR \) or \(\exists \sigma .p_{uec}\), note that \(\varphi \) occurs in the left-hand side of \(\rightarrow \) and the introduced variables are existentially quantified. Those quantifiers change to universal when taken out of the scope of \(\rightarrow \).

The translation \(\pi \) is very intuitive and we use it also for specifying the meanings of TBox axioms. Given an interpretation \(\mathcal {I}\) and a DL TBox axiom or an RBox axiom \(\varphi \), we define that \(\mathcal {I}\models \varphi \) iff \(\mathcal {I}\models \pi (\varphi )\), where the latter satisfaction relation \(\models \) is defined as usual. We say that \(\mathcal {I}\) is a model of a TBox \(\mathcal {T}\), denoted by \(\mathcal {I}\models \mathcal {T}\), if \(\mathcal {I}\models \varphi \) for all \(\varphi \in \mathcal {T}\).

Example 5

Continuing Example 4, we have that:

$$\begin{aligned}&\pi (\exists move .\lnot winning \sqsubseteq winning ) \\&\quad = \{ move (x,y) \wedge \lnot winning (y) \rightarrow winning (x)\}. \end{aligned}$$

Example 6

For \(\varphi = (\exists r.(A_1 \sqcup A_2) \sqsubseteq \forall r.B)\), we have

$$\begin{aligned} \pi (\varphi ) = \{r(x,y) \wedge (A_1(y) \vee A_2(y)) \rightarrow (r(x,z) \rightarrow B(z))\}. \end{aligned}$$

As for free variables, \(x\), \(y\) and \(z\) are universally quantified. The only formula of \(\pi (\varphi )\) is not an eDatalog\(^\lnot \) program clause. The intended translation of \(\varphi \) to a set of eDatalog\(^\lnot \) program clauses is

$$\begin{aligned} \pi _3(\varphi )&= \{r(x,y) \wedge A_1(y) \wedge r(x,z) \rightarrow B(z), \\&\;\, r(x,y) \wedge A_2(y) \wedge r(x,z) \rightarrow B(z)\}. \end{aligned}$$

To specify \(\pi _3\), we use auxiliary translations \(\pi _{2,l}\) and \(\pi _2\) such that:

  • when \(\pi _{2,l}\) is applicable to a formula \(\psi \) of predicate logic, \(\pi _{2,l}(\psi )\) is a set of conjunctions of atomic formulas such that, for any interpretation \(\mathcal {I}\), \(\mathcal {I}\models \bigvee \pi _{2,l}(\psi )\) iff \(\mathcal {I}\models \psi \); for example,

    $$\begin{aligned}&\pi _{2,l}(r(x,y) \wedge (A_1(y) \vee A_2(y)))\\&\quad = \{r(x,y) \wedge A_1(y),\ r(x,y) \wedge A_2(y) \}; \end{aligned}$$
  • when \(\pi _2\) is applicable to a formula \(\psi \) of predicate logic, \(\pi _2(\psi )\) is a set of eDatalog\(^\lnot \) program clauses such that, for any interpretation \(\mathcal {I}\), \(\mathcal {I}\models \bigwedge \pi _2(\psi )\) iff \(\mathcal {I}\models \psi \).

We define:

$$\begin{aligned} \begin{array}{l} \pi _{2,l}(\xi ) = \{\xi \}\;\; {\hbox {if}}\; \xi \; \mathrm{is\; not\; of\; any\; of\; the\; forms\;} \varphi \wedge \psi ,\\ \qquad \qquad \qquad \varphi \vee \psi , r^-(x,y) \\ \pi _{2,l}(r^-(x,y)) = \{r(y,x)\} \\ \pi _{2,l}(\varphi \vee \psi ) = \pi _{2,l}(\varphi ) \cup \pi _{2,l}(\psi ) \\ \pi _{2,l}(\varphi \wedge \psi ) = \{\varphi ' \wedge \psi ' \mid \varphi ' \in \pi _{2,l}(\varphi ) \mathrm {and\;} \psi ' \in \pi _{2,l}(\psi )\} \\ \pi _2(\xi ) = \{\xi \}\;\; {\hbox {if}}\; \xi \; \mathrm{is\; not\; of\; any\; of\; the\; forms\;} \varphi \wedge \psi ,\\ \qquad \qquad \quad \varphi \rightarrow \psi ,\; r^-(x,y) \\ \pi _2(r^-(x,y)) = \{r(y,x)\} \\ \pi _2(\varphi \rightarrow \psi ) = \\ \quad \{\varphi ' \wedge \xi ' \rightarrow \zeta ' \mid \varphi ' \in \pi _{2,l}(\varphi ) \mathrm {and\;} (\xi ' \rightarrow \zeta ') \in \pi _2(\psi )\}\ \cup \\ \quad \quad \; \{\varphi ' \rightarrow \psi ' \mid \varphi ' \in \pi _{2,l}(\varphi ),\; \psi ' \in \pi _2(\psi ) \mathrm {and\;} \psi ' \mathrm{ is\; not\;} \\ \quad \qquad \mathrm {of\; the\; form\;} \xi ' \rightarrow \zeta ' \}\\ \pi _2(\varphi \wedge \psi ) = \pi _2(\varphi ) \cup \pi _2(\psi ). \end{array} \end{aligned}$$
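The two translations above can be sketched in a few lines of Python. This is only an illustration under assumptions of ours: formulas are encoded as nested tuples, a clause is stored as a pair (body, head), and the names `pi_2l`, `pi_2`, `atom` and `show_clause` are hypothetical helpers, not part of the formalism. The sketch is run on the formula of Example 6.

```python
from itertools import product

# Hypothetical encoding chosen for this sketch:
#   ("atom", "r", ("x", "y"))   atomic formula r(x, y)
#   ("inv", "r", ("x", "y"))    inverse-role atom r^-(x, y)
#   ("not", f), ("and", f, g), ("or", f, g), ("imp", f, g)

def atom(pred, *args):
    return ("atom", pred, args)

def pi_2l(f):
    """Return a list of conjunctions, each a tuple of literals."""
    tag = f[0]
    if tag == "inv":                      # r^-(x, y) becomes r(y, x)
        _, r, (x, y) = f
        return [(atom(r, y, x),)]
    if tag == "or":                       # union of both sides
        return pi_2l(f[1]) + pi_2l(f[2])
    if tag == "and":                      # pairwise conjunctions
        return [a + b for a, b in product(pi_2l(f[1]), pi_2l(f[2]))]
    return [(f,)]                         # atoms and negated atoms

def pi_2(f):
    """Return a list of clauses (body, head); body is a tuple of literals."""
    tag = f[0]
    if tag == "inv":
        _, r, (x, y) = f
        return [((), atom(r, y, x))]
    if tag == "and":
        return pi_2(f[1]) + pi_2(f[2])
    if tag == "imp":
        # The two cases of the definition collapse here, because a
        # non-implication psi' is stored as a clause with an empty body.
        return [(a + body, head)
                for a in pi_2l(f[1]) for body, head in pi_2(f[2])]
    return [((), f)]

def show(lit):
    if lit[0] == "not":
        return "not " + show(lit[1])
    return f"{lit[1]}({','.join(lit[2])})"

def show_clause(clause):
    body, head = clause
    if not body:
        return show(head)
    return " & ".join(show(l) for l in body) + " -> " + show(head)

# Example 6: r(x,y) & (A1(y) | A2(y)) -> (r(x,z) -> B(z))
phi = ("imp",
       ("and", atom("r", "x", "y"), ("or", atom("A1", "y"), atom("A2", "y"))),
       ("imp", atom("r", "x", "z"), atom("B", "z")))
clauses = [show_clause(c) for c in pi_2(phi)]
for c in clauses:
    print(c)
```

The two printed clauses correspond to the two members of \(\pi _3(\varphi )\) given in Example 6.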

We also need the following definitions of \(\pi _3\):

  • if \(\varphi \) is an eDatalog\(^\lnot \) program clause then \(\pi _3(\varphi ) = \{\varphi \}\),

  • if \(\varphi \) is a DL TBox axiom or an RBox axiom then

    $$\begin{aligned} \pi _3(\varphi ) = \bigcup _{\psi \in \pi (\varphi )} \pi _2(\psi ), \end{aligned}$$
  • if \(\varphi \) is a TBox \(\mathcal {T}\) then \(\pi _3(\mathcal {T}) = \bigcup _{\varphi \in \mathcal {T}} \pi _3(\varphi )\).

Lemma 2

For any (WORL) TBox \(\mathcal {T}\), \(\pi _3(\mathcal {T})\) is an eDatalog\(^\lnot \) program equivalent to \(\mathcal {T}\) in the sense that, for any interpretation \(\mathcal {I}\), \(\mathcal {I}\models \pi _3(\mathcal {T})\) iff \(\mathcal {I}\models \mathcal {T}\).

Proof

Let \(\psi \) denote a formula of classical first-order logic. It can be proved by induction on the structure of \(\psi \) that \(\pi _{2,l}(\psi )\) and \(\pi _2(\psi )\) are sets of formulas such that, for any interpretation \(\mathcal {I}\),

  • \(\mathcal {I}\models \bigvee \pi _{2,l}(\psi )\) iff \(\mathcal {I}\models \psi \),

  • \(\mathcal {I}\models \bigwedge \pi _2(\psi )\) iff \(\mathcal {I}\models \psi \).

Consequently, for any interpretation \(\mathcal {I}\) and any DL TBox axiom or RBox axiom \(\varphi \), \(\mathcal {I}\models \pi _3(\varphi )\) iff \(\mathcal {I}\models \pi (\varphi )\). By definition, \(\mathcal {I}\models \varphi \) iff \(\mathcal {I}\models \pi (\varphi )\). Therefore, \(\pi _3(\mathcal {T})\) is equivalent to \(\mathcal {T}\).

It remains to show that \(\pi _3(\mathcal {T})\) is an eDatalog\(^\lnot \) program.

In the following, let \(\alpha \) denote an atomic formula. We define the families of \(l\psi _\pm \), \(l\psi \) and \(r\psi \) as follows (by using BNF grammar for \(l\psi _\pm \) and \(r\psi \)):

$$\begin{aligned}&l\psi _\pm := \alpha \mid \lnot \alpha \mid r^-(t,t') \mid l\psi _\pm \wedge l\psi _\pm \mid l\psi _\pm \vee l\psi _\pm \\&l\psi := l\psi _\pm \mathrm {with\; the\; safeness\; condition} \\&r\psi := \alpha \mid r^-(t,t') \mid r\psi \wedge r\psi \mid l\psi \rightarrow r\psi \end{aligned}$$

where a formula \(\psi \) of the \(l\psi _\pm \) family satisfies the safeness condition if translating \(\psi \) to the disjunctive normal form by using the distributive laws of \(\wedge \) and \(\vee \) results in \(\psi _1 \vee \cdots \vee \psi _k\) (where each \(\psi _i\) does not contain \(\vee \)) such that every variable occurring in some \(\psi _i\) occurs (among others) in some positive atom of \(\psi _i\).

It is straightforward to prove by induction on the structure of \(C\) that:

  • if \(C\) is a concept of the \(lC\) family then \(\pi _{(x)}(C)\) is a formula \(\psi \) of the \(l\psi \) family such that translating \(\psi \) to the disjunctive normal form by using the distributive laws of \(\wedge \) and \(\vee \) results in \(\psi _1 \vee \ldots \vee \psi _k\) (where each \(\psi _i\) does not contain \(\vee \)) such that variable \(x\) occurs in each \(\psi _i\),

  • if \(C\) is a concept of the \(rC\) family then \(\pi _{(x)}(C)\) is a formula of the \(r\psi \) family such that if a variable \(y\) different from \(x\) occurs in the formula then it occurs (among others) in the left-hand side of some \(\rightarrow \) in the formula.

Next, it can be proved by induction on the structure of \(\varphi \) that:

  • if \(\psi \) is a formula of the \(l\psi \) family then \(\pi _{2,l}(\psi )\) is a set of formulas of the \(l\psi \) family without the connective \(\vee \) and atoms of the form \(r^-(t,t')\),

  • if \(\varphi \) is a DL TBox axiom or an RBox axiom then \(\pi (\varphi )\) is a set of formulas of the \(r\psi \) family such that every variable occurring in a formula from \(\pi (\varphi )\) occurs (among others) in some positive atom of the formula in the left-hand side of some \(\rightarrow \),

  • if \(\varphi \) is a DL TBox axiom or an RBox axiom and \(\psi \in \pi (\varphi )\) then \(\pi _2(\psi )\) is a set of eDatalog\(^\lnot \) program clauses.

Therefore, \(\pi _3(\mathcal {T})\) is an eDatalog\(^\lnot \) program.\(\square \)

3.3 The well-founded semantics of WORL

The flattened version of a WORL knowledge base \( KB \) is the WORL knowledge layer denoted by \(\textit{flatten}( KB )\) and defined as follows:

  • if \( KB \) is a layer then \(\textit{flatten}( KB ) = KB \),

  • else if \( KB = \langle \mathcal {L}, \{ KB _1, \ldots , KB _k\}\rangle \), \(\mathcal {L}= \langle \mathcal {T},\mathcal {A}\rangle \) and \(\textit{flatten}( KB _i) = \langle \mathcal {T}_i,\mathcal {A}_i\rangle \) for \(1 \le i \le k\), then

    $$\begin{aligned} \textit{flatten}( KB ) = \langle \mathcal {T}\cup \mathcal {T}_1 \cup \cdots \cup \mathcal {T}_k, \mathcal {A}\cup \mathcal {A}_1 \cup \cdots \cup \mathcal {A}_k\rangle . \end{aligned}$$
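The recursive definition of \(\textit{flatten}\) can be sketched as follows. This is a minimal illustration, assuming that axioms and assertions are represented as plain strings and that the classes `Layer` and `KnowledgeBase` (names of our choosing) stand for knowledge layers and nested knowledge bases. The sample data is that of Example 7 below.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Layer:
    tbox: frozenset   # TBox axioms, here just strings
    abox: frozenset   # ground atoms, here just strings

@dataclass(frozen=True)
class KnowledgeBase:
    layer: Layer      # the main layer L
    subs: tuple       # the sub knowledge bases KB_1, ..., KB_k

def flatten(kb):
    """Merge all TBoxes and all ABoxes of a nested knowledge base."""
    if isinstance(kb, Layer):
        return kb
    tbox, abox = set(kb.layer.tbox), set(kb.layer.abox)
    for sub in kb.subs:
        flat = flatten(sub)
        tbox |= flat.tbox
        abox |= flat.abox
    return Layer(frozenset(tbox), frozenset(abox))

# The knowledge base of Example 7, with axioms written informally.
kb1 = Layer(frozenset({"A and not B -> C"}), frozenset({"A(u)", "A(v)", "B(u)"}))
kb2 = Layer(frozenset({"A and not C -> B"}), frozenset({"A(u)", "A(v)"}))
main = Layer(frozenset({"B and C -> D"}), frozenset())
flat = flatten(KnowledgeBase(main, (kb1, kb2)))
print(sorted(flat.tbox))
print(sorted(flat.abox))
```

Because sets are used, assertions shared by several layers, such as \(A(u)\), appear only once in the flattened ABox.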

Given a WORL knowledge base \( KB \) with \(\textit{flatten}( KB ) = \langle \mathcal {T},\mathcal {A}\rangle \), the well-founded (Herbrand) model of \( KB \), denoted by \(\mathsf {WF}_ KB \), is defined to be the well-founded model of the eDatalog\(^\lnot \) knowledge base \( KB ' = \langle \pi _3(\mathcal {T}),\mathcal {A}\rangle \).

An answer to a query \(\varphi \) w.r.t. that \( KB \) and the well-founded semantics is an answer to \(\varphi \) w.r.t. that \( KB '\) and the well-founded semantics of eDatalog\(^\lnot \).

The data complexity of WORL w.r.t. the well-founded semantics is the complexity of the problem of finding all answers to a query \(\varphi \) w.r.t. a WORL knowledge base \( KB \) and the well-founded semantics, measured w.r.t. the sum of the sizes of all ABoxes used in \( KB \) when assuming that \(\mathrm {DPreds}\), \(\varphi \) and all the TBoxes used in \( KB \) are fixed and checking whether a ground atom of an external checkable predicate is true or false can be done in polynomial time.

The following theorem immediately follows from Proposition 1.

Theorem 1

The data complexity of WORL with respect to the well-founded semantics is in PTime.

Example 7

Let \(A\), \(B\), \(C\), \(D\) be concept names and let \(\mathcal {T}_1\), \(\mathcal {T}_2\), \(\mathcal {T}\) be the TBoxes and \(\mathcal {A}_1\), \(\mathcal {A}_2\), \(\mathcal {A}\) be the ABoxes specified below:

$$\begin{aligned} \begin{array}{l@{\quad }l} \mathcal {T}_1 = \{A \sqcap \lnot B \sqsubseteq C\}&{} \mathcal {A}_1 = \{A(u),A(v),B(u)\} \\ \mathcal {T}_2 = \{A \sqcap \lnot C \sqsubseteq B\}&{} \mathcal {A}_2 = \{A(u),A(v)\} \\ \mathcal {T}= \{B \sqcap C \sqsubseteq D\}&{} \mathcal {A}= \emptyset \end{array} \end{aligned}$$

Then \( KB _1 = \langle \mathcal {T}_1,\mathcal {A}_1\rangle \), \( KB _2 = \langle \mathcal {T}_2,\mathcal {A}_2\rangle \) and \( KB = \langle \langle \mathcal {T},\mathcal {A}\rangle ,\{ KB _1, KB _2\}\rangle \) are WORL knowledge bases. The knowledge base \( KB \) consists of the main layer \(\langle \mathcal {T},\mathcal {A}\rangle \) and the additional layers \( KB _1\) and \( KB _2\). Flattening \( KB \) results in

$$\begin{aligned} KB ' = \langle \mathcal {T}_1 \cup \mathcal {T}_2 \cup \mathcal {T}, \{A(u),A(v),B(u)\}\rangle . \end{aligned}$$

The well-founded model of \( KB '\) is

$$\begin{aligned} \{A(u), A(v), B(u), \lnot C(u), \lnot D(u), u=u, v=v, u\ne v, v\ne u\}. \end{aligned}$$

The remaining atoms \(B(v)\), \(C(v)\) and \(D(v)\) have value “unknown”. The query \(D(x)\) w.r.t. \( KB \) and the well-founded semantics has no answers, while the query \(\lnot D(x)\) has one answer \(\{x/u\}\).
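The three-valued result of Example 7 can be reproduced with the alternating fixpoint construction, which is one standard way to compute well-founded models of ground programs. The sketch below is not the evaluation method of this paper. It assumes the program has already been grounded over the individuals \(u\) and \(v\), it omits the equality atoms, and the names `gamma` and `well_founded` are ours.

```python
def gamma(rules, facts, assumed):
    """Least model of the reduct: 'not a' holds iff a is not in `assumed`."""
    model = set(facts)
    changed = True
    while changed:
        changed = False
        for head, pos, neg in rules:
            if head not in model and pos <= model and not (neg & assumed):
                model.add(head)
                changed = True
    return model

def well_founded(rules, facts):
    """Return (true atoms, unknown atoms); every other atom is false."""
    true = set()
    while True:
        possible = gamma(rules, facts, true)       # overestimate
        new_true = gamma(rules, facts, possible)   # underestimate
        if new_true == true:
            return true, possible - true
        true = new_true

# Ground version of the flattened TBox of Example 7.
facts = {"A(u)", "A(v)", "B(u)"}
rules = []
for x in ("u", "v"):
    rules.append((f"C({x})", frozenset({f"A({x})"}), frozenset({f"B({x})"})))
    rules.append((f"B({x})", frozenset({f"A({x})"}), frozenset({f"C({x})"})))
    rules.append((f"D({x})", frozenset({f"B({x})", f"C({x})"}), frozenset()))

true_atoms, unknown_atoms = well_founded(rules, facts)
print(sorted(true_atoms))      # atoms that are true
print(sorted(unknown_atoms))   # atoms with value "unknown"
```

Atoms that are neither true nor unknown, here \(C(u)\) and \(D(u)\), are false, which agrees with the model listed above.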

3.4 The stable model semantics of WORL

An answer set of a WORL knowledge base is defined inductively as follows:

  • An answer set of a WORL knowledge layer \(\langle \mathcal {T},\mathcal {A}\rangle \) is defined to be the set of all ground atoms of predicates from \(\mathrm {DPreds}\) that hold in a stable model of \(\langle \mathcal {T},\mathcal {A}\rangle \) (each stable model of \(\langle \mathcal {T},\mathcal {A}\rangle \) gives an answer set).

  • An answer set of a WORL knowledge base \( KB \) of the form \(\langle \mathcal {L},\{ KB _1,\ldots , KB _k\}\rangle \), where \(\mathcal {L}= \langle \mathcal {T},\mathcal {A}\rangle \), is defined to be an answer set of the WORL knowledge layer \(\langle \mathcal {T},\mathcal {A}\cup \mathcal {A}_1 \cup \cdots \cup \mathcal {A}_k\rangle \), where each \(\mathcal {A}_i\) is an answer set of the WORL knowledge base \( KB _i\).

Let \(\varphi \) be a query and \(\theta \) be a ground substitution for all the variables of \(\varphi \). We say that \(\theta \) is an answer to \(\varphi \) w.r.t. a WORL knowledge base \( KB \) and the stable model semantics if \(\varphi \theta \) holds in the interpretation that corresponds to an answer set of \( KB \) (notice that the answer set programming approach is adopted here).

Example 8

Reconsider the WORL knowledge bases \( KB _1\), \( KB _2\) and \( KB \) given in Example 7. The knowledge base \( KB _1\) has only one answer set

$$\begin{aligned} \{A(u), A(v), B(u), C(v), u=u, v=v\}. \end{aligned}$$

The knowledge base \( KB _2\) has only one answer set

$$\begin{aligned} \{A(u), A(v), B(u), B(v), u=u, v=v\}. \end{aligned}$$

Consequently, the knowledge base \( KB \) has only one answer set

$$\begin{aligned} \{A(u), A(v), B(u), B(v), C(v), D(v), u=u, v=v\}. \end{aligned}$$

The query \(D(x)\) w.r.t. \( KB \) and the stable model semantics has the only answer \(\{x/v\}\), and the query \(\lnot D(x)\) has the only answer \(\{x/u\}\). Notice the difference between the stable model semantics and the well-founded semantics.

Also observe that the flattened version \( KB '\) of \( KB \) (given in Example 7) has two answer sets:

$$\begin{aligned} \begin{array}{l} \{A(u), A(v), B(u), B(v), u=u, v=v\},\\ \{A(u), A(v), B(u), C(v), u=u, v=v\}. \end{array} \end{aligned}$$
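The answer sets of Example 8 can be checked by brute force. The sketch below is only an illustration under our own assumptions: programs are grounded over \(u\) and \(v\), equality atoms are omitted, stable models are found by testing every candidate set against the Gelfond-Lifschitz reduct, and the helper names (`stable_models`, `answer_sets`, `ground`) are hypothetical. This exhaustive search is exponential and is meant for tiny examples only.

```python
from itertools import combinations, product

def least_model_of_reduct(rules, facts, candidate):
    """Least model of the reduct of the program w.r.t. `candidate`."""
    model = set(facts)
    changed = True
    while changed:
        changed = False
        for head, pos, neg in rules:
            if head not in model and pos <= model and not (neg & candidate):
                model.add(head)
                changed = True
    return model

def stable_models(rules, facts):
    """Try every set of rule heads added to the facts."""
    heads = sorted({h for h, _, _ in rules} - set(facts))
    found = []
    for k in range(len(heads) + 1):
        for extra in combinations(heads, k):
            m = set(facts) | set(extra)
            if least_model_of_reduct(rules, facts, m) == m:
                found.append(frozenset(m))
    return found

def answer_sets(tbox, abox, subs=()):
    """Layered answer sets; `subs` holds the answer-set lists of the sub-KBs."""
    results = []
    for choice in product(*subs):
        facts = set(abox).union(*choice)
        results.extend(stable_models(tbox, facts))
    return results

def ground(head, pos, neg, individuals=("u", "v")):
    return [(f"{head}({x})", frozenset(f"{p}({x})" for p in pos),
             frozenset(f"{n}({x})" for n in neg)) for x in individuals]

T1 = ground("C", ["A"], ["B"])        # A and not B -> C
T2 = ground("B", ["A"], ["C"])        # A and not C -> B
T = ground("D", ["B", "C"], [])       # B and C -> D
A1 = {"A(u)", "A(v)", "B(u)"}
A2 = {"A(u)", "A(v)"}

kb1_sets = answer_sets(T1, A1)
kb2_sets = answer_sets(T2, A2)
kb_sets = answer_sets(T, set(), [kb1_sets, kb2_sets])   # layered KB
flat_sets = stable_models(T1 + T2 + T, A1 | A2)         # flattened KB'
print([sorted(s) for s in kb_sets])
print([sorted(s) for s in flat_sets])
```

The layered knowledge base yields a single answer set containing \(D(v)\), whereas the flattened version yields two answer sets, neither of which contains \(D(v)\).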

4 Stratified WORL

A TBox \(\mathcal {T}\) is said to be stratifiable if \(\pi _3(\mathcal {T})\) is a stratified eDatalog\(^\lnot \) program. In the “Appendix” we present a direct method for checking stratifiability of a TBox without using translation.

A WORL knowledge layer \(\langle \mathcal {T},\mathcal {A}\rangle \) is called a SWORL knowledge layer if \(\mathcal {T}\) is stratifiable. A WORL knowledge base is called a SWORL knowledge base if it is either a SWORL knowledge layer or a pair \(\langle \mathcal {L}, \{ KB _1, \ldots , KB _k\}\rangle \) where \(\mathcal {L}\) is a SWORL knowledge layer and each \( KB _i\) is a SWORL knowledge base.

Note that flattening a SWORL knowledge base \(\langle \mathcal {L}, \{ KB _1, \ldots , KB _k\}\rangle \) may result in a WORL knowledge layer that is not stratifiable.

Let \( KB \) be a SWORL knowledge base. The standard Herbrand model of \( KB \), denoted by \(\mathcal {H}_ KB \), is defined as follows:

  • If \( KB \) is a SWORL knowledge layer \(\langle \mathcal {T},\mathcal {A}\rangle \) then \(\mathcal {H}_ KB \) is the standard Herbrand model of the stratified eDatalog\(^\lnot \) knowledge base \(\langle \pi _3(\mathcal {T}),\mathcal {A}\rangle \).

  • If \( KB = \langle \mathcal {L}, \{ KB _1, \ldots , KB _k\}\rangle \) and \(\mathcal {L}= \langle \mathcal {T},\mathcal {A}\rangle \) then \(\mathcal {H}_ KB \) is the standard Herbrand model of the stratified eDatalog\(^\lnot \) knowledge base \(\langle \pi _3(\mathcal {T}),\mathcal {A}\cup \mathcal {H}_{ KB _1} \cup \cdots \cup \mathcal {H}_{ KB _k}\rangle \).

The standard model of a SWORL knowledge base \( KB \) is defined to be the traditional interpretation corresponding to \(\mathcal {H}_ KB \) and is denoted by \(\mathcal {M}_ KB \).

The notion of answer to a query w.r.t. a SWORL knowledge base and the data complexity of SWORL are defined as usual:

  • Given a SWORL knowledge base \( KB \) and a query \(\varphi \), a (correct) answer to \(\varphi \) w.r.t. \( KB \) and the standard semantics is a ground substitution \(\theta \) for all the variables of \(\varphi \) such that \(\mathcal {M}_ KB \models \varphi \theta \), where \(\models \) is the satisfaction relation defined in the usual way.

  • The data complexity of SWORL w.r.t. the standard semantics is the complexity of the problem of finding all answers to a query \(\varphi \) w.r.t. a SWORL knowledge base \( KB \) and the standard semantics, measured w.r.t. the sum of the sizes of all ABoxes used in \( KB \) when assuming that \(\mathrm {DPreds}\), \(\varphi \), the structure of \( KB \) and all the TBoxes used in \( KB \) are fixed and checking whether a ground atom of an external checkable predicate is true or false can be done in polynomial time.

Theorem 2

The data complexity of SWORL with respect to the standard semantics is in PTime.

Proof

Let \( KB \) be a SWORL knowledge base and \(n\) be the sum of the sizes of all ABoxes used in \( KB \). We prove by induction on the structure of \( KB \) that the standard Herbrand model \(\mathcal {H}_ KB \) of \( KB \) can be computed in polynomial time and has polynomial size in \(n\,\):

  • If \( KB \) is a SWORL knowledge layer \(\langle \mathcal {T},\mathcal {A}\rangle \) then \(\mathcal {H}_ KB \) is the standard Herbrand model of the stratified eDatalog\(^\lnot \) knowledge base \(\langle \pi _3(\mathcal {T}),\mathcal {A}\rangle \), and by Corollary 1, \(\mathcal {H}_ KB \) can be computed in polynomial time and has polynomial size in \(n\).

  • If \( KB = \langle \langle \mathcal {T},\mathcal {A}\rangle , \{ KB _1, \ldots , KB _k\}\rangle \) then:

      • By the inductive assumption, \(\mathcal {H}_{ KB _1}\), ..., \(\mathcal {H}_{ KB _k}\) can be computed in polynomial time and have polynomial size in \(n\).

      • \(\mathcal {H}_ KB \) is the standard Herbrand model of the stratified eDatalog\(^\lnot \) knowledge base \(\langle \pi _3(\mathcal {T}),\mathcal {A}\cup \mathcal {H}_{ KB _1} \cup \cdots \cup \mathcal {H}_{ KB _k}\rangle \), and by Corollary 1, \(\mathcal {H}_ KB \) can be computed in polynomial time and has polynomial size in the size of \(\mathcal {A}\cup \mathcal {H}_{ KB _1} \cup \cdots \cup \mathcal {H}_{ KB _k}\).

      • Hence, \(\mathcal {H}_ KB \) can be computed in polynomial time and has polynomial size in \(n\).

As a consequence, the data complexity of SWORL w.r.t. the standard semantics is in PTime. \(\square \)

The standard semantics of SWORL coincides with the well-founded semantics when restricting to SWORL knowledge bases that are single layers and to queries of the form \((\varphi _1 \wedge \cdots \wedge \varphi _h \wedge \xi _1 \wedge \cdots \wedge \xi _l \wedge \lnot \zeta _1 \wedge \cdots \wedge \lnot \zeta _m)\), where \(\varphi _1\), ..., \(\varphi _h\) are atoms of predicates from \(\mathrm {DPreds}\) and \(\xi _1\), ..., \(\xi _l\), \(\zeta _1\), ..., \(\zeta _m\) are atoms of predicates from \(\mathrm {ECPreds}\).

4.1 Example: apartment renting

In this subsection we discuss apartment renting, a common activity that is often tedious and time-consuming. The example is based on the one of [2]. The difference is that we use SWORL instead of defeasible logic.

We begin by presenting the potential renter’s requirements:

  • Carlos is looking for an apartment of at least 45 m\(^2\) with at least two bedrooms. If it is on the third floor or higher, the house must have an elevator. Also, pet animals must be allowed.

  • Carlos is willing to pay $300 for a centrally located 45 m\(^2\) apartment, and $250 for a similar flat in the suburbs. In addition, he is willing to pay an extra $5 per m\(^2\) for a larger apartment, and $2 per m\(^2\) for a garden.

  • He is unable to pay more than $400 in total. If given the choice, he would go for the cheapest option. His second priority is the presence of a garden; his lowest priority is additional space.

We use the following predicates to describe properties of apartments:

  • \(\textit{hasSize}(X,Y)\) : \(Y\) is the size of apartment \(X\),

  • \(\textit{bedrooms}(X,Y)\) : apartment \(X\) has \(Y\) bedrooms,

  • \(\textit{hasPrice}(X,Y)\) : \(Y\) is the rent price of apartment \(X\),

  • \(\textit{floor}(X,Y)\) : apartment \(X\) is on the \(Y^{ th }\) floor,

  • \(\textit{garden}(X,Y)\) : apartment \(X\) has a garden of size \(Y\),

  • \(\textit{withLift}(X)\) : there is an elevator in the house of \(X\),

  • \(\textit{allowsPets}(X)\) : pets are allowed in apartment \(X\),

  • \(\textit{central}(X)\) : apartment \(X\) is centrally located.

The predicates \(\textit{hasSize}\), \(\textit{bedrooms}\), \(\textit{hasPrice}\), \(\textit{floor}\) and \(\textit{garden}\) are data role names, while the predicates \(\textit{withLift}\), \(\textit{allowsPets}\) and \(\textit{central}\) are concept names. These predicates are specified by ABox assertions.

We define a number of predicates. The first one is \(\textit{withGarden}\), specified by:

$$\begin{aligned} \textit{garden}(X,Y) \rightarrow \textit{withGarden}(X). \end{aligned}$$
(5)

We use predicate \(\textit{offers}(X,N,Y,Z)\) defined as follows:

$$\begin{aligned}&{}[\textit{hasSize}(X,Y) \wedge \textit{central}(X) \wedge \lnot \textit{withGarden}(X)] \nonumber \\&\rightarrow \textit{offers}(X,1,Y,0) \end{aligned}$$
(6)
$$\begin{aligned}&{}[\textit{hasSize}(X,Y) \wedge \textit{central}(X) \wedge \textit{garden}(X,Z)] \nonumber \\&\rightarrow \textit{offers}(X,2,Y,Z) \end{aligned}$$
(7)
$$\begin{aligned}&{}[\textit{hasSize}(X,Y) \wedge \lnot \textit{central}(X) \wedge \lnot \textit{withGarden}(X)] \nonumber \\&\rightarrow \textit{offers}(X,3,Y,0) \end{aligned}$$
(8)
$$\begin{aligned}&{}[\textit{hasSize}(X,Y) \wedge \lnot \textit{central}(X) \wedge \textit{garden}(X,Z)] \nonumber \\&\rightarrow \textit{offers}(X,4,Y,Z). \end{aligned}$$
(9)

The predicate \(\textit{offers}(X,N,Y,Z)\) means Carlos is willing to pay \(f(N,Y,Z)\) dollars for apartment \(X\), where \(f(N,Y,Z)\) is defined as

$$\begin{aligned} f(N,Y,Z) = \left\{ \begin{array}{l@{\quad }l} 300 + 5(Y-45) &{} \mathrm {if}\; N = 1 \\ 300 + 5(Y-45) + 2Z &{} \mathrm {if}\; N = 2 \\ 250 + 5(Y-45) &{} \mathrm {if}\; N = 3 \\ 250 + 5(Y-45) + 2Z &{} \mathrm {if}\; N = 4 . \end{array} \right. \end{aligned}$$

This function is used only to specify the external checkable predicate

$$\begin{aligned} \textit{tooExpensive}(N,Y,Z,P) \equiv (f(N,Y,Z) < P), \end{aligned}$$

which in turn is used in the following program clause:

$$\begin{aligned} \begin{array}{l} [\textit{offers}(X,N,Y,Z) \wedge \textit{hasPrice}(X,P)\, \wedge \\ \quad \quad \textit{tooExpensive}(N,Y,Z,P)] \rightarrow \textit{excluded}_0(X). \end{array} \end{aligned}$$
(10)

Thus, \(\textit{excluded}_0(X)\) means that apartment \(X\) is unacceptable because its rent exceeds the amount Carlos is willing to pay for it.

Apartments acceptable to Carlos are defined by the following DL TBox axiom:

$$\begin{aligned}&[\exists \textit{hasSize}.(\ge 45) \sqcap \exists \textit{bedrooms}.(\ge 2) \sqcap (\exists \textit{floor}.(\le 2) \nonumber \\&\quad \sqcup \ \textit{withLift})\ \sqcap \textit{allowsPets}\ \sqcap \lnot \textit{excluded}_0\nonumber \\&\quad \sqcap \ \exists \textit{hasPrice}.(\le \!{400})] \sqsubseteq \textit{acceptable}. \end{aligned}$$
(11)

In the above axiom, (\(\ge \)45), (\(\ge \)2), (\(\le \)2) and (\(\le \)400) are unary external checkable predicates.

Among the acceptable apartments, the cheapest ones are preferable:

$$\begin{aligned}&[\textit{acceptable}(X) \wedge \textit{hasPrice}(X,Y)\, \wedge \nonumber \\&\quad \textit{acceptable}(X') \wedge \textit{hasPrice}(X',Y') \wedge Y < Y'] \nonumber \\&\quad \rightarrow \textit{excluded}_1(X') \end{aligned}$$
(12)
$$\begin{aligned} \textit{acceptable}(X) \wedge \lnot \textit{excluded}_1(X) \rightarrow \textit{preferable}_1(X). \end{aligned}$$
(13)

Among the cheapest acceptable apartments, those with a garden are preferred:

$$\begin{aligned}&[\textit{preferable}_1(X) \wedge \lnot \textit{withGarden}(X)\, \wedge \nonumber \\&\quad \textit{preferable}_1(X') \wedge \textit{withGarden}(X')] \nonumber \\&\quad \rightarrow \textit{excluded}_2(X) \end{aligned}$$
(14)
$$\begin{aligned} \textit{preferable}_1(X) \wedge \lnot \textit{excluded}_2(X) \rightarrow \textit{preferable}_2(X). \end{aligned}$$
(15)

Among those apartments, Carlos will rent a largest one:

$$\begin{aligned}&[\,\textit{preferable}_2(X) \wedge \textit{hasSize}(X,Y)\, \wedge \nonumber \\&\quad \textit{preferable}_2(X') \wedge \textit{hasSize}(X',Y') \wedge Y < Y'\,]\nonumber \\&\quad \rightarrow \textit{excluded}_3(X) \end{aligned}$$
(16)
$$\begin{aligned} \textit{preferable}_2(X) \wedge \lnot \textit{excluded}_3(X) \rightarrow \textit{mayRent}(X). \end{aligned}$$
(17)

In the program clauses (12) and (16), ‘\(<\)’ is a binary external checkable predicate.

Let \(\mathcal {T}\) = {(5), ..., (17)}. It is a stratifiable TBox. Only (11) is a DL TBox axiom, while the other axioms are eDatalog\(^\lnot \) program clauses. The program clauses (5), (13), (15) and (17) can also be expressed as DL TBox axioms, treating \(\textit{withGarden}\), \(\textit{acceptable}\), \(\textit{excluded}_1\), \(\textit{preferable}_1\), \(\textit{excluded}_2\), \(\textit{preferable}_2\), \(\textit{excluded}_3\) and \(\textit{mayRent}\) as concept names.

Translating the TBox \(\mathcal {T}\) to a stratified eDatalog\(^\lnot \) program \(\mathcal {P}= \pi _3(\mathcal {T})\), the DL TBox axiom (11) is replaced by the following eDatalog\(^\lnot \) program clauses:

$$\begin{aligned}&[\textit{hasSize}(X,Y_1) \wedge Y_1 \ge 45 \wedge \textit{bedrooms}(X,Y_2) \wedge Y_2 \ge 2 \nonumber \\&\quad \wedge \ \textit{floor}(X,Y_3) \wedge Y_3 \le 2 \wedge \textit{allowsPets}(X) \wedge \lnot \textit{excluded}_0(X)\nonumber \\&\quad \wedge \ \textit{hasPrice}(X,Y_4) \wedge Y_4 \le 400\,] \rightarrow \textit{acceptable}(X) \end{aligned}$$
(18)
$$\begin{aligned}&[\textit{hasSize}(X,Y_1) \wedge Y_1 \ge 45 \wedge \textit{bedrooms}(X,Y_2) \wedge Y_2 \ge 2 \nonumber \\&\quad \wedge \ \textit{withLift}(X) \wedge \textit{allowsPets}(X) \wedge \lnot \textit{excluded}_0(X)\, \nonumber \\&\quad \wedge \ \textit{hasPrice}(X,Y_4) \wedge Y_4 \le 400] \rightarrow \textit{acceptable}(X). \end{aligned}$$
(19)

A possible stratification of \(\mathcal {P}\) is: {(5)}, {(6), (7), (8), (9), (10)}, {(18), (19), (12)}, {(13), (14)}, {(15), (16)}, {(17)}.
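Such a stratification can be found mechanically from the predicate dependencies. The following sketch is not the method of the "Appendix". It assumes each clause is abstracted to its head predicate together with the predicates occurring positively and negatively in its body, it leaves out the external checkable predicates (they are not defined by clauses), and the function name `stratify` and the predicate spellings are ours.

```python
def stratify(rules):
    """rules: list of (head, positive_preds, negative_preds).
    Return {predicate: stratum}, or None if the program is not stratifiable."""
    preds = {h for h, _, _ in rules}
    for _, pos, neg in rules:
        preds |= set(pos) | set(neg)
    stratum = {p: 0 for p in preds}
    changed = True
    while changed:
        changed = False
        for head, pos, neg in rules:
            need = max([stratum[p] for p in pos]
                       + [stratum[n] + 1 for n in neg] + [0])
            if need > stratum[head]:
                if need >= len(preds):
                    return None          # a cycle through negation
                stratum[head] = need
                changed = True
    return stratum

base = ["hasSize", "bedrooms", "allowsPets", "hasPrice"]
rules = [
    ("withGarden", ["garden"], []),                          # (5)
    ("offers", ["hasSize", "central"], ["withGarden"]),      # (6)
    ("offers", ["hasSize", "central", "garden"], []),        # (7)
    ("offers", ["hasSize"], ["central", "withGarden"]),      # (8)
    ("offers", ["hasSize", "garden"], ["central"]),          # (9)
    ("excluded0", ["offers", "hasPrice"], []),               # (10)
    ("acceptable", base + ["floor"], ["excluded0"]),         # (18)
    ("acceptable", base + ["withLift"], ["excluded0"]),      # (19)
    ("excluded1", ["acceptable", "hasPrice"], []),           # (12)
    ("preferable1", ["acceptable"], ["excluded1"]),          # (13)
    ("excluded2", ["preferable1", "withGarden"], ["withGarden"]),  # (14)
    ("preferable2", ["preferable1"], ["excluded2"]),         # (15)
    ("excluded3", ["preferable2", "hasSize"], []),           # (16)
    ("mayRent", ["preferable2"], ["excluded3"]),             # (17)
]
strata = stratify(rules)
for pred in sorted(strata, key=lambda p: (strata[p], p)):
    print(strata[pred], pred)

# T1 and T2 of Example 7 together: negation is cyclic, so no stratification.
cyclic = [("C", ["A"], ["B"]), ("B", ["A"], ["C"])]
print(stratify(cyclic))
```

Grouping the clauses by the stratum of their head predicate gives the six strata listed above. The second call illustrates the earlier remark that flattening can destroy stratifiability: the TBoxes \(\mathcal {T}_1\) and \(\mathcal {T}_2\) of Example 7 are each stratifiable, but their union is not.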

Let \(\mathcal {A}\) be the ABox consisting of the ground atoms of predicates \(\textit{bedrooms}\), \(\textit{hasSize}\), \(\textit{central}\), \(\textit{floor}\), \(\textit{withLift}\), \(\textit{allowsPets}\), \(\textit{garden}\) and \(\textit{hasPrice}\) that reflect the information contained in the following table:

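$$\begin{array}{lcccccccc}
\text{Flat} & \text{Bedrooms} & \text{Size} & \text{Central} & \text{Floor} & \text{Lift} & \text{Pets} & \text{Garden} & \text{Price} \\
\hline
a1 & 1 & 50 & \text{Yes} & 1 & \text{No} & \text{Yes} & & 300 \\
a2 & 2 & 45 & \text{Yes} & 0 & \text{No} & \text{Yes} & & 335 \\
a3 & 2 & 65 & \text{No} & 2 & \text{No} & \text{Yes} & & 350 \\
a4 & 2 & 55 & \text{No} & 1 & \text{Yes} & \text{No} & 15 & 330 \\
a5 & 3 & 55 & \text{Yes} & 0 & \text{No} & \text{Yes} & 15 & 350 \\
a6 & 2 & 60 & \text{Yes} & 3 & \text{No} & \text{No} & & 370 \\
a7 & 3 & 65 & \text{Yes} & 1 & \text{No} & \text{Yes} & 12 & 375
\end{array}$$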

For example, \(\textit{bedrooms}(a1,1)\), \(\textit{hasSize}(a1,50)\), \(\textit{central}(a1)\), \(\textit{floor}(a1,1)\), \(\textit{allowsPets}(a1)\) and \(\textit{hasPrice}(a1,300)\) are the atoms of \(\mathcal {A}\) that involve apartment \(a1\). As ABoxes contain only positive information, \(\textit{withLift}(a4)\) is the only atom of predicate \(\textit{withLift}\) that occurs in \(\mathcal {A}\).

The pair \( KB = \langle \mathcal {T},\mathcal {A}\rangle \) is a SWORL knowledge layer (and a SWORL knowledge base). The standard Herbrand model \(\mathcal {H}_ KB \) contains atoms \(\textit{acceptable}(X)\) only for \(X \in \{a3, a5, a7\}\) and atoms \(\textit{preferable}_1(X)\) only for \(X \in \{a3, a5\}\). Only atom \(\textit{preferable}_2(a5)\) of predicate \(\textit{preferable}_2\) and atom \(\textit{mayRent}(a5)\) of predicate \(\textit{mayRent}\) occur in \(\mathcal {H}_ KB \).
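These results can be checked by evaluating the clauses stratum by stratum over the table data. The sketch below is a hand-written evaluation for this one example, not a general eDatalog\(^\lnot \) engine. It assumes a missing garden is encoded as `None`, it folds the case distinction on \(N\) in \(f(N,Y,Z)\) into the two flags "central" and "has a garden", and all Python names are ours.

```python
from collections import namedtuple

Flat = namedtuple("Flat", "bedrooms size central floor lift pets garden price")

flats = {
    "a1": Flat(1, 50, True, 1, False, True, None, 300),
    "a2": Flat(2, 45, True, 0, False, True, None, 335),
    "a3": Flat(2, 65, False, 2, False, True, None, 350),
    "a4": Flat(2, 55, False, 1, True, False, 15, 330),
    "a5": Flat(3, 55, True, 0, False, True, 15, 350),
    "a6": Flat(2, 60, True, 3, False, False, None, 370),
    "a7": Flat(3, 65, True, 1, False, True, 12, 375),
}

def willing_to_pay(flat):
    """f(N, Y, Z), with N determined as in clauses (6)-(9)."""
    base = 300 if flat.central else 250
    return base + 5 * (flat.size - 45) + 2 * (flat.garden or 0)

# Clause (5)
with_garden = {x for x, f in flats.items() if f.garden is not None}
# Clauses (6)-(10): the asked price exceeds what Carlos is willing to pay
excluded0 = {x for x, f in flats.items() if willing_to_pay(f) < f.price}
# Clauses (18) and (19)
acceptable = {x for x, f in flats.items()
              if f.size >= 45 and f.bedrooms >= 2
              and (f.floor <= 2 or f.lift) and f.pets
              and x not in excluded0 and f.price <= 400}
# Clauses (12) and (13): keep the cheapest acceptable flats
excluded1 = {y for x in acceptable for y in acceptable
             if flats[x].price < flats[y].price}
preferable1 = acceptable - excluded1
# Clauses (14) and (15): prefer flats with a garden
excluded2 = {x for x in preferable1 - with_garden
             if preferable1 & with_garden}
preferable2 = preferable1 - excluded2
# Clauses (16) and (17): prefer the largest flats
excluded3 = {x for x in preferable2 for y in preferable2
             if flats[x].size < flats[y].size}
may_rent = preferable2 - excluded3

print(sorted(acceptable), sorted(preferable1), sorted(may_rent))
```

For instance, \(a2\) is rejected because Carlos is willing to pay only \(300 + 5(45-45) = 300\) for it while its price is 335, and \(a7\) is acceptable but more expensive than \(a3\) and \(a5\).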

5 Conclusions

We have developed the Web ontology rule languages WORL and SWORL together with the well-founded semantics and the stable model semantics for WORL and the standard semantics for SWORL. Both WORL with respect to the well-founded semantics and SWORL with respect to the standard semantics have PTime data complexity.

As WORL can be translated into eDatalog\(^\lnot \) and SWORL into stratified eDatalog\(^\lnot \), the languages WORL and SWORL are not more expressive than eDatalog\(^\lnot \) and stratified eDatalog\(^\lnot \), respectively. However, WORL and SWORL also allow the syntax of description logics (and hence of OWL) to be used. This brings the same benefits as OWL 2 RL has compared to eDatalog, and is very useful for Semantic Web applications. As Web ontology rule languages, WORL and SWORL have the advantage of being able to use the efficient computational methods of Datalog\(^\lnot \) (extended for eDatalog\(^\lnot \)).

Using nonmonotonic semantics for negation in concept inclusion axioms is a novelty of our approach. Modularity of SWORL is also worth mentioning.