
1 Introduction

Satisfiability Modulo Theories (SMT) is the problem of deciding the satisfiability of first-order formulas with respect to first-order theories [5, 27] (e.g., the theory of linear arithmetic over the rationals, \(\mathcal {LRA}\)). In the last decade, SMT solvers –powered by very efficient Conflict-Driven-Clause-Learning (CDCL) engines for Boolean Satisfiability [16] combined with a collection of \(\mathcal {T}\)-Solvers, each one handling a different theory \(\mathcal {T}\)– have risen to be a pervasive and indispensable tool for dealing with many problems of industrial interest, e.g. formal verification of hardware and software systems, resource planning, temporal reasoning and scheduling of real-time embedded systems.

Optimization Modulo Theories (\(\text {OMT}\)) is an extension of SMT, which allows for finding models that make a given objective optimum through a combination of SMT and optimization procedures [8, 9, 11, 12, 14, 15, 21, 28,29,30,31]. Latest advancements in \(\text {OMT}\) have further broadened its horizon by making it incremental [8, 31] and by supporting objectives defined in theories other than linear arithmetic (e.g. Bit-Vectors) [8, 9, 17]. Moreover, \(\text {OMT}\) has been extended with the capability of handling multiple objectives at the same time either independently (aka boxed optimization) or through their linear, min-max/max-min, lexicographic or Pareto combination [8, 9, 31].

We focus on an important strict sub-case of \(\text {OMT}\), (partial weighted)Footnote 1 MaxSMT –or equivalently \(\text {OMT}\) with Pseudo-Boolean (PB) objective functions [25], \(\text {OMT+PB}\)– which is the problem of finding a model for an input formula which both satisfies all hard clauses and maximizes the cumulative weight of all soft clauses satisfied by the model [11, 12, 21]. We identify two main approaches which have been adopted in the literature (see related work). One specific-purpose approach, which we call MaxSAT-based, is to embed some MaxSAT engine within the SMT solver itself, and use it in combination with dedicated \({\mathcal {T}}{} \textit{-solvers}\) [3, 8, 9] or with SMT solvers used as blackboxes [12]. One general-purpose approach, which we call OMT-based, is to encode MaxSMT/\(\text {OMT+PB}\) into general \(\text {OMT}\) with linear-real-arithmetic cost functions [29].

We compare the two approaches and notice the following facts.

The MaxSAT-based approach can exploit the efficiency of state-of-the-art MaxSAT procedures and solvers. Unfortunately it suffers from some limitations that make it impractical or inapplicable in some cases. First, to the best of our knowledge, available MaxSAT engines deal with integer weights only; some applications, e.g., (Machine) Learning Modulo Theories, LMT [34] –a hybrid Machine Learning approach in which \(\text {OMT}\) is used as an oracle for Support Vector Machines [34]– may require the weights of soft constraints to be high-precision rational values.Footnote 2 (In this context, it is preferable not to round the weights associated with soft clauses, since rounding affects the accuracy of the Machine Learning approach; multiplying all rational coefficients by the least common multiple of their denominators is not practical either, because the resulting values tend to become huge.) Second, a MaxSAT engine cannot be directly used when dealing with an \(\text {OMT}\) problem with multiple independent objectives that need to be optimized at the same time [15],Footnote 3 or when the objective function is given by combinations of PB and arithmetic terms –like, e.g., for Linear Generalized Disjunctive Programming problems [26, 29] or LMT problems [34].

The \(\text {OMT}\)-based approach does not suffer from the above limitations, because it exploits the infinite-precision linear-arithmetic package on the rationals of OMT solvers, and it treats PB functions like any other arithmetic function. Nevertheless, this approach may result in poor performance when dealing with MaxSMT/\(\text {OMT+PB}\) problems.

We analyze the latter fact and identify a major source of inefficiency by noticing that the presence of same-weight soft clauses entails the existence of symmetries in the solution space that may lead to a combinatorial explosion of the partial truth assignments generated by the CDCL engine during the optimization search. To cope with this fact, we introduce and describe a solution based on (bidirectional) sorting networks [1, 4, 32]. We implemented this idea within the OptiMathSAT \(\text {OMT}\) solver [30].

We ran an empirical evaluation on a large set of problems, comparing MaxSAT-based and \(\text {OMT}\)-based techniques, with and without sorting networks, implemented on top of OptiMathSAT [30] and \(\nu Z\) [9]. The results are summarized as follows.

(a) Comparing MaxSAT-based with \(\text {OMT}\)-based approaches on problems where the former are applicable, it turns out that the former provide much better performance, in particular when adopting the maximum-resolution [8, 18] MaxSAT engine.

(b) Evaluating the benefits of bidirectional sorting-network encodings, it turns out that they significantly improve the performance of \(\text {OMT}\)-based approaches, and often also of MaxSAT-based ones.

(c) Comparing \(\nu Z\) and OptiMathSAT, it turns out that the former performed better with MaxSAT-based approaches, whilst the latter performed sometimes equivalently and sometimes significantly better with \(\text {OMT}\)-based ones, in particular when enhanced by the sorting-network encoding.

 

Related Work. The idea of MaxSMT and of optimization in SMT was first introduced by Nieuwenhuis & Oliveras [21], who presented a general logical framework of “SMT with progressively stronger theories” (e.g., where the theory is progressively strengthened by every new approximation of the minimum cost), and presented implementations for MaxSMT based on this framework. Cimatti et al. [11] introduced the notion of “Theory of Costs” \(\mathcal {C}\) to handle Pseudo-Boolean (PB) cost functions and constraints by an ad-hoc “\(\mathcal {C}\)-solver” in the standard lazy SMT schema, and implemented a variant of the MathSAT tool able to handle SMT with PB constraints and to minimize PB cost functions. Cimatti et al. [12] presented a “modular” approach for MaxSMT, combining a lazy SMT solver with a MaxSAT solver, where the SMT solver is used as an oracle generating \(\mathcal {T}\)-lemmas that are then learned by the MaxSAT solver so as to progressively narrow the search space toward the optimal solution.

Sebastiani and Tomasi [28, 29] introduced a wider notion of optimization in SMT, namely Optimization Modulo Theories (OMT) with \(\mathcal {LRA}\) cost functions, \(\text {OMT}(\mathcal {LRA} \cup \mathcal {T})\), which allows for finding models minimizing some \(\mathcal {LRA}\) cost term –\(\mathcal {T}\) being some (possibly empty) stably-infinite theory s.t. \(\mathcal {T}\) and \(\mathcal {LRA}\) are signature-disjoint– and presented novel \(\text {OMT}(\mathcal {LRA} \cup \mathcal {T})\) tools which combine standard SMT with LP minimization techniques. (\(\mathcal {T}\) can also be a combination of theories \(\bigcup _i\mathcal {T} _i\).) Eventually, \(\text {OMT}(\mathcal {LRA} \cup \mathcal {T})\) has been extended to handle costs on the integers, incremental OMT, multi-objective OMT, lexicographic OMT and Pareto-optimality [8, 9, 14, 15, 30, 31]. To the best of our knowledge only four OMT solvers are currently implemented: bclt [14], \(\nu Z\) (aka Z3Opt) [8, 9], OptiMathSAT [30, 31], and Symba [15]. Remarkably, bclt, \(\nu Z\) and OptiMathSAT also implement specialized procedures for MaxSMT, lifting state-of-the-art MaxSAT procedures to the SMT level; in addition, \(\nu Z\) features a Pseudo-Boolean \({\mathcal {T}}{} \textit{-solver}\) which can generate sorting circuits on demand for Pseudo-Boolean inequalities featuring sums with small coefficients, when a Pseudo-Boolean inequality is used several times for unit propagation/conflicts [7, 9].

Content. The paper is structured as follows. Section 2 briefly reviews the background; Sect. 3 describes the source of inefficiency arising when MaxSMT is encoded in \(\text {OMT}\) as in [29]; Sect. 4 illustrates a possible solution based on bidirectional sorting networks; in Sect. 5 we provide empirical evidence of the benefits of this approach on two applications of \(\text {OMT}\) interest. Section 6 presents conclusions and considerations on future work.

2 Background

We assume the reader is familiar with the main theoretical and algorithmic concepts in SAT and SMT solving (see [5, 16]). Optimization Modulo Theories (\(\text {OMT}\)) is an extension of SMT which addresses the problem of finding a model for an input formula \(\varphi \) which is optimal wrt. some objective function \({obj} \) [21, 28]. The basic minimization scheme implemented in state-of-the-art \(\text {OMT}\) solvers, known as linear-search scheme [21, 28], requires solving an SMT problem with a solution space that is progressively tightened by means of unit linear constraints in the form \(\lnot (ub_i \le {obj})\), where \(ub_i\) is the value of \({obj} \) that corresponds to the optimum model of the most-recently found truth assignment \(\mu _i\) s.t. \(\mu _i \models \varphi \). The \(ub_i\) value is computed by means of a specialized optimization procedure embedded within the \({\mathcal {T}}{} \textit{-solver} \) which, given as input a pair \(\langle \mu , {obj} \rangle \), returns the optimal value ub of \({obj} \) for such \(\mu \). The \(\text {OMT}\) search terminates when this procedure finds that \({obj} \) is unbounded or when the SMT search is \(\textsc {unsat} \), in which case the latest value of \({obj} \) (if any) and its associated model \(M_i\) are returned as the optimal solution. (Alternatively, binary-search schemes can also be used [28, 29].)
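The linear-search scheme just described can be sketched as follows; exhaustive enumeration stands in for the SMT solver and the embedded optimization procedure, and all function names are illustrative, not any solver's API:

```python
from itertools import product

def linear_search_minimize(atoms, is_model, obj):
    """Minimize obj over the models of a formula, mimicking linear-search
    OMT: whenever a model with objective value ub_i is found, the
    constraint not(ub_i <= obj) tightens the feasible space.
    is_model(mu) plays the role of the SMT solver + T-solver oracle;
    obj(mu) returns the objective value of assignment mu."""
    ub, best = None, None
    while True:
        found = None
        for bits in product([False, True], repeat=len(atoms)):
            mu = dict(zip(atoms, bits))
            # "mu |= phi and not(ub <= obj)": only strictly better models
            if is_model(mu) and (ub is None or obj(mu) < ub):
                found = mu
                break
        if found is None:                 # search is unsat: ub is optimal
            return ub, best
        ub, best = obj(found), found      # learn not(ub <= obj), restart

# toy problem: obj = A1 + A2 + A3, hard constraint (A1 or A2)
atoms = ["A1", "A2", "A3"]
opt, model = linear_search_minimize(
    atoms,
    is_model=lambda mu: mu["A1"] or mu["A2"],
    obj=lambda mu: sum(mu.values()),
)
print(opt)  # 1: the best model satisfies exactly one atom
```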

An important sub-case of \(\text {OMT}\) is MaxSMT: given a pair \(\langle \varphi _h, \varphi _s\rangle \), where \(\varphi _h\) denotes the set of “hard” \(\mathcal {T}\)-clauses and \(\varphi _s\) is a set of positive-weighted “soft” \(\mathcal {T}\)-clauses, the goal is to find the maximum-weight set of \(\mathcal {T}\)-clauses \(\psi _s\), \(\psi _s\subseteq \varphi _s\), s.t. \(\varphi _h \cup \psi _s\) is \(\mathcal {T}\)-satisfiable [3, 11, 12, 21]. As described in [29], MaxSMT \(\langle \varphi _h,\varphi _s\rangle \) can be encoded into a general \(\text {OMT}\) problem with a Pseudo-Boolean objective: first, introduce a fresh Boolean variable \(A_i\) for each soft constraint \(C_i\in \varphi _s\) as follows

$$\begin{aligned} \varphi ^* \mathop {=}\limits ^{\text {def}} \varphi _h \cup \bigcup _{C_i\in \varphi _s}\{(A_i\vee C_i)\}; \,\, {obj} \mathop {=}\limits ^{\text {def}} \sum _{C_i\in \varphi _s}w_i A_i \end{aligned}$$
(1)

and then encode the problem into \(\text {OMT}\) as a pair \(\langle \varphi , {obj} \rangle \) where \(\varphi \) is defined as

$$\begin{aligned} \varphi&\mathop {=}\limits ^{\text {def}}&\textstyle \varphi ^* \wedge \bigwedge _{i} ((\lnot A_i \vee (x_i=w_i)) \wedge (A_i \vee (x_i=0))) \wedge \end{aligned}$$
(2)
$$\begin{aligned}&\textstyle \bigwedge _{i} ((0 \le x_i) \wedge (x_i \le w_i)) \wedge \end{aligned}$$
(3)
$$\begin{aligned}&\textstyle ({obj} = \sum _i x_i),\ \ \ \ \ x_i,\ {obj}\ fresh. \end{aligned}$$
(4)

Notice that, although redundant from a logical perspective, the constraints in (3) serve the important purpose of allowing early-pruning calls to the \(\mathcal {LRA}\)-\(\mathsf {Solver}\) (see [5]) to detect a possible \(\mathcal {LRA}\) inconsistency among the current partial truth assignment over variables \(A_i\) and linear cuts in the form \(\lnot (ub \le {obj})\) that are pushed on the formula stack by the \(\text {OMT}\) solver during the minimization of \({obj} \). In practice, the presence of such constraints improves performance significantly.
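As a concrete illustration, the encoding (1)–(4) can be generated mechanically. The following sketch emits schematic SMT-LIB-like assertion strings (in real SMT-LIB a rational such as 1/3 would be rendered as (/ 1 3)); the helper name and output format are ours, not any solver's API:

```python
from fractions import Fraction

def maxsmt_to_omt(soft):
    """Encode a MaxSMT soft-clause set as an OMT problem following (1)-(4).
    soft: list of (clause, weight) pairs; weights may be high-precision
    rationals. Returns (extra_assertions, objective_term)."""
    asserts, terms = [], []
    for i, (clause, w) in enumerate(soft, start=1):
        A, x, w = f"A{i}", f"x{i}", Fraction(w)
        asserts.append(f"(assert (or {A} {clause}))")              # (1)
        asserts.append(f"(assert (=> {A} (= {x} {w})))")           # (2)
        asserts.append(f"(assert (=> (not {A}) (= {x} 0)))")       # (2)
        asserts.append(f"(assert (and (<= 0 {x}) (<= {x} {w})))")  # (3)
        terms.append(x)
    obj = f"(+ {' '.join(terms)})"                                 # (4)
    return asserts, obj

asserts, obj = maxsmt_to_omt([("(P)", "1/3"), ("(not Q)", 2)])
print(obj)  # (+ x1 x2)
```

Note that the rational weight 1/3 is kept exact via Fraction, reflecting the high-precision requirement discussed in the introduction.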

3 Problems with OMT-based Approaches

Consider first the case of a MaxSMT-derived \(\text {OMT}\) problem as in (1) s.t. all weights are identical, that is: let \(\langle \varphi , {obj} \rangle \) be an \(\text {OMT}\) problem, where \({obj} = \sum _{i=1}^{n} w \cdot A_i\), where the \(A_i\)s are Boolean variables, and let \(\mu \) be a satisfiable truth assignment found by the \(\text {OMT}\) solver during the minimization of \({obj} \). Given \(A_{T}=\{A_i | \mu \models A_i\}\) and \(k = |A_{T}|\), then the upper bound value of \({obj} \) in \(\mu \) is \(ub = w\cdot k\). As described in [28, 29], the \(\text {OMT}\) solver adds a unit clause in the form \(\lnot (ub \le {obj})\) in order to (1) remove the current truth assignment \(\mu \) from the feasible search space and (2) seek another \(\mu '\) which improves the current upper-bound value ub. Importantly, the unit clause \(\lnot (ub \le {obj})\) does not only prune the current truth assignment \(\mu \) from the feasible search space, but it also makes inconsistent any other (partial) truth assignment \(\mu '\) which sets exactly k (or more) \(A_i\) variables to True. Thus, each new unit clause in this form prunes \(\gamma = {{n}\atopwithdelims (){k}}\) truth assignments from the search space, where \(\gamma \) is the number of possible permutations of \(\mu \) over the variables \(A_i\). A dual case occurs when some lower-bound unit clause \(\lnot ({obj} \le lb)\) is learned (e.g., in a binary-search step, see [28]).
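To get a feel for the size of this symmetric space, the binomial coefficients can be computed directly; the value n = 20 below is purely illustrative:

```python
from math import comb

# Each unit clause not(w*k <= obj) rules out, in one stroke, all C(n, k)
# total assignments that set exactly k of the n same-weight A_i's to
# true; yet, as discussed next, none of them is blocked by BCP alone.
n = 20
print(comb(n, 5))   # 15504
print(comb(n, 10))  # 184756
```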

Unfortunately, the inconsistency of a truth assignment \(\mu '\) which sets exactly k variables to True wrt. a unit clause \(\lnot (ub \le {obj})\), where \(ub = w \cdot k\), cannot be determined by simple Boolean Constraint Propagation (BCP). In fact, \(\lnot (ub \le {obj})\) being a \(\mathcal {LRA}\) term, the CDCL engine is totally oblivious to this inconsistency until the \({\mathcal {T}}{} \textit{-solver}\) for linear rational arithmetic (\(\mathcal {LRA}\)-\(\mathsf {Solver}\)) is invoked, and a conflict clause is generated. Therefore, since the \(\mathcal {LRA}\)-\(\mathsf {Solver}\) is much more resource-demanding than BCP and it is invoked less often, it is clear that the performance of an \(\text {OMT}\) solver can be negatively affected when dealing with this kind of objectives.

Fig. 1. A simple example of \(\text {OMT}\) search.

Example 1

Figure 1 shows a toy example of \(\text {OMT}\) search execution over the pair \(\langle \varphi , {obj} \rangle \), where \(\varphi \) is some SMT formula and \({obj} \mathop {=}\limits ^{\text {def}} \sum _{i=1}^{4} A_i\) (i.e., \(w_i=1\) for every i). We assume the problem has been encoded as in (2)–(4), so that the truth assignment \(\mu _0\mathop {=}\limits ^{\text {def}} \cup _{i=1}^4\{{(0\le x_i),(x_i\le 1)}\} \cup \{{({obj} =\sum _{i=1}^4x_i)}\} \) is immediately generated by BCP, and is part of all truth assignments generated in the search. In the first branch (left) a truth assignment \(\mu \mathop {=}\limits ^{\text {def}} \mu _0\cup \{A_1,(x_1=1),A_2,(x_2=1),\lnot A_3,(x_3=0),\lnot A_4,(x_4=0)\}\) is found s.t. \({obj} = 2\), resulting from the decisions \(A_1\), \(A_2\), \(\lnot A_3\) and \(\lnot A_4\). Then the unit clause \(\lnot (2\le {obj})\) is learned and the Boolean search is restarted in order to find an improved solution. In the second branch (center) \(A_1\) and \(A_2\) are decided, forcing by BCP the assignment \(\mu '\mathop {=}\limits ^{\text {def}} \mu _0\cup \{{\lnot (2\le {obj}),A_1,(x_1=1),A_2,(x_2=1)}\} \) which is \(\mathcal {LRA}\)-inconsistent. However, it takes a (possibly-expensive) intermediate call to the \(\mathcal {LRA}\)-\(\mathsf {Solver}\) to reveal such an inconsistency.Footnote 4 When this happens, a new conflict clause \(\lnot A_1 \vee \lnot A_2\) is learned, forcing the solver to back-jump and toggle the value of \(A_2\) (right). The search continues with the new decision \(A_3\), which is again \(\mathcal {LRA}\) inconsistent, causing a new conflict clause as before, and so on. In this way, the solver might uselessly enumerate and check all the up-to \({{4}\atopwithdelims (){2}}\) assignments that assign two \(A_i\)’s to true and are consistent with \(\varphi \), even though they are intrinsically incompatible with \(\lnot (2\le {obj})\).    \(\diamond \)

The performance issue identified with the previous case example can be generalized to any objective \({obj} \) in which groups of \(A_{i}\)’s share the same weights:

$$\begin{aligned}&{obj} = \tau _1 + \ldots + \tau _m, \end{aligned}$$
(5)
$$\begin{aligned}&\textstyle \bigwedge _{j=1}^m\ ((\tau _j = w_j \cdot \sum _{i=1}^{k_j}A_{ji}) \,\, \wedge \,\,(0 \le \tau _j) \wedge (\tau _j \le w_j \cdot k_j)), \end{aligned}$$
(6)

where the logically-redundant constraints \((0 \le \tau _j) \wedge (\tau _j \le w_j \cdot k_j)\) are added for the same reason as with (3).

4 Combining OMT with Sorting Networks

Notationally, the symbols \(\top ,\bot ,*\) denote respectively “true”, “false” and “unassigned”. We represent truth assignments as sets (or conjunctions) of literals s.t. a positive [resp. negative] literal denotes the fact that the corresponding atom is assigned to \(\top \) [resp. \(\bot \)]. Given a Boolean formula \(\varphi \) and two truth assignments \(\mu ,\eta \) on the atoms in \(\varphi \), “\(\langle {\varphi ,\mu }\rangle \vdash _{\mathsf{bcp}}\eta \)” denotes the fact that all literals in \(\eta \) are inferred by BCP on \(\varphi \) if all literals in \(\mu \) are asserted. (Notice that “\(\langle {\varphi ,\mu }\rangle \vdash _{\mathsf{bcp}}\eta \)” is stronger than “\(\varphi \wedge \mu \models \eta \)”.)

Fig. 2. The basic schema of a bidirectional sorting network.

When dealing with MaxSMT and \(\text {OMT}\) with PB objectives in the form

$$\begin{aligned} \textstyle {obj} = w \cdot \sum _{i=1}^{n} A_{i} \end{aligned}$$
(7)

a solution for improving search efficiency is to reduce the dependency on the expensive \(\mathcal {LRA}\)-\(\mathsf {Solver}\) by better exploiting BCP with the aid of Boolean bidirectional sorting networks.

Definition 1

Let \(\mathsf{SN}[\underline{A},\underline{B}]\) be a CNF Boolean formula on n input Boolean variables \(\underline{A}{}\mathop {=}\limits ^{\text {def}} \{{A_{1}, \ldots , A_{n}}\} \) and n output Boolean variables \(\underline{B}{}\mathop {=}\limits ^{\text {def}} \{{B_{1}, \ldots , B_{n}}\} \), possibly involving also auxiliary Boolean variables which are not mentioned.

We say that \(\mathsf{SN}[\underline{A},\underline{B}]\) is a bidirectional sorting network if and only if, for every m and k s.t. \(n\ge m \ge k \ge 0\) and for every partial truth assignment \(\mu \) s.t. \(\mu \) assigns exactly k input variables \(A_{i}\) to \(\top \) and \(n-m\) variables \(A_{i}\) to \(\bot \):

$$\begin{aligned}&\langle {\mathsf{SN}[\underline{A},\underline{B}],\mu }\rangle \vdash _{\mathsf{bcp}}\{{B_{1}, \ldots , B_{k}}\},\end{aligned}$$
(8)
$$\begin{aligned}&\langle {\mathsf{SN}[\underline{A},\underline{B}],\mu }\rangle \vdash _{\mathsf{bcp}}\{{\lnot B_{m+1},\ldots ,\lnot B_{n}}\}. \end{aligned}$$
(9)
$$\begin{aligned}&\langle {\mathsf{SN}[\underline{A},\underline{B}],\mu \cup \{{\lnot B_{k+1}}\}}\rangle \vdash _{\mathsf{bcp}}\{{\lnot A_{i}\ s.t.\ A_{i}\ unassigned\ in\ \mu }\},\end{aligned}$$
(10)
$$\begin{aligned}&\langle {\mathsf{SN}[\underline{A},\underline{B}],\mu \cup \{{B_{m}}\}}\rangle \vdash _{\mathsf{bcp}}\{{A_{i}\ s.t.\ A_{i}\ unassigned\ in\ \mu }\}. \end{aligned}$$
(11)

The schema of a bidirectional sorting network is depicted in Fig. 2.

(8)–(9) state that the output values \(\underline{B}\) of \(\mathsf{SN}[\underline{A},\underline{B}]\) are propagated from the inputs \(\underline{A}\) via BCP. (10)–(11) describe how assigning output variables \(\underline{B}\) propagates back to input variables \(\underline{A}\): (10) states that, when k \(A_i\)’s are true and \(B_{k+1}\) is false, then all other \(A_i\)’s are forced to be false by BCP; dually, (11) states that, when \(n-m\) \(A_i\)’s are false and \(B_{m}\) is true, then all other \(A_i\)’s are forced to be true by BCP. (If any of the above BCP assignments conflicts with some previous assignment, a conflict is produced.)

Given an \(\text {OMT}\) problem \(\langle {\varphi ,{obj}}\rangle \), where \({obj} \) is as in (7), and a Boolean formula \(\mathsf{SN}[\underline{A},\underline{B}]\) encoding a bidirectional sorting network relation as in Definition 1, we extend \(\varphi \) in (2)–(4) as follows:

$$\begin{aligned} \varphi ' = \varphi \wedge \mathsf{SN}[\underline{A},\underline{B}]\wedge \bigwedge _{i=1}^{n} {\left\{ \begin{array}{ll} (\lnot B_{i} \vee (i\cdot w \le {obj}))\ \wedge \\ (B_{i} \vee ({obj} \le (i - 1) \cdot w))\ \wedge \\ (\lnot (i\cdot w \le {obj}) \vee \lnot ({obj} \le (i - 1) \cdot w)) \end{array}\right. } \end{aligned}$$
(12)

and optimize \({obj} \) over \(\varphi '\). Notice here that the third line in (12) is \(\mathcal {LRA}\)-valid, but it allows for inferring the negation of \(({obj} \le (i - 1) \cdot w)\) from \((i\cdot w \le {obj})\) (and vice versa) directly by BCP, without any call to the \(\mathcal {LRA}\)-\(\mathsf {Solver}\).
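The clauses added by (12) can be generated mechanically; the sketch below emits them as schematic SMT-LIB-like strings (an integer weight is assumed for readability, and the names B_i and obj mirror the text rather than any solver's API):

```python
def obj_link_constraints(n, w):
    """Emit the three clause schemas of (12), for i = 1..n, tying each
    sorted output B_i to lower/upper bounds on obj."""
    cs = []
    for i in range(1, n + 1):
        lo, hi = i * w, (i - 1) * w
        cs.append(f"(assert (or (not B{i}) (<= {lo} obj)))")
        cs.append(f"(assert (or B{i} (<= obj {hi})))")
        # LRA-valid, but lets BCP link the two bound atoms directly:
        cs.append(f"(assert (or (not (<= {lo} obj)) (not (<= obj {hi}))))")
    return cs

print(len(obj_link_constraints(4, 1)))  # 12
```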

Consider (8)–(9) and assume that \(\mu \) assigns k \(A_i\)s to \(\top \) and \(n-m\) to \(\bot \) as in Definition 1. Then (8) with (12) forces the unit-propagation of \(B_{1},\ldots ,B_{k}\), and then, among others, of \((k\cdot w \le {obj})\), while (9) with (12) forces the unit-propagation of \(\lnot B_{m+1},\ldots ,\lnot B_{n}\), and then, among others, of \(({obj} \le m\cdot w)\). This automatically restricts the range of \({obj}\) to \([k\cdot w,m\cdot w]\), obtaining the same effect as (2)–(4).

The benefits of the usage of \(\mathsf{SN}[\underline{A},\underline{B}]\) are due to both (10) and (11). When the optimization search finds a new minimum \(k\cdot w\) and a unit clause in the form \(\lnot (k\cdot w \le {obj})\) is learned (see e.g. [28]) and \(\lnot B_{k}\) is unit-propagated on (12), then as soon as \(k-1\) \(A_{i}\)s are set to True, all the remaining \(n-k+1\) \(A_{i}\)s are set to False by BCP (10). A dual case occurs when some lower-bound unit clause \(\lnot ({obj} \le k\cdot w)\) is learned (e.g., in a binary-search step [28]) and \(B_{k+1}\) is unit-propagated on (12): as soon as \(n-k-1\) \(A_{i}\)s are set to False, then all the remaining \(k+1\) \(A_{i}\)s are set to True by BCP (11).

Fig. 3. An example of \(\text {OMT}\) search with sorting networks.

Example 2

Figure 3 considers the same scenario as in Example 1, in which we extend the encoding with a bidirectional sorting-network relation as in (12). The behaviour is identical to that of Example 1 until the assignment \(\mu \) is generated, the unit clause \(\lnot (2 \le {obj})\) is learned, and the procedure backtracks for the first time (Fig. 3 left). This causes the unit-propagation of \(\lnot B_{2}\) on (12). As soon as \(A_{1}\) is picked as new decision, \(\lnot A_{2}, \lnot A_{3}, \lnot A_{4}\) are unit propagated (10), saving up to \({{4}\atopwithdelims (){2}}\) (expensive) calls to the \(\mathcal {LRA}\)-\(\mathsf {Solver}\) (Fig. 3 center). Then \(\lnot (1 \le {obj})\) is learned, and the search proceeds (Fig. 3 right).    \(\diamond \)

We generalize this approach to deal with general objectives as in (5)–(6). In this case a separate sorting circuit is generated for each term \(\tau _j\), and constraints of the form

$$\begin{aligned} \textstyle \bigwedge _{j=1}^m \bigwedge _{i=1}^{k_j} (\ \lnot (w_j \cdot i \le {obj}) \rightarrow \lnot (w_j \cdot i \le \tau _j)), \end{aligned}$$
(13)

are added to ensure that the circuit is activated by BCP.
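Generating the activation constraints (13) is straightforward; the sketch below emits them as schematic SMT-LIB-like strings (the names tau_j and obj follow the text and are illustrative):

```python
def activation_constraints(groups):
    """groups: list of (w_j, k_j) pairs as in (5)-(6). Emit the
    constraints of (13) linking upper bounds on obj to upper bounds
    on each per-group term tau_j."""
    out = []
    for j, (w, k) in enumerate(groups, start=1):
        for i in range(1, k + 1):
            out.append(
                f"(assert (=> (not (<= {w * i} obj)) "
                f"(not (<= {w * i} tau{j}))))"
            )
    return out

cs = activation_constraints([(2, 2), (5, 1)])
print(len(cs))  # 2 + 1 = 3 constraints
```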

4.1 Bidirectional Sorting Networks

Unlike the usage of sorting networks in other contexts, which consider only (8) and (10) as relevant properties (e.g. [32]), we are interested in sorting networks which propagate both \(\top \) and \(\bot \) values in both directions (i.e., which comply with all properties (8)–(11)). To this end, we considered two encodings: the sequential counter encoding in [32], which we have extended to comply with all properties (8)–(11), and the cardinality network encoding in [1, 4].

Bidirectional Sequential Counter Encoding. The sequential counter encoding \(LT_{SEQ}^{n,k}\) for \(\le k (A_{1}, \ldots , A_{n})\) presented in [32] consists of \(O(k\cdot n)\) clauses and variables and complies with (8) and (10). The circuit is given by the composition of n sub-circuits, each of which computes \(S_i = \sum _{j=1}^{i}A_{j}\), represented in unary form with the bits \(S_{i, j}\), i.e., \(S_{i,j}=\top \) if \( \sum _{r=1}^{i}A_{r}\ge j\), so that \(B_{j}\mathop {=}\limits ^{\text {def}} S_{n,j}\), \(j\in [1 \ldots n]\). The (CNF version of the)Footnote 5 following formula is the encoding of \(LT_{SEQ}^{n,k}\) presented in [32], with \(k\mathop {=}\limits ^{\text {def}} n\):

$$\begin{aligned} \textstyle (A_{1}\rightarrow S_{1,1}) \wedge \bigwedge _{i=2}^{n} \{ ((A_{i}\vee S_{i-1,1})\rightarrow S_{i,1}) \} \wedge \end{aligned}$$
(14)
$$\begin{aligned} \textstyle \bigwedge _{i=2}^{n} \{ (\lnot A_{i}\vee \lnot S_{i-1,n}) \} \wedge \textstyle \bigwedge _{j=2}^{n} \{ (\lnot S_{1,j}) \} \wedge \end{aligned}$$
(15)
$$\begin{aligned} \textstyle \bigwedge _{i,j=2}^{n} \{ (((A_{i}\wedge S_{i-1,j-1})\vee S_{i-1,j})\rightarrow S_{i,j}) \} \end{aligned}$$
(16)

Notice that, in order to reduce the size of the encoding, in (14)–(16) only right implications “\(\rightarrow \)” were used to encode each gate in the Boolean sorting circuit [32], so that (14)–(16) complies with (8) and (10) but not with (9) and (11). To cope with this fact, we have added the following part, which reintroduces the left implications “\(\leftarrow \)” of the encoding of each gate in (14) and (16), making it compliant also with (9) and (11):

$$\begin{aligned} \textstyle (A_{1}\leftarrow S_{1,1}) \wedge \bigwedge _{i=2}^{n} \{ ((A_{i}\vee S_{i-1,1})\leftarrow S_{i,1}) \} \wedge \end{aligned}$$
(17)
$$\begin{aligned} \textstyle \bigwedge _{i,j=2}^{n} (((A_{i}\wedge S_{i-1,j-1})\vee S_{i-1,j}) \leftarrow S_{i,j}). \end{aligned}$$
(18)
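To make the bidirectional behaviour concrete, the following sketch (ours, not the paper's implementation) generates the CNF clauses of (14)–(18) with \(k = n\) and runs a naive unit propagator over them, checking properties (8)–(10) on a 3-input network; literals are strings and all names are illustrative:

```python
def seq_counter_cnf(n):
    """Clauses (lists of string literals: 'A1', '-S2,1', ...) of the
    bidirectional sequential counter (14)-(18) with k = n.
    The output variable B_j is S_{n,j}."""
    A = lambda i: f"A{i}"
    S = lambda i, j: f"S{i},{j}"
    neg = lambda v: "-" + v
    # (14) + (17): S_{i,1} <-> (A_i or S_{i-1,1})
    cls = [[neg(A(1)), S(1, 1)], [neg(S(1, 1)), A(1)]]
    for i in range(2, n + 1):
        cls += [[neg(A(i)), S(i, 1)], [neg(S(i - 1, 1)), S(i, 1)],
                [neg(S(i, 1)), A(i), S(i - 1, 1)]]
    # (15)
    cls += [[neg(A(i)), neg(S(i - 1, n))] for i in range(2, n + 1)]
    cls += [[neg(S(1, j))] for j in range(2, n + 1)]
    # (16) + (18): S_{i,j} <-> ((A_i and S_{i-1,j-1}) or S_{i-1,j})
    for i in range(2, n + 1):
        for j in range(2, n + 1):
            cls += [[neg(A(i)), neg(S(i - 1, j - 1)), S(i, j)],
                    [neg(S(i - 1, j)), S(i, j)],
                    [neg(S(i, j)), A(i), S(i - 1, j)],
                    [neg(S(i, j)), S(i - 1, j - 1), S(i - 1, j)]]
    return [frozenset(c) for c in cls]

def bcp(clauses, lits):
    """Naive unit propagation: returns the closure of the asserted literals."""
    comp = lambda l: l[1:] if l.startswith("-") else "-" + l
    val, changed = set(lits), True
    while changed:
        changed = False
        for c in clauses:
            if any(l in val for l in c):
                continue                       # clause already satisfied
            open_lits = [l for l in c if comp(l) not in val]
            if len(open_lits) == 1:            # unit: propagate
                val.add(open_lits[0])
                changed = True
    return val

cnf = seq_counter_cnf(3)
# (8)/(9): two inputs true, one false => B1, B2 propagate true, B3 false
out = bcp(cnf, {"A1", "A2", "-A3"})
print("S3,1" in out, "S3,2" in out, "-S3,3" in out)  # True True True
# (10): one input true and B2 false => remaining inputs propagate false
out = bcp(cnf, {"A1", "-S3,2"})
print("-A2" in out, "-A3" in out)  # True True
```

The second query mirrors Example 2: once \(\lnot B_2\) is asserted, deciding \(A_1\) is enough for BCP to rule out all remaining inputs without any \(\mathcal {LRA}\)-\(\mathsf {Solver}\) call.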

Bidirectional Cardinality Network Encoding. The cardinality network encoding presented in [1, 4, 13], based on the underlying sorting scheme of the well-known merge-sort algorithm, has complexity \(O(n \log ^2 k)\) in the number of clauses and variables. Due to space limitations, we refer the reader to [1, 4] for the encoding of cardinality networks we used in our work. Notice that, unlike the previous case, this sorting network propagates both \(\top \) and \(\bot \) values in both directions (i.e., it complies with all properties (8)–(11)) [1, 4], and it is thus suitable for use within \(\text {OMT}\) without modifications.

Both of the previous encodings are instantiated assuming \(k = n\), since the sorting network is generated prior to starting the search. Therefore, the cardinality network circuit looks more appealing than the sequential counter encoding due to its lower complexity in terms of clauses and variables employed.

5 Experimental Evaluation

We extended OptiMathSAT with a novel internal preprocessing step, which automatically augments the input formula with a sorting network circuit of choice between the bidirectional sequential counter and the cardinality network, as described in Sect. 4. To complete our comparison, we also implemented in OptiMathSAT two MaxSAT-based approaches, the max-resolution approach implemented in [8, 18] and (for MaxSMT only) the lemma-lifting approach of [12], using Maxino [2] as external MaxSAT solver.

Here we present an extensive empirical evaluation of various MaxSAT-based and \(\text {OMT}\)-based techniques in OptiMathSAT [23, 30] and \(\nu Z\) [9, 22]. Overall, we considered >20,000 OMT problems and ran >270,000 job pairs. The problems were produced either by CGM-Tool [10] from optimization of Constrained Goal Models [19, 20] (a modeling and automated-reasoning tool for requirement engineering) or by PyLMT [24] from (Machine) Learning Modulo Theories [34]. We partition these problems into two distinct categories. In Sect. 5.1 we analyze problems which are solvable by MaxSAT-based approaches, like those with PB objective functions or their lexicographic combination, so as to allow both \(\nu Z\) and OptiMathSAT to use their MaxSAT-specific max-resolution engines (plus others). In Sect. 5.2 we analyze problems which cannot be solved by MaxSAT-based approaches, because the objective functions involve some non-PB components, forcing us to restrict to OMT-based approaches only.

The goal of this empirical evaluation is threefold:

(i) compare the performance of MaxSAT-based approaches wrt. \(\text {OMT}\)-based ones, on the kind of OMT problems where the former are applicable;

(ii) evaluate the benefits of sorting-network encodings with \(\text {OMT}\)-based approaches (and also with MaxSAT-based ones);

(iii) compare the performances of OptiMathSAT with those of \(\nu Z\).

For goals (i) and (ii) we used the following configurations of OptiMathSAT.

  • \(\text {OMT}\)-based: standard, and enriched with either the bidirectional sequential-counter or the cardinality sorting network;

  • MaxSAT-based: the above-mentioned max-resolution implementation, with and without the cardinality sorting network, and lemma-lifting (for pure MaxSMT only).

For goal (iii) we also used the following configurations of \(\nu Z\).Footnote 6

  • \(\text {OMT}\)-based: standard (encoded as in (2)–(4)).

  • MaxSAT-based: using alternatively the internal implementations of the max-resolution [8, 18] and Wmax [21] procedures.

Each job pair was run on one of two identical Ubuntu Linux machines featuring an 8-core Intel-Xeon@2.20 GHz CPU, 64 GB of RAM and kernel 3.8-0-29. Importantly, we verified that all tools/configurations under test agreed on the results on all problems when terminating within the timeout. (The timeout varies with the benchmark sets, see Sects. 5.1 and 5.2.) All benchmarks, as well as our experimental results and all the tools which are necessary to reproduce the results, are available [33].

5.1 Problems Suitable for MaxSAT-Based Approaches

Test Set #1: CGMs with Lexicographic PB Optimization. In our first experiment we consider the set of all problems produced by CGM-Tool [10] in the experimental evaluation in [19]. They consist of 18996 automatically-generated formulas which encode the problem of computing the lexicographically-optimum realization of a constrained goal model [19], according to a prioritized list of (up to) three objectives \(\langle {{obj} _1,{obj} _2,{obj} _3}\rangle \). A solution optimizes lexicographically \(\langle {obj_1,\ldots ,obj_k}\rangle \) if it optimizes \(obj_1\) and, if more than one \(obj_1\)-optimal solution exists, it also optimizes \(obj_2\),..., and so on; both \(\text {OMT}\)-based and MaxSAT-based techniques handle lexicographic optimization by optimizing \({obj} _1,{obj} _2,\ldots \) in order, fixing the value of each \(obj_i\) to its optimum as soon as it is found [8, 9, 30, 31]. In this experiment, we set the timeout at 100 s. The results are reported in Fig. 4 (top and middle).
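The lexicographic scheme just described (optimize \(obj_1\), fix its optimum, then optimize \(obj_2\) among the remaining solutions, and so on) can be sketched as follows; brute force over a tiny finite space stands in for the solver, and the function and its arguments are illustrative, not CGM-Tool or any solver's API:

```python
from itertools import product

def lex_optimize(domains, feasible, objectives):
    """Lexicographically minimize a list of objectives: minimize obj_1
    over the feasible points, restrict to its optima, then minimize
    obj_2 over those, and so on (mirroring OMT's cost-minimization
    convention)."""
    points = [p for p in product(*domains) if feasible(p)]
    for obj in objectives:
        best = min(obj(p) for p in points)
        points = [p for p in points if obj(p) == best]
    return points[0]

# toy model: three Boolean choices, at least one must be taken;
# first minimize the number of choices taken, then avoid taking x1
p = lex_optimize(
    [(0, 1)] * 3,
    feasible=lambda p: sum(p) >= 1,
    objectives=[lambda p: sum(p), lambda p: p[0]],
)
print(p)  # (0, 0, 1)
```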

Fig. 4. [Top, table] Results of various solvers, configurations and encodings on all the problems encoding CGM optimization with lexicographic PB optimization of [19, 20]. (Values in boldface denote the best performance of each category.) [Middle, scatterplots] Pairwise comparison of OptiMathSAT (OMT-based) with/without the sequential-counter encoding (left) and with/without the cardinality-network encoding (right). [Bottom, tables] Effect of splitting the PB sums into chunks of maximum variable number (no split, 10, 15, 20 variables) with the sequential-counter encoding (left) and the cardinality-network encoding (right).

As far as OptiMathSAT (\(\text {OMT}\)-based) is concerned, extending the input formula with either of the sorting networks increases the number of benchmarks solved within the timeout. Notably, the cardinality network encoding –which has the lowest complexity– scores best both in terms of the number of solved benchmarks and of solving time. On the other hand, the sequential counter network suffers a significant performance hit on a number of benchmarks, as witnessed by the left scatter plot in Fig. 4. This affects not only unsatisfiable benchmarks, for which using sorting networks appears not to be beneficial in general, but also satisfiable ones.
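The Boolean-level semantics of the sequential-counter circuit can be sketched as follows. This is a plain functional simulation under our own naming, for intuition only: the actual encoding introduces fresh CNF variables and clauses for each counter cell rather than evaluating the circuit directly.

```python
def sequential_counter(xs):
    """Unary counter over Boolean inputs: out[j] is True iff at least
    j+1 of the inputs are True, via the standard recurrence
    s_i[j] = s_{i-1}[j] OR (x_i AND s_{i-1}[j-1])."""
    n = len(xs)
    out = [False] * n
    for x in xs:
        new = out[:]
        new[0] = out[0] or x                     # at least 1 input seen so far
        for j in range(1, n):
            new[j] = out[j] or (x and out[j - 1])  # count increases by x
        out = new
    return out
```

For instance, two true inputs out of three yield the sorted (unary) output `[True, True, False]`: the outputs directly encode "at least 1" and "at least 2" of the inputs being true.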

A possible strategy for overcoming this performance issue is to reduce the memory footprint caused by the generation of the sorting-network circuit. This can be easily achieved by splitting each Pseudo-Boolean sum into smaller chunks and generating a separate sorting circuit for each chunk. The result of applying this enhancement, using chunks of increasing size, is shown in Fig. 4 (bottom). The data suggest that the sequential counter encoding can benefit from this simple heuristic, but it does not reach the performance of the cardinality network, which is not affected by this strategy. (This strategy is not considered in the following experiments.)
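The splitting heuristic can be sketched as follows (the names are ours; in the real encoding each chunk would be fed to its own, smaller sorting circuit, and the objective value is recovered as the sum of the per-chunk partial sums):

```python
def split_pb_sum(terms, max_vars):
    """Partition a PB sum, given as (weight, variable) pairs, into
    chunks of at most max_vars variables each."""
    return [terms[i:i + max_vars] for i in range(0, len(terms), max_vars)]

def eval_split_sum(chunks, assignment):
    """Evaluate the objective as the sum of the per-chunk partial sums
    (direct evaluation here, standing in for the per-chunk circuits)."""
    return sum(w for chunk in chunks
                 for (w, v) in chunk if assignment[v])

terms = [(2, "x1"), (3, "x2"), (1, "x3"), (5, "x4"), (4, "x5")]
chunks = split_pb_sum(terms, 2)          # 3 chunks of sizes 2, 2, 1
value = eval_split_sum(chunks, {"x1": True, "x2": False, "x3": True,
                                "x4": True, "x5": False})
# value is 2 + 1 + 5 = 8, the same as evaluating the unsplit sum.
```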

As far as OptiMathSAT (MaxSAT-based) is concerned, we notice that it significantly outperforms all \(\text {OMT}\)-based techniques. Remarkably, extending the input formula with the sorting networks improves the performance of this configuration as well.

As far as (MaxSAT-based) is concerned, we notice that, when using the max-resolution algorithm, it outperforms all other techniques, solving all problems.

Test Set #2: CGMs with Weight-1 PB Optimization. Our second experiment is a variant of the previous one, in which we consider only single-objective optimizations and we fix all weights to 1, so that each problem is encoded as a plain un-weighted MaxSMT problem. We set the timeout to 100 s. The results are reported in Fig. 5.

Fig. 5.
figure 5

Results of various solvers, configurations and encodings on CGM-encoding problems of [19, 20] with single-objective weight-1.

As far as OptiMathSAT (\(\text {OMT}\)-based) is concerned, extending the input formula with either of the sorting networks increases the number of benchmarks solved within the timeout. Surprisingly, this time the sequential counter network performs significantly better than the cardinality network, despite its bigger size. (We do not have a clear explanation of this fact.)

As far as OptiMathSAT (MaxSAT-based) is concerned, we notice that it significantly outperforms all \(\text {OMT}\)-based techniques, solving all problems. Extending the input formula with the cardinality networks slightly worsens performance. Also the lemma-lifting technique outperforms all \(\text {OMT}\)-based techniques, solving only two problems fewer than the previous MaxSAT-based technique.

As far as (MaxSAT-based) is concerned, we notice that with the max-resolution MaxSAT algorithm it is the best scorer, although the differences w.r.t. OptiMathSAT (MaxSAT-based) are negligible, whilst with the wmax engine performance decreases drastically.

5.2 Problems Unsuitable for MaxSAT-Based Approaches

Here we present a couple of test sets which cannot be supported by any MaxSAT-based technique in OptiMathSAT or and, to the best of our knowledge, no encoding of these problems into MaxSMT has ever been conceived. Thus their solution is restricted to \(\text {OMT}\)-based techniques. (To this extent, with OptiMathSAT we have used the linear-search strategy rather than the default adaptive linear/binary one, to better compare with the linear strategy adopted by .)

Fig. 6.
figure 6

[Table:] results of various solvers with OMT-based configurations on CGM-encoding problems of [19, 20] with max-min objective functions. [Left scatterplot:] OptiMathSAT + card. network vs. plain OptiMathSAT. [Right scatterplot:] vs. OptiMathSAT + card. network.

Test Set #3: CGMs with Max-Min PB Optimization. In our third experiment we consider another variant of the problems in Test Set #1, in which the three PB/MaxSMT objectives \(\langle {{obj} _1,{obj} _2,{obj} _3}\rangle \) are subject to a max-min combination: each objective \({obj} _j\) is normalized so that its range equals [0, 1] (i.e., it is divided by \(\sum _i w_{ji}\)), then \(\bigwedge _{j=1}^{3} ({obj} _j \le {obj})\) s.t. \({obj}\) is a fresh \(\mathcal {LRA}\) variable is added to the main formula, and the solver is asked to find a solution making \({obj}\) minimum (see [30]). Notice that max-min optimization guarantees a sort of “fairness” among the objectives \({obj} _1,\ldots ,{obj} _3\). Since the problem is more complex than the previous ones and the most-efficient MaxSAT-based techniques are not applicable, we increased the timeout to 300 s. The results are shown in Fig. 6. (Unlike with Fig. 4, since the difference in performance between OptiMathSAT with the two sorting networks is minor, here and in Fig. 7 we have dropped the scatterplot with the sequential-counter encoding and we introduced one comparing with instead.)
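The max-min combination just described can be sketched over a finite candidate set as follows (an illustrative stand-in, with our own names: minimization over the candidates plays the role of the OMT search, and the per-candidate maximum reflects the constraints \({obj}_j \le {obj}\)):

```python
def maxmin_combined_optimum(solutions, weighted_objectives):
    """Each objective, given as (weight, predicate) pairs, is normalized
    to [0, 1] by dividing by its total weight; 'obj' must bound all
    normalized objectives from above, and we minimize 'obj'."""
    def normalized(s, terms):
        total = sum(w for (w, _) in terms)
        return sum(w for (w, pred) in terms if pred(s)) / total

    # For each candidate, the smallest feasible 'obj' is the maximum of
    # its normalized objectives; take the minimum across candidates.
    return min(max(normalized(s, t) for t in weighted_objectives)
               for s in solutions)

# Two toy candidates and two objectives over them.
obj1 = [(1, lambda s: s[0])]
obj2 = [(1, lambda s: s[0]), (1, lambda s: s[1])]
best = maxmin_combined_optimum([(True, False), (False, True)], [obj1, obj2])
# best is 0.5: the second candidate bounds both normalized objectives by 0.5.
```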

Looking at the table and at the scatterplot on the left, we notice that enhancing the \(\text {OMT}\)-based technique of OptiMathSAT with the cardinality networks significantly improves performance. Also, looking at the table and at the scatterplot on the right, we notice that the \(\text {OMT}\)-based technique of OptiMathSAT, with the help of sorting networks, performs equivalently to or slightly better than that of .

Fig. 7.
figure 7

[Table:] results of various solvers with \(\text {OMT}\)-based configurations on LMT-encoding problems of [34] with complex objective functions. [Left scatterplot:] OptiMathSAT + card. network vs. plain OptiMathSAT. [Right scatterplot:] vs. OptiMathSAT + card. network.

Test Set #4: LMT with Mixed Complex Objective Functions. In our fourth experiment we consider a set of 500 problems taken from PyLMT [24], a tool for Structured Learning Modulo Theories [34] which uses OptiMathSAT as back-end oracle for performing inference in the context of machine learning in hybrid domains. The objective functions \({obj} \) are complex combinations of PB functions in the form:

$$\begin{aligned} {obj}&\mathop {=}\limits ^{\text {def}}&\sum _j v_j \cdot B_j + cover - \sum _k z_k \cdot C_k - | K - cover |, \end{aligned}$$
(19)
$$\begin{aligned} s.t.\ cover&\mathop {=}\limits ^{\text {def}}&\textstyle \sum _i w_i \cdot A_i, \end{aligned}$$
(20)

\(A_i,B_j,C_k\) being Boolean atoms, \(w_i,v_j,z_k,K\) being rational constants. We imposed a timeout of 600 s. The results are presented in Fig. 7.
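As a sanity check on the shape of this objective, it can be evaluated directly as follows (a literal transcription for intuition, with our own parameter names; the actual problems encode it symbolically for the OMT solver):

```python
def lmt_objective(A, B, C, w, v, z, K):
    """Evaluate obj = sum_j v_j*B_j + cover - sum_k z_k*C_k - |K - cover|,
    where cover = sum_i w_i*A_i and A_i, B_j, C_k are Boolean."""
    cover = sum(wi for wi, ai in zip(w, A) if ai)      # Eq. (20)
    return (sum(vj for vj, bj in zip(v, B) if bj)      # Eq. (19)
            + cover
            - sum(zk for zk, ck in zip(z, C) if ck)
            - abs(K - cover))

# Tiny example: cover = 2, so obj = 4 + 2 - 1 - |3 - 2| = 4.
obj = lmt_objective(A=[True, False], B=[True], C=[True],
                    w=[2, 5], v=[4], z=[1], K=3)
```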

Looking at the table and at the scatterplot on the left, we notice that enhancing the \(\text {OMT}\)-based technique of OptiMathSAT with the cardinality networks improves performance, although this time the improvement is not dramatic. (We believe this is due to the fact that the values of the weights \(w_i,v_j,z_k,K\) are very heterogeneous: few weights share the same value.) Also, looking at the table and at the scatterplot on the right, we notice that the \(\text {OMT}\)-based technique of OptiMathSAT performs significantly better than that of , even without the help of sorting networks.

Discussion. We summarize the results as follows.

  (a)

    When applicable, MaxSAT-based approaches performed much better than \(\text {OMT}\)-based ones, in particular when adopting Maximum-Resolution as MaxSAT engine.

  (b)

    Bidirectional sorting-network encodings significantly improved the performance of \(\text {OMT}\)-based approaches, and often also of MaxSAT-based ones.

  (c)

    performed better than OptiMathSAT on MaxSAT-based approaches, whilst the latter sometimes performed similarly and sometimes significantly better on \(\text {OMT}\)-based ones, in particular when enhanced by the sorting-network encodings.

6 Conclusion and Future Work

MaxSMT and OMT with Pseudo-Boolean objective functions are important sub-cases of OMT, for which specialized techniques have been developed over the years, in particular exploiting state-of-the-art MaxSAT procedures. When applicable, these specialized procedures seem to be more efficient than general-purpose OMT. When they are not applicable, OMT-based techniques can strongly benefit from the integration with bidirectional sorting networks to deal with the PB components of the objectives.

OMT is a young technology, with large margins for improvement. Among others, one interesting research direction is that of integrating MaxSAT-based techniques with standard \(\text {OMT}\)-based ones for efficiently handling complex objectives and constraints, so as to combine the efficiency of the former with the expressivity of the latter.