1 Introduction

The optimal operation of water distribution networks (WDNs) requires the satisfaction of multiple criteria, some of which may be conflicting, in order to deliver increasing water demand cost-efficiently (Newman et al. 2014). Some objectives include the reduction of leakage, improvements in network resilience, optimization of water quality, and an efficient use of energy in pumping. The optimal management of pressure is one of the most effective methods to reduce leakage; a small decrease in average zone pressure has been shown to result in significant reductions in leakage (Lambert 2001). The sectorization of water distribution systems has also provided some benefit in this respect. Networks are subdivided into smaller sectors, called District Metered Areas (DMAs), so that flow into and out of each DMA is continuously monitored, improving the management of pressure and leakage (Farley and Trow 2003). On the other hand, this has severely reduced network redundancy, affecting resilience and water quality negatively. In particular, the action of valves to sectorize the network and reduce average zone pressure can generate high diurnal pressure variability across the nodes (Wright et al. 2014; Wright et al. 2015). Over the past few years, various studies have established the influence of pressure variability on pipe failures (Lambert and Thornton 2011). Moreover, the recent studies Martínez-Codina et al. (2015); Hoskins and Stoianov (2015); Rezaei et al. (2015) have highlighted the influence of different pressure variability indicators on the probability of occurrence of pipes’ failures. As a consequence, an effective pressure management has to consider both the minimization of average zone pressure and pressure variability.

As standard in WDN management, here we consider pressure management that is actuated by controlling pressure reducing valves (PRVs), which regulate the pressure at their downstream node, and boundary valves, which can allow a range of pressure differentials across a pipe as their setting is varied between fully open, closed or anything in between the two. In standard approaches, these valves are optimally controlled once engineers have installed them at locations chosen using practical knowledge.

In this paper, we depart from the standard practice and consider the problem of optimizing the location of the actuator (i.e. boundary valves and PRVs), and the optimal pressure settings simultaneously – what is referred to as a co-design optimization problem. Although such an approach generally offers better performance, the mathematical formulation for the simultaneous optimal design and operation problem presents significant challenges; it requires the solution of a difficult nonlinear optimization problem with both continuous and discrete variables. In the present study we consider nodal pressures, pipe flows and valve locations as decision variables, where binary variables are needed to model the placement of valves. In addition, the inclusion of hydraulic conservation laws results in non-convex, nonlinear constraints. In this work, we investigate the co-design problem with respect to two objectives, i.e. the minimization of average zone pressure and pressure variability via the optimal placement and control of valves. Therefore, a multi-objective optimization problem is formulated to establish the trade-offs between these two criteria. The resulting problem is a multiobjective mixed integer nonlinear program (MINLP).

Due to their underlying complex structure, multiobjective co-design optimization problems for water distribution networks are commonly studied using evolutionary algorithms (EAs), which can be easy to implement by coupling them with simulation software (Maier et al. 2014). However, EAs have some disadvantages. Firstly, they do not guarantee optimality of the solutions, not even local optimality. Secondly, a large number of function evaluations and simulations of the hydraulic model are required in order to generate useful solutions, which do not scale well with the size of the search space. Unlike mathematical optimization approaches, evolutionary methods do not explicitly or accurately handle nonlinear constraints.

Since optimal control profiles for pressure management need to satisfy strict physical, quality and economic constraints, we seek a method that guarantees an accurate handling of constraints. Moreover, we aim to propose a scalable approach for large scale water systems, exploiting the particular sparse structure of water distribution networks. Therefore, in the present work we propose the application of state-of-the-art optimization methods for the solution of multiobjective co-design problems for water distribution networks.

Although the natural aspiration is to find a feasible optimal configuration for all the objectives, typically no single solution that simultaneously optimises all conflicting objectives exists. Therefore, the mathematical notion of Pareto optimality is adopted to characterise the best compromises between conflicting objectives.

A feasible point for a multiobjective optimization problem is a global Pareto optimum if there is no other feasible point that improves one or more of the objectives without making another worse (Miettinen 1998). A decision vector is called a local Pareto optimum if it is a global Pareto optimum only when the optimization is restricted to its neighbourhood. However, for non-convex problems all the deterministic gradient-based methods for the generation of Pareto optima guarantee convergence only to local optima. Since in the present work we consider a non-convex multiobjective optimization problem, we will refer to local Pareto optima simply as Pareto optima, unless otherwise stated.

For a generic nonlinear optimization problem with m objectives, the set of Pareto optima is uncountable and very complex; it is a stratified set, a pairwise disjoint union of differentiable manifolds of dimension (m−1) with boundaries and corners of lower dimensions (Wan 1978; Smale 1976). For instance, the Pareto set is a pairwise disjoint union of segments of curves for the case of 2 objectives. Continuation methods (Hillermeier 2001) try to track the local structure of the Pareto optimal manifolds near some initial solutions. However, only few continuation methods can deal with the non-differentiable breakpoints introduced by inequality constraints (Martin et al. 2014) and handle non-convex and nonlinear problems, where the Pareto set is disconnected (Hartikainen and Lovison 2014). Moreover, in the case of multiobjective mixed integer nonlinear optimization problems, as considered in the present work, discrete decision variables introduce additional complexities – the Pareto set may be composed of disconnected segments of curves and isolated points (Das 2000). Since in general the differential composite structure of the Pareto set is not explicitly known, algorithmic mathematical methods focus on the space of the objective functions rather than the decision variable space; for an exhaustive survey see Miettinen (1998), Marler and Arora (2004) and Andersson (2000).

In the present work, we are interested in techniques which approximate the image of the Pareto set through the objective functions, called Pareto front. The aim is to provide a wide and uniform distribution of points of the front so that the decision maker can choose the Pareto optimal solution which better fits its needs. This is called an a posteriori approach since the articulation of the preferences by the decision maker is made after the generation of the approximated Pareto front (Marler and Arora 2004). Common a posteriori approaches include scalarization methods (Marler and Arora 2004; Das and Dennis 1998; Messac et al. 2003; Kim and De Weck 2005), which parametrize the multiobjective problem into a series of single-objective optimization problems that can be solved using standard nonlinear programming techniques. A popular scalarization method is the weighted sum (WS) of the objectives (Marler and Arora 2004). Despite its simplicity, some disadvantages of the WS approach include its inability to generate Pareto points in the non-convex parts of the Pareto front and the fact that an even spread of weights may not correspond an uniform distribution of points of the front (Das and Dennis 1997). In the past 20 years new methods have been proposed to deal with these drawbacks, in particular, the Normal Boundary Intersection (NBI) method (Das and Dennis 1998) and the Normalized Normal Constraint (NNC) method (Messac et al. 2003). These techniques can be combined with efficient gradient-based algorithms to find optimal solutions even for large-scale and highly constrained optimization problems, see for example (Logist et al. 2010).

However, these methods have been developed for continuous problems and their effectiveness for mixed integer problems is not (well) studied. The non-connectivity of the Pareto sets and the presence of isolated Pareto optimal points can cause severe problems for the NBI method. On the other hand, WS and NNC methods are less likely to suffer from the disconnected nature of the Pareto set. In fact, we will show this to be the case in Section 5, where disconnected Pareto branches in Fig. 6 are discussed.

The application of the scalarization methods to multiobjective MIPs considered in this article relies on the solution of a series of single-objective non-convex optimization problems belonging to the class of mixed integer nonlinear programming (MINLP). The solution of these problems requires handling non-convex nonlinear constraints in a discrete framework and hence it is particularly challenging, for a general surveys on MINLP see Lee and Leyffer (2012). A possible approach for non-convex MINLP is the application of a branch and bound algorithm to find at least local solutions (D’Ambrosio and Lodi 2013). However, in our case this would result in infeasible computational time for large scale water distribution networks, since it is necessary to solve a series of MINLPs to generate the Pareto front. Since the integer variables involved in our optimization problem are binaries, they can be reformulated as complementarity constraints, enforcing the variables to take only one of the two complementary values 0 and 1. Nonetheless, the complementarity constraints result is a special feasible set which violates standard requirements for the application of gradient-based solution algorithms and specialized approaches are needed to solve these problems (Scheel and Scholtes 2000; Leyffer 2006; Raghunathan and Biegler 2005; Ralph and Wright 2004; Hu and Ralph 2004). In particular we focus on relaxation methods: the MPPC is converted into a series of nonlinear programs with relaxed feasible sets which satisfy standard regularity assumptions for the application of state-of-the-art nonlinear programming (NLP) solvers. The sequence of solutions of the relaxed NLP sub-problems will converge to a solution for the original MPCC, see Herty and Steffensen (2012), Leyffer et al. (2006) and Scholtes (2001).

In this article, in order to solve the difficult multiobjective mixed-integer nonlinear problems, we propose coupling a relaxation approach for solving MIPs with state-of-the-art scalarization methods for generating Pareto fronts. Moreover, sparse techniques are used within the NLP solvers to improve the performance of relaxation methods since the large scale mixed integer nonlinear programs that arise in the framework of optimal co-design for water distribution networks have sparse constraints. The Jacobians and Hessians of the relaxed NLPs retain this sparsity, which we take advantage of.

In the following section, we motivate and formulate the multiobjective optimization problem for the minimization of average zone pressure and pressure variability. In Section 3, three different scalarization schemes, weighted sum, normal boundary intersection and normalized normal constraint methods are studied. We then outline the relaxation approach for the solution of mixed integer nonlinear programs in Section 4. Finally, in the last section we apply the presented methods to approximate the Pareto front of our multiobjective optimization problem using a selected benchmarking network as case study.

2 Problem formulation

Consider a water distribution networks with n 0 fixed head nodes (e.g. water sources), n n demand nodes and n p pipes. This can be modeled as a direct graph \(\mathcal {G}=(V,E)\) where V is the set of nodes (|V| = n n + n 0) and E the set of links. Since the direction of flow in each pipe is itself an unknown, we associate two directed flows with each pipe. The bi-directional flows are included by introducing, for each physical pipe j, the two links j and j + n p and so the graph representation has that |E|=2n p . In this work, the optimization is performed by considering the network operation over an extended time horizon – we consider n l different time steps in a diurnal operation and associated demand conditions.

Let k∈{1,…,n l } be a time step. Then we have that heads at water sources \({h_{0}^{k}} \in \mathbb {R}^{n_{0}}\) (expressed in meters), demands at nodes \(d^{k} \in \mathbb {R}^{n_{n}}\) (measured in \(\frac {m^{3}}{s}\)), nodal elevations \(e \in \mathbb {R}^{n_{n}}\) (in meters) are assumed known data. The unknown vector \(p^{k} \in \mathbb {R}^{n_{n}}\) represents pressure heads at each demand node, expressed in meters. Data for the set of links include the vectors of lengths \(L \in \mathbb {R}^{2n_{p}}\), diameters \(D \in \mathbb {R}^{2n_{p}}\) (both measured in meters) and the non-dimensional Hazen-Williams coefficients \(C \in \mathbb {R}^{2n_{p}}\) . Moreover, \(q^{k} \in \mathbb {R}^{2n_{p}}\) represents the unknown flow through the links, expressed in \(\frac {m^{3}}{s}\). The friction head loss across each link j can be represented using the semi-empirical Hazen-Williams (HW) formula: \(HW_{f}({q^{k}_{j}})=r_{j} ({q_{j}^{k}})^{1.852}\), where the resistance of the pipe \(r_{j}= \frac {10.670 L_{j}}{C_{j}^{1.852}D_{j}^{4.871}}\). Since the HW formula is non-smooth, it is difficult to handle for most nonlinear programming solvers. Therefore we define an accurate polynomial approximation. Given a generic link with length L, diameter D and HW coefficient C we model the head loss using the function h f (q) = a q 2 + b q with a and b determined by the minimization of the integral of square errors \(J(a,b)=\int (aq^{2}+bq-rq^{1.852})^{2} dq\). Once for each link the quadratic approximation coefficients are computed and we can define the friction head loss function as \(h_{f}(q^{k}):=({h_{f}(q^{k})}_{1},\ldots ,{h_{f}(q^{k})}_{2n_{p}})\) with

$${h_{f}(q^{k})}_{j}:=a^{*}_{j} ({q^{k}_{j}})^{2} + b^{*}_{j} {q^{k}_{j}},\; \forall j=1,\ldots,2n_{p}. $$

Finally, since we aim to solve a co-design problem for optimal valve placement and control we introduce the vector of unknown binary variable \(v \in \{0,1\}^{2n_{p}}\) with

$$v_{j}=\left\{\begin{array}{ll} 1, & \text{if there is a valve on link }j \\ 0, &\text{otherwise}. \end{array}\right. $$

In conclusion, the unknowns of the optimization problem can be represented by the vector x=[p 1,q 1,…,\( p^{n_{l}},q^{n_{l}},v]^{T} \in \mathbb {R}^{N}\) with N = n l (n n +2n p )+2n p . Note that B= {n l (n n +2n p )+1,…n l (n n +2n p )+2n p } is the index set of the components of \(x \in \mathbb {R}^{N}\) which correspond to binary variables.

The first objective that we are interested to minimize is average zone pressure (AZP), which is expressed in meters using the weighted sum:

$$ \mu_{1}(x):=\frac{1}{W}\sum\limits_{k=1}^{n_{l}}\sum\limits_{i=1}^{n_{n}}w_{i} {p_{i}^{k}} $$
(1)

where \(w_{i}={\sum }_{j \in I(i)}L_{j}/2\) and I(i) is the set of indices for links incident at node i, counted only once. The normalization factor \(W={\sum }_{i=1}^{n_{n}} w_{i}\) is is chosen so that, for each time step, \(\frac {1}{W}{\sum }_{i=1}^{n_{n}}w_{i} {p_{i}^{k}}\) is a convex combination of pressures.

We aim to investigate the trade-offs between AZP and pressure variability (PV), the latter of which can be can be expressed in meters as:

$$ \mu_{V}(x):= \sum\limits_{i=1}^{n_{n}}({p_{i}^{1}}-p_{i}^{n_{l}})^{2}+\sum\limits_{k=2}^{n_{l}} \sum\limits_{i=1}^{n_{n}}({p_{i}^{k}} - p_{i}^{k-1})^{2}. $$
(2)

Note that μ V encodes the pressure variability indicators that in Martínez-Codina et al. (2015) are linked with pipe failures. The function in (2) is a quadratic form and can be written as x T A x, with \(A \in \mathbb {R}^{N \times N}\) symmetric positive semidefinite. Moreover, since the quadratic form only depends on pressure head values, we have that x T A x = p T A p p with p=[p 1,…,p k]T and \(A_{p} \in \mathbb {R}^{n_{l}n_{n} \times n_{l}n_{n}}\) symmetric positive semidefinite. A preliminary numerical analysis on the considered optimization problem has shown that gradient-based nonlinear programming solvers perform better if at least matrix A p is positive defined, hence we add a small regularization term δ>0 to the diagonal elements of A p so that it becomes diagonal dominant and then positive defined. Let \(\tilde {A_{p}}:=A_{p}+\delta I_{n_{l}n_{n}}\) and \(\tilde {A}\) the matrix such that \(x^{T}\tilde {A}x=p^{T}\tilde {A_{p}}p\). If the perturbation term δ is sufficiently small it will not significantly affect the optimization process. This is a standard regularization scheme, see Nocedal and Wright (2006). Finally, we define the second objective function as \(\mu _{2}(x):=x^{T}\tilde {A}x\).

The optimization problem is primarily subject to the hydraulic constraints:

$$ A_{12}^{T} q^{k} = d^{k}, $$
(3a)
$$ Q(q^{k})(-A_{12}p^{k} - A_{12}e-A_{10}{h_{0}^{k}}-h_{f}(q^{k}))\geq 0, $$
(3b)
$$ -A_{12}p^{k} - A_{12}e-A_{10}{h_{0}^{k}}-h_{f}(q^{k})-M^{k}v\leq 0, $$
(3c)

for each k=1,…,n l . (3a) represents the conservation of mass at each junction node, where \(A_{12} \in \mathbb {R}^{2n_{p} \times n_{n}}\) is the branch-node incidence matrix for the unknown pressure head nodes. Inequalities (3b) and (3c) enforce the conservation of energy across each link. Matrix \(A_{10} \in \mathbb {R}^{2n_{p} \times n_{0}}\) is the branch-node incidence matrix for fixed head nodes. Moreover, \(Q(q^{k})=diag({q^{k}_{1}},..,q^{k}_{2n_{p}})\) and \(M^{k}=diag({M^{k}_{1}},..,M^{k}_{2n_{p}})\) are two diagonal matrices in \(\mathbb {R}^{2n_{p} \times 2n_{p}}\). The positive constants \({M^{k}_{j}}\) are fixed for each time step k and link j and have to be sufficiently large in the following sense. Assume that a link \(i_{1} \xrightarrow {j} i_{2}\) connects two junction nodes (the case of a link connected to a water source is analogous). Then the two inequalities (3b) and (3c) yield:

$$\begin{array}{@{}rcl@{}} {q^{k}_{j}}(h^{k}_{i_{1}}-h^{k}_{i_{2}}-h_{f}({q^{k}_{j}}))& \geq 0 \\ h^{k}_{i_{1}}-h^{k}_{i_{2}}-h_{f}({q^{k}_{j}})-{M^{k}_{j}}v_{j} & \leq 0 \end{array} $$

If \({q^{k}_{j}}>0\) and v j =0 then the two inequalities are equivalent to the standard Bernoulli equation. But when a valve is placed on the link j and v j =1, provided that \({M^{k}_{j}}\) is big enough, the second inequality is inactive (always satisfied) and this couple of constraints is equivalent to \({q^{k}_{j}}(h^{k}_{i_{1}}-h^{k}_{i_{2}}-h_{f}({q^{k}_{j}}))\geq 0\). This means that the head loss across the link j is larger than just the friction loss: the action of the valve is increasing the head loss and reducing pressure at the downstream node. When \({q^{k}_{j}}=0\) and v j =0, the inequalities imply \(h_{i_{1}}\leq h_{1_{2}}\) and if v j =1 then both constraints are always satisfied. Using this analysis, it is possible to show that a solution where flows in both direction i 1i 2 and i 2i 1 are positive is infeasible. In view of the above discussion, we select \({M^{k}_{j}}\) equal to the maximum possible head loss across link j at time step k.

Additional physical, economic and quality requirements are enforced using the optimization constraints:

$$v_{j} + v_{j+n_{p}}\leq 1, \; \forall j=1,\ldots,n_{p},$$
(4a)
$$\sum\limits_{j=1}^{2n_{p}}v_{j}= n_{v},$$
(4b)
$$p^{k}_{min}\leq p^{k} \leq p^{k}_{max}, \; \forall k=1,\ldots,n_{l},$$
(4c)
$$0 \leq q^{k} \leq q^{max}, \; \forall k=1,\ldots,n_{l}$$
(4d)
$$v \in \{0,1\}^{2n_{p}},$$
(4e)

where the maximum number of valves n v is a surrogate for economic constraints, and demand needs to be supplied with minimum pressure constraints. Note that the presented set of constraints consider the placement of a pressure reducing valve (PRV) or a boundary valve (BV) within the same mathematical formulation. For example, the placement of a valve on a link with zero flow leaves the pressure head difference across the same link to be a free variable, which coincides with the placement of a closed boundary valve. On the other hand, the placement of a valve on a link with non-zero flow models the placement of a PRV: the optimal setting of the valve determines the pressure at its downstream node. Therefore, the co-design optimization process allows to identify which physical actuator should be used and, at the same time, addresses its optimal placement and operation.

The problem of simultaneously minimizing the two objectives μ 1 and μ 2, subject to the constraints (3a) and (4a), defines a multiobjective mixed integer nonlinear program, which can be written in a compact form as

$$\begin{array}{@{}rcl@{}} \min_{x \in \mathbb{R}^{N}} \quad &&\mu(x):=(\mu_{1}(x),\mu_{2}(x)) \end{array} $$
(5a)
$$\begin{array}{@{}rcl@{}} \text{subject to:} \quad &&g(x) \leq 0, \end{array} $$
(5b)
$$\begin{array}{@{}rcl@{}} &&h(x)=0, \end{array} $$
(5c)
$$\begin{array}{@{}rcl@{}} &&x_{j} \in \{0,1\}, \quad \forall j \in B. \end{array} $$
(5d)

For each \(x \in \mathbb {R}^{N}\), we have set g(x):=(g i (x)) iI and h(x):=(h i (x)) iE , the vectors corresponding to the rows of inequality constraints (3b), (3c), (4a), (4c), (4d) and equality constraints (3a), (4b), respectively. Therefore, \(g:\mathbb {R}^{N} \rightarrow \mathbb {R}^{|I|}\) is a polynomial non-convex function, while \(h: \mathbb {R}^{N} \rightarrow \mathbb {R}^{|E|}\) is affine.

3 Mathematical methods for the solution of multiobjective optimization problems

Problem (5a) aims to investigate the trade-offs between minimization of pressure variability (PV) and average zone pressure (AZP) in a water distribution network, by the optimal placement and control n v control valves. In Fig. 1, we present the average zone pressure profile corresponding to solutions of single objective minimization problems for AZP and PV, with n v =3. It is clear that the configuration which minimizes pressure variability does not correspond to minimization of average zone pressure.

Fig. 1
figure 1

Average zone pressure corresponding to individual optimal solutions for the minimization of μ 1 (AZP) and μ 2 (PV)

Therefore, it is necessary to search for optimal compromises between these two objectives. Let’s define the feasible set of Problem (5a) as \(\mathcal {T}:=\{x \in \mathbb {R}^{N} \, | \, g(x)\geq 0, \, h(x)=0, \, x_{j} \in \{0,1\} \, \forall j \in B\}.\) Then, we present some technical definitions that characterise these trade-offs as follows:

Definition 1

(Miettinen 1998, Section 2.2) A point \(x \in \mathcal {T}\) is said dominated by \(y \in \mathcal {T}\) if μ j (y)≤μ j (x) ∀j∈{1,2} and μ(y)≠μ(x).

Definition 2

(Miettinen 1998, Section 2.2) A point \(x^{*} \in \mathcal {T}\) is called global Pareto optimum for Problem (5a) if it is not dominated by any point in \(\mathcal {T}\).

A point x is called local Pareto optimum if there exists a neighbourhood U(x ) such that x it is not dominated by any point in \(\mathcal {T} \cap U(x^{*})\) (Miettinen 1998, Section 2.2). Since our problem is non-convex, gradient-based methods will find only local Pareto optima. Therefore, we will generically call these points Pareto optima. Let \(\theta \subset \mathcal {T}\) be the set of all the Pareto optima in the space of decision variables. Then the subset of the objectives space μ(𝜃) is called Pareto front. In the following, we present three approximation methods that allow us to generate the Pareto front of Problem (5a). We will first define some terms that are useful in the mathematical formulation of the three scalarization methods.

  • \(x_{i}^{*}\) is minimizer of objective μ i . The point \(\mu _{i}(x_{i}^{*})\) is called the i-th Anchor Point

  • \(\mu ^{u}:=(\mu _{1}(x_{1}^{*}),\mu _{2}(x_{2}^{*}))^{T}\) is the utopia point, a point in the objective function space that is in general infeasible,

  • utopia line is the line joining the points \(\mu (x_{1}^{*})\) and \(\mu (x_{2}^{*})\),

  • the pay-off matrix

    $${\Phi}:= \left[\begin{array}{lll} 0 & \mu_{1}(x_{2}^{*})-\mu_{1}(x_{1}^{*}) \\ \mu_{2}(x_{1}^{*})-\mu_{2}(x_{2}^{*}) & 0 \end{array}\right], $$
  • \(\textbf {e}:=(1,1)^{T} \in \mathbb {R}^{2}\).

3.1 Weighted Sum Method

For ω∈[0,1] consider the MINLPWS(ω)

$$\begin{array}{@{}rcl@{}} \min\limits_{x \in \mathbb{R}^{N}} \quad &&(1-\omega) \bar{\mu}_{1}(x) + \omega \bar{\mu}_{2}(x) \end{array} $$
(6a)
$$\begin{array}{@{}rcl@{}} \text{subject to: } \quad &&g(x) \leq 0, \end{array} $$
(6b)
$$\begin{array}{@{}rcl@{}} &&h(x)=0, \end{array} $$
(6c)
$$\begin{array}{@{}rcl@{}} &&x_{j} \in \{0,1\}, \quad \forall j \in B, \end{array} $$
(6d)

where the objective functions are normalized using the utopia point as:

$$\bar{\mu}_{1}(x):=\frac{\mu_{1}(x)-\mu_{1}(x_{1}^{*})}{\mu_{1}(x_{2}^{*})-\mu_{1}(x_{1}^{*})}, \qquad \bar{\mu}_{2}(x):=\frac{\mu_{2}(x)-\mu_{2}(x_{2}^{*})}{\mu_{2}(x_{1}^{*})^{*}-\mu_{2}(x_{2}^{*})}. $$

It is easy to prove that a minimum of Problem (6a) is a Pareto optimal point for Problem (5a), see Miettinen (1998).

The WS method solves MINLPWS(ω) for a given distribution of weights in order to generate various Pareto optimal points and approximate the Pareto front. However, as pointed out before, this method has some drawbacks: it is not possible to generate points on the non-convex part of the Pareto Front and in most cases the distribution of points is not uniform.

3.2 Normal boundary intersection method

The Normal Boundary Intersection method was developed in order to deal with the disadvantages of the weighted sum method. Again, the approximation of the Pareto front is obtained solving a sequence of single-objective problems: for ω∈[0,1] consider MINLPNBI(ω)

$$\begin{array}{@{}rcl@{}} \max\limits_{(x,\tau) \in \mathbb{R}^{N} \times \mathbb{R}} \quad && \tau \end{array} $$
(7a)
$$\begin{array}{@{}rcl@{}} \text{subject to: } \quad &&g(x)\leq 0, \end{array} $$
(7b)
$$\begin{array}{@{}rcl@{}} &&h(x)=0, \end{array} $$
(7c)
$$\begin{array}{@{}rcl@{}} && \mu^{u} + {\Phi} \left( \begin{array}{ll} 1-\omega \\ \omega \end{array}\right) -\tau {\Phi} \textbf{e} = \mu(x) , \end{array} $$
(7d)
$$\begin{array}{@{}rcl@{}} && x_{j} \in \{0,1\}, \quad \forall j \in B. \end{array} $$
(7e)

Note that \(\mu ^{u} + {\Phi } \left (\begin {array}{ll} 1-\omega \\ \omega \end {array}\right ) \) is a point belonging to the utopia line. Moreover, −Φe is the quasi-normal direction to the utopia line, pointing towards μ u.

This approach generates a uniform spread of Pareto points given an even distribution of ω∈[0,1] and also points belonging to the non-convex part of the Pareto front can be generated, see Fig. 2. However, the NBI method is dependent on the anchor points \(\mu _{i}(x_{i}^{*})\), different local minima \(x_{1}^{*}\), \(x_{2}^{*}\) of the individual minimizations of μ 1 and μ 2 may result in the generation of different local Pareto points. Moreover, it is well known that the method can generate non Pareto points (Das and Dennis 1998; Logist and Van Impe 2012; Marler and Arora 2004).

Fig. 2
figure 2

NBI method: searches the intersection between the quasi-normal line passing through a point on the utopia line and the boundary of the feasible set

3.3 Normalized normal constraint method

Using the same notation as for the NBI it is possible to formulate another approach for the generation of the Pareto front. The Normalized Normal Constraint (NNC) method relies on the solution of a sequence of sub-problems MINLPNNC(ω) defined by:

$$\begin{array}{@{}rcl@{}} \min_{x \in \mathbb{R}^{N}} \quad && \bar{\mu}_{2}(x) \end{array} $$
(8a)
$$\begin{array}{@{}rcl@{}} \text{subject to: } \quad && g(x)\leq 0, \end{array} $$
(8b)
$$\begin{array}{@{}rcl@{}} &&h(x)=0, \end{array} $$
(8c)
$$\begin{array}{@{}rcl@{}} &&\bar{N} \cdot (\bar{\mu}(x)-\bar{\Phi}\omega)\leq 0, \end{array} $$
(8d)
$$\begin{array}{@{}rcl@{}} && x_{j} \in \{0,1\}, \quad \forall j \in B. \end{array} $$
(8e)

where the normalization of (μ 1,μ 2) in the objective space is defined by:

$$\bar{\mu}_{1}(x):=\frac{\mu_{1}(x)-\mu_{1}(x_{1}^{*})}{\mu_{1}(x_{2}^{*})-\mu_{1}(x_{1}^{*})}, \qquad \bar{\mu}_{2}(x):=\frac{\mu_{2}(x)-\mu_{2}(x_{2}^{*})}{\mu_{2}(x_{1}^{*})-\mu_{2}(x_{2}^{*})}, $$

and \(\bar {N}:=\bar {\mu }(x_{2}^{*})-\bar {\mu }(x_{1}^{*})\) is the direction of the utopia line, pointing towards \(\bar {\mu }(x_{2}^{*})\), while \(\bar {\Phi }\) is the normalized pay-off matrix and \(\bar {\Phi }\omega \) represents a point of the normalized utopia line. MINLPNNC(ω) performs a minimization of \(\bar {\mu }_{2}\) on the half-space defined by the normal plane to the utopia line as shown in Fig. 3.

Fig. 3
figure 3

NNC method solves a sequence of sub-problems minimizing μ 2 over a restricted feasible space (filled in grey)

Although NBI and NNC have similarities in the way they use normal directions, NNC has been observed to be less likely to generate non-Pareto points (Logist and Van Impe 2012; Messac et al. 2003). The example in Fig. 4a illustrates a situation in which NBI will generate non Pareto points, while NNC will not. In fact, the segments \([\mu (x_{1}^{*}),A]\) and [B,C] are Pareto optimal (at least locally), while [A,B] is not. Every NBI sub-problems started from a point belonging to [A ,B ] will generate non Pareto points, while the solutions of the correspondent NNC sub-problems will be the Pareto point A. Moreover, NNC can be reformulated considering the minimization of \(\bar {\mu }_{1}\) and the set of the Pareto points generated can depend on this choice (Logist and Van Impe 2012). For example, observe that in Fig. 4a the minimization of \(\bar {\mu }_{1}\) by the NNC sub-problem can generate non-Pareto points that are avoided by the minimization of \(\bar {\mu }_{2}\). Since there are no general a priori criteria for the choice of the objective to be minimized, it is suggested to test both options (Logist and Van Impe 2012).

Fig. 4
figure 4

a NNC is more robust than NBI here since it can avoid the generation of some non-Pareto points; b however, in highly non-convex problem \(\mu (\mathcal {T})\) can be significantly folded and the Pareto front disconnected

The additional equality constraints of NBI are particularly stringent since they restrict the search in \(\mu (\mathcal {T})\) to a line. As pointed out in Das (2000), this is expected to generate complications specifically in the case of a multiobjective mixed integer nonlinear optimization problem. In fact, the discrete decision variables result in a Pareto front with disconnected branches and isolated points, where the intersection between the quasi-normal direction and the Pareto front can be empty. On the other hand, the weighted sum method does not suffer from the disconnected nature of the Pareto front. Das and Dennis (1997) presented the following geometric interpretation of the WS method: a weight ω∈[0,1] defines a particular line in the space of the objective functions and solving problem MINLPWS(ω) corresponds to translating that line along its orthogonal direction and choosing the first intersection point with \(\mu (\mathcal {T})\). As a result, the WS method does not experience the difficulties associated with disconnected Pareto fronts. Likewise, the NNC method is not expected to preclude the convergence to Pareto points which are isolated or belong to disconnected branches because its additional inequality constraint enforces a minimization over half-spaces, which is less strict than the equality constraints used in NBI.

However, when the image of the feasible set is folded, both the NBI and NNC methods will generate different local Pareto optima (or even non Pareto points) when different initial conditions are used for the corresponding sub-problems (Messac et al. 2003). In Fig. 4b, we illustrate this situation; the segments of curve \([\mu (x_{2}^{*}),A]\) and \([E,\mu (x_{1}^{*})]\) constitute the global Pareto front, while the branches [B,C] and [D,E] are only locally Pareto optimal. Note that although is some examples the Pareto front does not have such complex structures (Logist et al. 2010), many engineering problems have a highly nonlinear, non-convex structures that result in a disconnected Pareto front. Furthermore, even when the global Pareto front is connected, it could be made up from different locally optimal branches that are cut and glued together, corresponding to separate regions in the space of the decision variables (Hartikainen and Lovison 2014) (see also Fig. 6d and the discussions on it). Despite some differences, NBI and NNC are intrinsically related. The work by Logist and Van Impe (2012) shows that, given a weight ω, any stationary point of a NBI-subproblem is also a stationary point for the NNC-subproblem, provided that the additional inequality constraint of NNC is active. However, as pointed out also by the same authors, the front generated by NBI and NNC could be different in practice, due to the non-convexity of the scalarization optimization sub-problems.

4 Solution of single-objective MINLPs: a continuous relaxation

All the methods for the generation of the Pareto front for Problem (5a) involve the solution of single objective MINLP problems whose structure can be generically represented as:

$$\begin{array}{@{}rcl@{}} \min_{x \in \mathbb{R}^{N}} \quad && f(x) \end{array} $$
(9a)
$$\begin{array}{@{}rcl@{}} \text{subject to:} \quad && c_{I}(x) \leq 0, \end{array} $$
(9b)
$$\begin{array}{@{}rcl@{}} & &c_{E}(x)=0, \end{array} $$
(9c)
$$\begin{array}{@{}rcl@{}} & & x_{j} \in \{0,1\}, \quad \forall j \in B. \end{array} $$
(9d)

Since its integer constraints are binary, Problem (9a) can also be reformulated as a mathematical program with complementarity constraints (MPCC):

$$ \begin{array}{ll} \min_{x \in \mathbb{R}^{N}} \quad & f(x) \\ \text{subject to}\quad & c_{I}(x) \leq 0, \\ &c_{E}(x) = 0, \\ &0 \leq x_{j} \perp 1-x_{j} \geq 0, \quad \forall j \in B. \\ \end{array} $$
(10)

The feasible set of a mathematical program with complementarity constraints has a special structure which results in the violation of standard constraints qualification. Therefore, it is necessary to introduce constraints qualifications tailored for MPCCs (Scheel and Scholtes 2000), without which standard nonlinear programming solvers will face convergence issues when applied directly to Problem (10). Various reformulations based on constraint relaxations and penalty methods have been proposed in order to deal with this pathological characteristic. For a general discussion we refer the reader to Leyffer et al. (2006), Leyffer (2006), Raghunathan and Biegler (2005), Ralph and Wright (2004), Scheel and Scholtes (2000), Herty and Steffensen (2012), Scholtes (2001) and the references therein.

Here, we will focus on the relaxation scheme based on the Fischer-Burmeister function (Leyffer 2006; Herty and Steffensen 2012): \(\phi _{FB}(a):=1-\sqrt {1+2a^{2}-2a}\). Clearly, ϕ F B (a)=0⇔a=0 or a=1, and so we can substitute for (9d), the constraint ϕ F B (x j )=0, ∀jB. The MPCC in (10) can then be approximated by the relaxed problem REL(t):

$$ \begin{array}{ll} \min_{x \in \mathbb{R}^{N}} \quad & f(x) \\ \text{subject to: }\quad & c_{I}(x) \leq 0, \\ & c_{E}(x) = 0, \\ & 0 \leq x_{j} \leq 1, \quad \forall j \in B, \\ & \phi_{FB}(x_{j}) \leq t, \quad \forall j \in B, \end{array} $$
(11)

where t>0. Most nonlinear programming (NLP) solvers are designed to find only stationary points, particularly in the non-convex case. The following result relates the stationary points of REL(t) to the stationary points of Problem (10).

Theorem 1

Assume that \((x^{k})_{k \in \mathbb {N}}\) is sequence of stationary points for REL(t k ). Let \(x^{k} \xrightarrow {k \rightarrow +\infty } \bar {x}\) as \(t^{k} \xrightarrow {k \rightarrow +\infty }0\) . Finally, assume that \(\bar {x}\) satisfies a suitable constraints qualification for MPCCs. Then \(\bar {x}\) is a stationary point for Problem ( 10 ).

Proof

The proof follows as in (Herty and Steffensen 2012, Theorem 2.3) after the introduction of a slack variable w:=1−x. □

As shown in Ralph and Wright (2004), if Problem (10) satisfies a linear independence constraints qualification tailored for programs with complementarity constraints, then the relaxed problems (11) satisfy the standard linear independence constraints qualification (LICQ) for nonlinear programs - see also (Nocedal and Wright 2006, Ch. 12) Such a tailored constraints qualification is denoted by MPCC-LICQ.

Moreover, as discussed in Scholtes and Stȯhr (2001), MPCC-LICQ is a generic condition for mathematical programs with complementarity constraints. This means that even if a particular MPCC violates the constraints qualification, an arbitrary small perturbation of the constraints will result in the satisfaction of MPCC-LICQ. It is possible to prove that Problem (10) satisfies MPCC-LICQ, once a small perturbation is applied - for the sake of brevity, we do not include the proof here.

Moreover, it can be shown that the gradients of the equality constraints of Problem (11) are linearly independent, while rank deficiencies occur for some of the inequality constraints. Nonetheless, as reported in Section 5, no numerical difficulty was encountered when the interior point solver IPOPT (Waechter and Biegler 2006) was applied. In fact, only the linear dependence of equality constraints has a negative impact on interior point methods, while the gradients of inequality constraints cannot cause any rank deficiency of the matrices involved into these algorithms - see (Nocedal and Wright 2006, pp. 573–574).

According to Theorem 1, it is possible to find stationary points of single-objective MPCC problems solving a sequence of relaxed nonlinear programs, for decreasing values of the relaxation parameter t. In the present work, every single-objective MPCC within the scalarization methods (6a), (7a) and (8a) is solved using Algorithm 1, which is based on Theorem 1. The procedure stops when the complementarity violation \(\texttt {Vio}(x):=\max _{j \in B}\bigl ({\min {(x_{j},1-x_{j})}\bigr )}\) is below a given tolerance.

figure a

This approach offers a scalable method for the solution of design problems for WDNs, since the relaxed problems can take advantage of sparsity in the constraints using tailored NLP solvers. Therefore, the relaxation approach is particularly convenient for application to multiobjective problems, where in order to generate a uniform and detailed Pareto front, it may be necessary to solve a large number of single-objective MINLPs. In this case, the application of standard branch and bound techniques for MINLPs would require infeasible computational time, particularly when dealing with large-scale water networks (see for example the computational experience reported in D’Ambrosio and Lodi (2013)).

5 Case study

The outlined methods were implemented for the solution of the bi-objective optimization problem (5a) using an example network model. The selected benchmarking network has 22 nodes, 37 pipes and 3 reservoirs, see Fig. 5. Details on pipes’ characteristics and nodal demands can be found in Jowitt and Xu (1990) and Araujo et al. (2006), respectively. The resulting MINLP has 2304 continuous variables, 74 binary variables, 566 linear constraints and 3552 nonlinear constraints.

Fig. 5
figure 5

25 nodes selected benchmarking network model

At each iteration of WS, NBI or NNC methods, the corresponding single-objective sub-problem was solved using the continuous smooth relaxation illustrated in Algorithm 1, generating a sequence of stationary points for relaxed sub-problems; each of these nonlinear programs was solved using the interior-point solver for large scale optimization problems IPOPT (Waechter and Biegler 2006). An interior-point solver was preferred to an active-set method because of the highly constrained structure of the optimization problem: an active-set approach needs to keep track of changes in the active-set of constraints and bounds, requiring additional computational resources (Nocedal and Wright 2006). In our implementation, IPOPT options mu_strategy and mu_oracle were set to “adaptive” and “LOQO”, respectively - see also Kawajir et al. (2015). Moreover, a warm-start technique was applied in both the relaxation algorithm (inner iteration) and scalarization methods (outer iteration) using the solution of a sub-problem (and the corresponding Lagrangian multipliers) as initial condition for the next iteration.

Since the optimization problem is sparse, we have exploited the corresponding sparse structure of the Jacobian of the constraints through IPOPT. We have tested the behaviour of the solver with exact sparse Hessians and approximations computed by a limited-memory quasi-Newton method (L-BFGS), available within IPOPT (Kawajir et al. 2015). It was noted that better performances were obtained using L-BFGS approximation, both in terms of number of iterations and quality of solutions; therefore, the L-BFGS Hessian approximation was used in our implementation. The linear systems within IPOPT were solved using the package MA57 (Duff 2004), which is well suited for solving sparse linear systems. After preliminary simulations, we selected the perturbation value δ=10−4 for the regularization factor in \(\tilde {A}\) (see Section 2); in practice, this choice is shown not to significantly perturb the optimization problem. All computations were performed within MATLAB 2015a for Windows, installed on a 2.50GHz Intel® Xeon(R) CPU E5−26400 with 18 Cores.

We applied the weighted-sum, normal boundary intersection and normalized normal constraint methods in the optimal placement of 1 to 4 control valves, addressing the minimization of average zone pressure (AZP) and pressure variability (PV) simultaneously. The resulting Pareto fronts are reported in Figs. 6 and 7.

Fig. 6
figure 6

Pareto fronts for the optimal valve placement of 1 and 2 valves: the non-dominated solutions are circled. Each method was set to generate 20 points

Fig. 7
figure 7

Pareto fronts for the optimal valve placement of 3 and 4 valves: the non-dominated solutions are circled. Each method was set to generate 20 points

These Pareto fronts are effective tools for the analysis of the trade-offs between AZP and PV. For example, consider the problem for the optimal placement of 3 valves, discussed also in Section 3. The selection of a point on the front different from the two individual minima results in the average zone pressure profile reported in Fig. 8, representing a compromise between the minimization of AZP and the need to maintain low pressure variability.

Fig. 8
figure 8

Example of trade-off between our two objectives. The selected Pareto optimum results in a satisfactory average zone pressure, maintaining a low variability

In Section 3, we argued that the Pareto set for a multiobjective mixed integer nonlinear optimization problem is generally disconnected and may be composed of many isolated points. As Figs. 6 and 7 show, in this particular case study, the efficient trade-offs between the two objectives are distributed along continuous and possibly disconnected branches but they are not represented by isolated points.

As expected, the proposed methods have generated a number of local Pareto optima, some of which dominated, due to the non-convexity of the single-objective sub-problems. Therefore, the Pareto filter proposed in Messac et al. (2003) was used to identify the non-dominated Pareto configurations obtained in each instance. As anticipated in Section 3, the weighted sum method has not suffered from the disconnected nature of the Pareto set and has generated good approximations of the front. However, as expected, the Pareto points are not uniformly distributed along the front. On the other hand, the application of the NBI method resulted in uniform Pareto fronts for the case of 1 and 4 valves. Nonetheless, a number of non-Pareto points were obtained in all instances and the fronts for 2 and 3 valves are not well captured by the NBI method. As stated in Section 3, this behaviour is predictable since NBI is known to generate non Pareto points when disconnected local Pareto branches are present– see also the folded case in Fig. 4b.

Figures 6 and 7 show that the NNC method is fairly robust: it has generated uniform and well-spread Pareto fronts with fewer dominated points, in all shown instances. Note that, in many practical situations, the global Pareto front can be very complex; for example, in Fig. 6f it is composed of the union of separate local branches. In this case, the visual analysis of the front is not sufficient to determine whether the two Pareto curves are disconnected or intersect each other. It is known that nonlinear maps can project a set of disconnected domains into a connected image, or a disconnected one. Both cases are frequently encountered in the solution of nonlinear multiobjective optimization problems. An example of a disconnected Pareto front is discussed in Section 3 - see also Fig. 4. On the other hand, some multiobjective problems can give rise to connected Pareto fronts, composed of the non-dominated set of the union of multiple local branches that intersect and correspond to separate regions of the domain, see also Hartikainen and Lovison (2014).

This situation is also particularly interesting from the hydraulic application point of view. In fact, for the considered optimization problem, the jump from one local branch to another represents a change in the locations of the control valves. On the other hand, when moving along the continuous local Pareto curves, the locations are fixed and only the valve settings are modified. As can be observed in Fig. 6f, if we fix the valve locations in order to individually minimize PV (i.e. we are on the Pareto front on the right) and change the operational settings trying to decrease AZP, we can not obtain significant reductions in average zone pressure. However, we note that smaller values in AZP, with similar PV, can be achieved on the Pareto curve on the left that corresponds to a different valve location, which is the location that minimizes the individual objective of average zone pressure. Therefore, we can conclude that the solution of the purely operational problem (where the locations of the control valves are fixed) can lead to suboptimal pressure management when considering the two objectives.

In the present work we have only considered trade-offs between 2 conflicting criteria. Nonetheless, other objectives can be included in a straightforward fashion, the optimization of water quality or resilience for example. The generalization and application of the weighted sum and normal boundary intersection methods to optimization problems with more than 2 objectives is straightforward (Marler and Arora 2004; Das and Dennis 1998). On the other hand, the normalized normal constraint method has to be modified in order to properly handle more than 2 objectives and the enhanced normalized normal constraint method of Sanchis et al. (2008) can be used.

In Table 1 we report the average number of IPOPT iterations and CPU time needed to compute a point of the Pareto front, for each scalarization method and number of valves considered here. Note that in our practical experience, the procedure described in Algorithm 1 produces a solution of the original single-objective MINLP after 3 iterations.

Table 1 Average computational time and number of IPOPT iterations needed to compute one point of the Pareto front, with each scalarization method considered in the present study

Moreover, note that we are considering an off-line co-design problem, aiming to compute optimal valves’ locations and pressure settings. Once the optimization process is complete, it is possible to define for each installed valve a feedback rule based on the optimization results, see Ulanicki et al. (2000). Therefore, valves can ultimately be operated through an on-line feedback control system. Nonetheless, the application of standard techniques for non-convex mixed integer programs would be impractical (D’Ambrosio and Lodi 2013), especially when considering large scale water networks. On the other hand, the reported computational experience is promising and suggests that the proposed methods can be successfully applied to large scale problems related to optimal valves’ placement and operation in water distribution networks. Finally, all computations were executed in series on a single workstation; however, scalarization methods can be easily parallelized - i.e. different Pareto points can be computed in parallel.

6 Conclusions

We have presented a multiobjective co-design optimization problem for optimal valve placement and operation in water distribution networks, addressing the minimization of average zone pressure and pressure variability. By modelling boundary valves and pressure reducing valves within the same framework, the optimization allows to identify which physical actuator should be used and, at the same time, it provides optimal location and pressure settings. The considered formulation results in a multiobjective mixed integer nonlinear optimization problem. Because of the conflict among the two objectives we have adopted the notion of Pareto optimality to provide a mathematical characterization of the best compromises between conflicting criteria. We have focused on the objectives space and we have investigated the application of scalarization methods for the generation of the image through the objective functions of the set of Pareto optima, the so called Pareto front. We have described weighted sum, normal boundary intersection and normalized normal constraint methods, discussing their strengths and limitations. It was shown that each scalarization method relies on the solution of a series of single-objective sparse mixed integer nonlinear programs (MINLPs).

Since the integer variables considered are binary, they allow a reformulation as a mathematical program with complementarity constraints (MPCC) whose solution can be obtained as limit point of a sequence of stationary points of sparse nonlinear programs - a relaxation method. The application of scalarization and relaxation methods for the solution of multiobjective mixed integer nonlinear problems is, to the authors knowledge, novel. Furthermore, the iterative solution of sparse NLPs provides a scalable approach for large scale water distribution networks, since sparse techniques can be coupled with state-of-the-art NLP solvers to efficiently find (at least local) solutions.

Finally, we have applied the presented methods to solve a bi-objective optimization problem for a published benchmark network model, considering a varying level of actuation (i.e. number of valves). Our numerical simulations have shown that the normal boundary intersection method is most affected by the disconnected nature of the Pareto fronts. On the other hand, the weighted sum and normalized normal constraint methods have generated good approximations of the Pareto fronts. In line with expectations, the fronts obtained by the weighted sum method are not as uniform as the ones produced by the normalized normal constraint approach. Moreover, the normalized normal constraint method has been shown to be robust, since it has avoided most of the non-Pareto points found by the others approaches. The results from the case study are promising and suggest that the presented scalarization approaches for the Pareto front generation, coupled with the relaxation method, can be applied to other multiobjective co-design optimization problems for large scale water distribution networks, even for problems with more than 2 objectives.