Elsevier

Journal of Hydrology

Volume 577, October 2019, 123994
Research papers
Surrogate assisted multi-objective robust optimization for groundwater monitoring network design

https://doi.org/10.1016/j.jhydrol.2019.123994

Highlights

  • Propose a new stochastic multi-objective optimization algorithm (ε-MONMA).

  • Implement accurate uncertainty quantification of concentration using sparse PCE.

  • Achieve a clear improvement in the robustness of monitoring network design.

  • Save significant CPU time while guaranteeing the robustness of Pareto solutions.

Abstract

The robust optimization of a groundwater quality monitoring network is subject to many conflicting objectives and a high level of uncertainty in hydraulic conductivity. This study develops a two-stage stochastic optimization framework that combines uncertainty quantification using a cheap-to-evaluate surrogate model with an improved epsilon multi-objective noisy memetic algorithm (ε-MONMA) for monitoring network design. The surrogate model, based on sparse polynomial chaos expansion (PCE), is constructed to replace the expensive simulation model in the uncertainty quantification of concentrations at the pre-defined monitoring locations, which reduces the otherwise huge computational cost. Additionally, a scenario discovery strategy using the sparse PCE model is applied to filter a typical scenario set, with the centroid of the contaminant plume used as the diversity metric; this avoids enumerating all possible contamination plumes caused by the uncertain K-field in the optimization. The proposed algorithm is then employed to solve the stochastic management model and achieve a robust monitoring design, that is, a design that is insensitive to plume uncertainty no matter which of the many possible scenarios becomes the true distribution of contamination under the true K-field. A synthetic aquifer with uncertain hydraulic conductivity is used to test the monitoring network design. The Pareto-optimal solutions to the synthetic example are obtained under three plume scenario sets: a deterministic scenario (Scenario A0), Monte Carlo based scenario discovery (Scenario A1), and surrogate assisted scenario discovery (Scenario A2). Comprehensive analysis demonstrates that the monitoring design based on Scenario A2 outperforms both designs based on Scenarios A0 and A1 in terms of robustness when evaluated against the typical scenario set. 
Meanwhile, the performance of the monitoring network deteriorates as the plume uncertainty (noise strength) increases, indicating the significance of reducing parameter uncertainty in groundwater monitoring design. The research findings show that the developed stochastic optimization framework is a computationally efficient and promising tool for multi-objective design of groundwater monitoring networks under uncertainty.

Introduction

Optimal design of a groundwater monitoring network that provides accurate and informative data is crucial for improving our understanding of complex groundwater systems while avoiding large and unnecessary capital expenditure. In general, an optimal sampling design has to balance multiple conflicting objectives, minimizing the cost of data acquisition while maximizing the amount of information, within a linked simulation-optimization framework (Kollat and Reed, 2006, Reed and Kollat, 2013, Luo et al., 2016). However, the groundwater flow and transport simulation model, which is a prerequisite for the simulation-optimization management model and is used to describe and update its state variables, inevitably involves uncertainty associated with hydrogeological parameters. Therefore, management models such as the one used for optimal design of a groundwater monitoring network have to consider the uncertainty derived from model parameters or model structures; otherwise they might lead to erroneous management decisions (Wu et al., 2006, Bayer et al., 2010, Kollat et al., 2011, Alzraiee et al., 2013, Luo et al., 2016). The heterogeneity of spatially distributed parameters (e.g., hydraulic conductivity, K) in subsurface systems can be characterized in field studies using sparse measurement points combined with a geostatistical model (Deutsch and Journel, 1997, Diggle and Ribeiro, 2007). The traditional geostatistical approach uses Monte Carlo (MC) simulation to generate a large number of stochastic realizations of parameter fields based on unbiased estimates at the unmeasured locations. The uncertainty quantification (UQ) of model outputs can then be carried out by running the simulation model on these numerous realizations, which incurs a huge computational expense. 
Furthermore, a stochastic optimization framework generally requires thousands of individual evaluations for each parameter realization in order to search for robust solutions that are insensitive to parameter uncertainty. This is computationally prohibitive in practice: when several thousand realizations are generated, millions of model evaluations in total are needed during the evolutionary search.

Motivated by the above-mentioned difficulties, Wu et al. (2006) exploited a noisy genetic algorithm (NGA) to evaluate each objective function with a much smaller realization set. The NGA can find highly reliable and robust solutions using a dynamic random resampling strategy in which a set of realizations is randomly resampled at each generation and the prior realizations are augmented at the later stage of the search. Singh and Minsker (2008) then developed a multi-objective stochastic optimization approach coupled with NGA. After that, Bayer et al. (2010) exploited a stack-ordering technique to update the realizations in the evaluation subset in order to achieve highly reliable solutions with a lower computational burden than the random resampling scheme. However, the previous studies considered only several thousand realizations using MC sampling and were unable to implement accurate UQ associated with random parameters, which should be an essential task for stochastic optimization. The popular MC method is highly time-consuming for UQ of the stochastic partial differential equations governing groundwater flow and solute transport.
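The resampling idea behind a noisy fitness evaluation can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the NGA of Wu et al. (2006): the function names, the toy objective, and the subset size of 50 are all hypothetical choices made for the example.

```python
import random
from statistics import mean

def noisy_fitness(design, realizations, objective, n_sample, rng):
    """Estimate the expected objective from a random subset of realizations.

    A fresh subset is drawn on every call, so the returned value is a noisy
    estimate of the average over the full realization set.
    """
    subset = rng.sample(realizations, n_sample)
    return mean(objective(design, r) for r in subset)

# Hypothetical toy problem: each "realization" is a single scalar parameter.
def toy_objective(design, realization):
    return (design - realization) ** 2

rng = random.Random(0)
realizations = [rng.gauss(1.0, 0.5) for _ in range(2000)]

# Full Monte Carlo average (2000 evaluations) versus a cheap noisy estimate (50).
full = mean(toy_objective(1.0, r) for r in realizations)
approx = noisy_fitness(1.0, realizations, toy_objective, n_sample=50, rng=rng)
```

Calling `noisy_fitness` once per generation with a new subset mimics the random resampling described above; the trade-off is a 40-fold reduction in evaluations in exchange for sampling noise in each fitness value.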

Computationally efficient surrogate models have attracted increasing attention in recent years (Asher et al., 2015, Razavi et al., 2012). Surrogate assisted UQ uses a much smaller sample set to train an approximate model that fully replaces the expensive simulation model when quantifying the uncertainty of model outputs. Many surrogate models have been successfully applied to UQ with groundwater models and have produced credible approximations compared with CPU-intensive MC simulation (Müller et al., 2011, Zhang et al., 2013, Crevillen-Garcia et al., 2017, Meng and Li, 2017, Mo et al., 2017). Polynomial chaos expansion (PCE) is an effective surrogate method that has been extensively used in UQ (Rajabi et al., 2015, Bazargan et al., 2015, Meng and Li, 2017). The PCE method constructs a series of orthogonal polynomial basis functions to represent the quantities of interest in terms of independent random inputs with different probability distributions (Xiu and Karniadakis, 2002a, Ghanem and Spanos, 2003). Furthermore, non-intrusive PCE is attractive because it allows a well-validated simulator to be treated as a black box when training the surrogate model. In this study, we use stepwise regression and least angle regression (LAR), as proposed by Blatman and Sudret (2010, 2011), to reduce the number of regression points required when the random dimensionality or the polynomial degree is high. Once the PCE coefficients are obtained by the stepwise regression, the statistical moments of the quantities of interest can be easily computed. The spatially correlated K-field is a critical physical property that often dominates groundwater flow and solute transport. 
Also, the Karhunen-Loève (K-L) expansion can be applied to transform the K-field into a low-dimensional random subspace, and it has been used together with PCE to construct highly efficient surrogate models (Li and Zhang, 2007, Li and Zhang, 2013, Meng and Li, 2017), indicating that the K-L technique is promising and effective because it significantly reduces the dimension of the random parameters. Accordingly, in this study we employ the sparse PCE model constructed by the LAR algorithm to implement UQ at the potential monitoring locations under K-field uncertainty.
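For orientation, the two building blocks named above can be written in their standard textbook form. The notation is assumed here rather than taken from the paper: $M$ is the K-L truncation order, $(\lambda_i, f_i)$ are eigenpairs of the covariance function, $\xi_i$ are independent standard random variables, $\mathcal{A}$ is the sparse index set retained by the regression, and the basis $\Psi_{\boldsymbol{\alpha}}$ is taken to be orthonormal with $\Psi_{\mathbf{0}} = 1$. Applying the K-L expansion to the log-conductivity is likewise an assumption, although a common one for log-normal fields.

```latex
% Truncated K-L expansion of the log-conductivity field (assumed notation)
\ln K(\mathbf{x}) \approx \overline{\ln K}(\mathbf{x})
  + \sum_{i=1}^{M} \sqrt{\lambda_i}\, f_i(\mathbf{x})\, \xi_i

% Sparse PCE of a model output, e.g. the concentration at one monitoring location
c(\boldsymbol{\xi}) \approx \sum_{\boldsymbol{\alpha} \in \mathcal{A}}
  a_{\boldsymbol{\alpha}}\, \Psi_{\boldsymbol{\alpha}}(\boldsymbol{\xi})

% First two moments follow directly from the coefficients (orthonormal basis)
\mathbb{E}[c] \approx a_{\mathbf{0}}, \qquad
\operatorname{Var}[c] \approx \sum_{\boldsymbol{\alpha} \in \mathcal{A} \setminus \{\mathbf{0}\}}
  a_{\boldsymbol{\alpha}}^{2}
```

The last line is why the moments are cheap once the coefficients are known: no further runs of the simulator or of the surrogate are needed to obtain the mean and variance.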

Surrogate assisted UQ is the first stage in the stochastic optimization framework for groundwater monitoring network design. To account for uncertainty in the optimization, scenario-based multi-objective evolutionary algorithms (MOEAs) have been exploited to implement robust optimization (e.g., Singh and Minsker, 2008, Sreekanth and Datta, 2011, Luo et al., 2014, Luo et al., 2016, Beh et al., 2015, Sreekanth et al., 2016, Yang et al., 2017, Sankary and Ostfeld, 2018). However, evaluating evolutionary individuals against all possible operational scenarios may be computationally impractical. Therefore, a random resampling strategy is developed in the optimization framework; it draws a much smaller subset from a set of typical scenarios that is capable of approximating the features of all possible scenarios. Sankary and Ostfeld (2018) developed a random resampling scheme that took advantage of statistical bootstrapping to achieve reliable estimates of the uncertainty on unseen samples in the scenario-based MOEA. Their results indicated that an MOEA coupled with the dynamic resampling strategy is able to create robust solutions under unknown operational environments. This study generates a typical scenario set by setting a resolution for the centroid coordinates of a large number of plume scenarios obtained from surrogate assisted scenario discovery, and uses a statistical bootstrapping technique to resample in every generation.
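One way to picture centroid-based scenario filtering followed by bootstrap resampling is sketched below. It is an illustration under assumptions, not the procedure of the paper: the binning rule (square bins of side `resolution`), the choice of the first scenario in each bin as its representative, and all function names are hypothetical.

```python
import random

def plume_centroid(conc, xs, ys):
    """Concentration-weighted centroid of a plume from cell values and coordinates."""
    total = sum(conc)
    cx = sum(c * x for c, x in zip(conc, xs)) / total
    cy = sum(c * y for c, y in zip(conc, ys)) / total
    return cx, cy

def typical_scenarios(centroids, resolution):
    """Keep one scenario index per centroid bin of size resolution x resolution."""
    seen = {}
    for idx, (cx, cy) in enumerate(centroids):
        key = (int(cx // resolution), int(cy // resolution))
        seen.setdefault(key, idx)  # first scenario that lands in a bin represents it
    return sorted(seen.values())

def bootstrap_resample(scenario_ids, n, rng):
    """Draw n scenario ids with replacement, e.g. once per generation."""
    return [rng.choice(scenario_ids) for _ in range(n)]

# Four hypothetical plume centroids (m); the first two fall in the same 10 m bin.
centroids = [(101.0, 52.0), (103.0, 54.0), (121.0, 52.0), (104.0, 71.0)]
typical = typical_scenarios(centroids, resolution=10.0)
batch = bootstrap_resample(typical, n=5, rng=random.Random(0))
```

A coarser resolution merges more plumes into each bin and shrinks the typical set, so the resolution acts as the knob that trades scenario diversity against evaluation cost.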

The groundwater monitoring network design often involves many objectives (typically more than three), which may result in the domination resistance phenomenon (Purshouse and Fleming, 2007, Hadka and Reed, 2013). In this phenomenon, the capacity of Pareto sorting to distinguish optimal solutions weakens, leaving a large number of non-dominated solutions (i.e., solutions for which none of the objective values can be improved without degrading one or more of the other objective values) in the population. Hadka and Reed (2013) developed an auto-adaptive MOEA framework, named Borg, which combines several promising techniques to improve the performance of the algorithm in multi-objective optimization. To enhance the local optimality of solutions, memetic algorithms, which combine natural evolution based on Darwinian principles with cultural evolution capable of local refinement, have been applied to speed up the convergence of MOEAs (Sindhya et al., 2011, Sindhya et al., 2013). In this study, we employ the ε-dominance concept and a modified auto-adaptive multi-operator recombination to alleviate domination resistance, and a bidirectional neighborhood mutation operator to enhance the local optimality of archived solutions. We then integrate these techniques and the fast non-dominated sorting procedure of NSGA-II (Deb et al., 2002) into the framework of NGA (Wu et al., 2006). The resulting epsilon multi-objective noisy memetic algorithm (ε-MONMA) is therefore capable of highly effective global search in many-objective optimization under a noisy environment. Moreover, ε-MONMA integrates the surrogate assisted UQ into a unified optimization framework so as to implement accurate UQ at an acceptable computational cost. Finally, we demonstrate the applicability and reliability of the framework for uncertainty-based groundwater monitoring network design using a synthetic reactive transport aquifer system.
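The ε-dominance concept mentioned above can be illustrated with a generic ε-box archive for minimization problems. The sketch below follows the commonly described ε-box scheme and is not the archive of ε-MONMA itself; the function names, the tie-breaking rule inside a box, and the example objective values are assumptions.

```python
import math

def box(objs, eps):
    """Index of the epsilon-box that contains an objective vector."""
    return tuple(math.floor(f / e) for f, e in zip(objs, eps))

def dominates(a, b):
    """Pareto dominance for minimization."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def update_archive(archive, cand, eps):
    """Return the archive after offering it a candidate objective vector."""
    cb = box(cand, eps)
    kept = []
    for sol in archive:
        sb = box(sol, eps)
        if dominates(sb, cb):
            return archive  # an archived box dominates the candidate's box
        if sb == cb:
            # Same box: keep whichever point is closer to the box's lower corner.
            corner = tuple(i * e for i, e in zip(cb, eps))
            if math.dist(sol, corner) <= math.dist(cand, corner):
                return archive
            continue  # candidate replaces this member
        if dominates(cb, sb):
            continue  # candidate's box dominates this member's box
        kept.append(sol)
    kept.append(cand)
    return kept

eps = (1.0, 1.0)
archive = []
for point in [(5.2, 3.1), (5.5, 3.8), (2.5, 7.5), (9.1, 9.2)]:
    archive = update_archive(archive, point, eps)
```

Because at most one solution survives per box, the archive size is bounded by the box resolution rather than by the number of non-dominated points, which is how ε-dominance counteracts the flood of non-dominated solutions in many-objective problems.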

Section snippets

Methodology

Fig. 1 illustrates the general flowchart for implementing robust optimization. First, we construct a groundwater flow and reactive transport model and implement UQ with sparse PCE for the contaminant concentration at the monitoring wells. Then we use a scenario discovery strategy to generate a typical scenario set and perform stochastic multi-objective optimization with ε-MONMA. In this section, we present only the details of the aforementioned modules. As for the post-optimization analysis

Description of the synthetic aquifer system

The model domain of the hypothetical unconfined aquifer is 600 m in longitudinal extent, 400 m in transverse extent, and 10 m thick. The aquifer is discretized into a uniform model grid of 40 rows and 60 columns. As shown in Fig. 3, the contaminant source comprises immobile LNAPL hydrocarbon compounds (benzene in this study). To account for the geochemical reaction processes driven by biodegradation, a site-specific reaction network was developed to explain the

Assessment of surrogate accuracy

To achieve the desired approximation accuracy of the sparse PCE models, 1,500 samples were generated to train the surrogate model and an additional 100 samples were randomly selected to test its performance. The root mean square error (RMSE) and correlation coefficient (R) are used to evaluate the prediction accuracy on the test dataset. Fig. 4 shows that the predicted concentrations at 112 monitoring locations are in good agreement with the forward model predictions. The plumes interpolated with
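The two accuracy metrics named in this section can be computed as in the following minimal sketch. It assumes that R denotes the Pearson correlation coefficient, which the excerpt does not state explicitly; the function names and the sample values are hypothetical.

```python
import math

def rmse(pred, obs):
    """Root mean square error between surrogate predictions and reference values."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def pearson_r(pred, obs):
    """Pearson correlation coefficient (assumed meaning of R)."""
    n = len(obs)
    mp, mo = sum(pred) / n, sum(obs) / n
    cov = sum((p - mp) * (o - mo) for p, o in zip(pred, obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    return cov / (sp * so)

# Hypothetical surrogate predictions versus forward-model concentrations.
surrogate = [1.0, 2.0, 3.0]
forward = [1.0, 2.0, 5.0]
error = rmse(surrogate, forward)
corr = pearson_r(surrogate, forward)
```

RMSE measures the absolute size of the surrogate error in concentration units, while R only measures linear agreement, so a surrogate with a systematic bias can still score a high R; reporting both guards against that.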

Conclusions

In this study, we have developed a two-stage stochastic optimization framework including UQ coupled with sparse PCE and the epsilon multi-objective noisy memetic algorithm (ε-MONMA) for optimal design of groundwater monitoring network under K-field uncertainty. The effectiveness of the proposed stochastic optimization framework was evaluated through a two-dimensional hypothetical monitoring design application. In the first stage, the sparse PCE models are trained to replace expensive reactive

Declaration of Competing Interest

The authors declare that they have no conflict of interest.

Acknowledgements

This study was financially supported by the National Key Research and Development Plan of China (2016YFC0402800), and the National Natural Science Foundation of China (41730856 and 41772254). The numerical calculations in this paper have been implemented on the IBM Blade cluster system in the High Performance Computing Center of Nanjing University, China. Also, the authors are grateful to the Managing Guest Editor Dr. Behzad Ataie-Ashtiani, the Reviewer Dr. J. Sreekanth and another anonymous

References (64)

  • Z. Lu et al.

    Conditional simulations of flow in randomly heterogeneous porous media using a KL-based moment-equation approach

    Adv. Water Resour.

    (2004)
  • Q.K. Luo et al.

    Optimal design of groundwater remediation system using a probabilistic multi-objective fast harmony search algorithm under uncertainty

    J. Hydrol.

    (2014)
  • Q.K. Luo et al.

    Multi-objective optimization of long-term groundwater monitoring network design using a probabilistic Pareto genetic algorithm under uncertainty

    J. Hydrol.

    (2016)
  • J. Meng et al.

    An efficient stochastic approach for flow in porous media via sparse polynomial chaos expansion constructed by feature selection

    Adv. Water Resour.

    (2017)
  • F. Müller et al.

    Probabilistic collocation and lagrangian sampling for advective tracer transport in randomly heterogeneous porous media

    Adv. Water Resour.

    (2011)
  • M.M. Rajabi et al.

    Polynomial chaos expansions for uncertainty propagation and moment independent sensitivity analysis of seawater intrusion simulations

    J. Hydrol.

    (2015)
  • P.M. Reed et al.

    Visual analytics clarify the scalability and effectiveness of massively parallel many-objective optimization: a groundwater monitoring design example

    Adv. Water Resour.

    (2013)
  • J. Sreekanth et al.

    Pareto-based efficient stochastic simulation-optimization for robust and reliable groundwater management

    J. Hydrol.

    (2016)
  • J.F. Wu et al.

    Cost effective sampling network design for contaminant plume monitoring under general hydrogeological conditions

    J. Contam. Hydrol.

    (2005)
  • J.F. Wu et al.

    A comparative study of Monte Carlo simple genetic algorithm and noisy genetic algorithm for cost effective sampling network design under uncertainty

    Adv. Water Resour.

    (2006)
  • D. Xiu et al.

    Modeling uncertainty in steady state diffusion problems via generalized polynomial chaos

    Comput. Methods Appl. Mech. Eng.

    (2002)
  • D. Zhang et al.

    An efficient, high-order perturbation approach for flow in random porous media via Karhunen-Loève and polynomial expansion

    J. Comput. Phys.

    (2004)
  • A.H. Alzraiee et al.

    Multiobjective design of aquifer monitoring networks for optimal spatial prediction and geostatistical parameter estimation

    Water Resour. Res.

    (2013)
  • M.J. Asher et al.

    A review of surrogate models and their application to groundwater modeling

    Water Resour. Res.

    (2015)
  • R. Askey et al.

    Some basic hypergeometric polynomials that generalize Jacobi polynomials

  • P.M. Bayer et al.

    Optimization of high-reliability-based hydrological design problems by robust automatic sampling of critical model realizations

    Water Resour. Res.

    (2010)
  • E.H.Y. Beh et al.

    Adaptive, multiobjective optimal sequencing approach for urban water supply augmentation under deep uncertainty

    Water Resour. Res.

    (2015)
  • Deb, K., Mohan, M., Mishra, S., 2003. A fast multi-objective evolutionary algorithm for finding well-spread...
  • K. Deb et al.

    A fast and elitist multi-objective genetic algorithm: NSGA-II

    IEEE Trans. Evol. Comput.

    (2002)
  • K. Deb et al.

    Multi-scenario, multi-objective optimization using evolutionary algorithms: initial results

    2015 IEEE Congress on Evolutionary Computation, Sendai, Japan

    (2015)
  • C.V. Deutsch et al.

    Geostatistical Software Library and User’s Guide (GSLIB)

    (1997)
  • P.J. Diggle et al.

    Model-Based Geostatistics

    (2007)