Research papers
Surrogate assisted multi-objective robust optimization for groundwater monitoring network design
Introduction
Optimal design of a groundwater monitoring network capable of providing accurate and informative data is crucial for improving our understanding of complex groundwater systems while avoiding large and unnecessary capital expenditure. In general, an optimal sampling design must balance multiple conflicting objectives, minimizing the cost of data acquisition while maximizing the information obtained, within a linked simulation-optimization framework (Kollat and Reed, 2006, Reed and Kollat, 2013, Luo et al., 2016). However, the groundwater flow and transport simulation model, which is a prerequisite for the simulation-optimization management model and is used to describe and update the state variables in that model, inevitably involves uncertainty associated with hydrogeological parameters. Therefore, management models such as those used for optimal design of groundwater monitoring networks have to consider model uncertainty arising from model parameters or model structures; otherwise they might lead to erroneous management decisions (Wu et al., 2006, Bayer et al., 2010, Kollat et al., 2011, Alzraiee et al., 2013, Luo et al., 2016). The heterogeneity of spatially distributed parameters (e.g., hydraulic conductivity, K) in subsurface systems can be characterized in field studies using sparse measurement points combined with a geostatistical model (Deutsch and Journel, 1997, Diggle and Ribeiro, 2007). The traditional geostatistical approach uses Monte Carlo (MC) simulation to generate a large number of stochastic realizations of parameter fields based on unbiased estimates at the unmeasured locations. Uncertainty quantification (UQ) of model outputs can then be carried out by running the simulation model on these numerous realizations, which incurs a huge computational expense.
Furthermore, a stochastic optimization framework generally requires thousands of individual evaluations for each parameter realization in order to search for robust solutions that are insensitive to parameter uncertainty. This is computationally prohibitive in practice: when several thousand realizations are generated, millions of model evaluations in total are needed during the evolutionary search.
Motivated by the above-mentioned difficulties, Wu et al. (2006) exploited a noisy genetic algorithm (NGA) to evaluate each objective function with a much smaller realization set. The NGA can find highly reliable and robust solutions using a dynamic random resample strategy, in which a set of realizations is randomly resampled at each generation and the prior realizations are augmented at the later stage of the search. Singh and Minsker (2008) then developed multi-objective stochastic optimization coupled with the NGA. After that, Bayer et al. (2010) exploited a stack-ordering technique to update the realizations in the evaluation subset in order to achieve highly reliable solutions with a lower computational burden than the random resample scheme. However, these previous studies considered only several thousand realizations from MC sampling and were unable to implement accurate UQ associated with random parameters, which should be an essential task for stochastic optimization. The difficulty is that the popular MC method is highly time-consuming for UQ of the stochastic partial differential equations governing groundwater flow and solute transport.
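The per-generation resampling idea can be illustrated with a minimal sketch. It is not the NGA of Wu et al. (2006): the objective `sampling_error`, the synthetic lognormal concentration table and the subset size are all hypothetical, and the later-stage augmentation of realizations is omitted. The sketch only shows how a fitness averaged over a small random subset of realizations gives a noisy estimate of the fitness averaged over the full set.

```python
import numpy as np

def sampling_error(design, conc):
    """Hypothetical objective: fraction of plume mass missed by the sampled wells."""
    total = conc.sum()
    return (total - conc[design].sum()) / total

def noisy_fitness(design, realizations, n_sub, rng):
    """Average the objective over a random subset of realizations (no replacement)."""
    idx = rng.choice(len(realizations), size=n_sub, replace=False)
    return float(np.mean([sampling_error(design, realizations[i]) for i in idx]))

rng = np.random.default_rng(42)
# Synthetic stand-in for simulated concentrations: 2000 realizations x 30 wells
realizations = rng.lognormal(mean=0.0, sigma=1.0, size=(2000, 30))
design = np.zeros(30, dtype=bool)
design[:10] = True  # sample the first 10 wells

# Reference value using every realization (the expensive option)
full = float(np.mean([sampling_error(design, r) for r in realizations]))
# A fresh subset of 20 realizations is drawn for each of 5 "generations"
per_generation = [noisy_fitness(design, realizations, 20, rng) for _ in range(5)]
print(full, per_generation)
```

Each entry of `per_generation` differs because a new subset is drawn every time, which is the noise a noisy evolutionary algorithm has to tolerate.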
Computationally efficient surrogate models have attracted increasing attention in recent years (Asher et al., 2015, Razavi et al., 2012). Surrogate assisted UQ uses a much smaller sample set to train an approximate model, which then fully replaces the expensive simulation model for quantifying the uncertainty of model outputs. Many surrogate models have been successfully applied to UQ with groundwater models and have produced credible approximations compared with CPU-intensive MC simulation (Müller et al., 2011, Zhang et al., 2013, Crevillen-Garcia et al., 2017, Meng and Li, 2017, Mo et al., 2017). Polynomial chaos expansion (PCE) is an effective surrogate method that has been used extensively in UQ (Rajabi et al., 2015, Bazargan et al., 2015, Meng and Li, 2017). The PCE method constructs a series of orthogonal polynomial basis functions to represent the quantities of interest in terms of independent random inputs with different probability distributions (Xiu and Karniadakis, 2002a, Ghanem and Spanos, 2003). Furthermore, non-intrusive PCE is more attractive because it allows a well-validated simulator to be treated as a black box when training the surrogate model. In this study, we utilize stepwise regression and least angle regression (LAR) proposed by Blatman and Sudret (2010, 2011) to reduce the number of regression points required when the random dimensionality or the polynomial degree is high. Once the PCE coefficients are obtained by the stepwise regression, the statistical moments of the quantities of interest can be easily computed. The spatially correlated K-field is a critical physical property that always dominates the processes of groundwater flow and solute transport.
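To make the regression-based, non-intrusive PCE concrete, the following sketch fits a one-dimensional expansion in probabilists' Hermite polynomials to samples of a black-box function and reads the mean and variance directly from the coefficients. It uses ordinary least squares rather than the stepwise regression or LAR used in this study, and the function `black_box`, the sample size and the polynomial degree are illustrative assumptions only.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermevander

def black_box(xi):
    """Hypothetical stand-in for an expensive simulator with one N(0, 1) input."""
    return np.exp(0.3 * xi)

def fit_pce(xi, y, degree):
    """Least-squares PCE coefficients in the probabilists' Hermite basis He_n."""
    design_matrix = hermevander(xi, degree)  # column n holds He_n(xi)
    coef, *_ = np.linalg.lstsq(design_matrix, y, rcond=None)
    return coef

def pce_moments(coef):
    """Mean and variance from the coefficients, using E[He_n^2] = n!."""
    norms = np.array([math.factorial(n) for n in range(len(coef))], dtype=float)
    mean = coef[0]
    variance = float(np.sum(coef[1:] ** 2 * norms[1:]))
    return float(mean), variance

rng = np.random.default_rng(0)
xi_train = rng.standard_normal(200)          # training inputs
coef = fit_pce(xi_train, black_box(xi_train), degree=6)
pce_mean, pce_var = pce_moments(coef)

# Analytical moments of exp(0.3 * xi) for comparison
exact_mean = math.exp(0.045)
exact_var = math.exp(0.09) * (math.exp(0.09) - 1.0)
print(pce_mean, exact_mean, pce_var, exact_var)
```

The moments come from the coefficients alone, with no further simulator calls, which is the property that makes the surrogate attractive for UQ.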
Also, the Karhunen-Loève (K-L) expansion can be applied to transform the K-field into a low-dimensional random subspace, and has been used to construct highly efficient surrogate models integrated with the PCE (Li and Zhang, 2007, Li and Zhang, 2013, Meng and Li, 2017), indicating that the K-L technique is promising and effective owing to its ability to significantly reduce the dimension of the random parameters. Accordingly, in this study we employ the sparse PCE model constructed by the LAR algorithm to implement UQ at the potential monitoring locations under K-field uncertainty.
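The dimension reduction offered by the K-L expansion can be sketched in its discrete form, as an eigendecomposition of the covariance matrix on a grid. The example below uses a one-dimensional transect of 60 cells at 10 m spacing; the exponential covariance model, unit variance, 100 m correlation length, mean log-conductivity and the choice of 10 retained modes are assumptions made for illustration, not values taken from this study.

```python
import numpy as np

def kl_modes(x, variance, corr_len, n_modes):
    """Leading eigenpairs of an exponential covariance matrix on the points x."""
    dist = np.abs(x[:, None] - x[None, :])
    cov = variance * np.exp(-dist / corr_len)
    eigval, eigvec = np.linalg.eigh(cov)          # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:n_modes]    # keep the n_modes largest
    return eigval[order], eigvec[:, order], float(np.trace(cov))

def sample_log_k(mean_log_k, eigval, eigvec, xi):
    """Truncated K-L realization: mean + sum_i sqrt(lambda_i) * phi_i * xi_i."""
    return mean_log_k + eigvec @ (np.sqrt(eigval) * xi)

x = np.linspace(5.0, 595.0, 60)                   # cell centres, 10 m spacing
eigval, eigvec, total_var = kl_modes(x, variance=1.0, corr_len=100.0, n_modes=10)
energy = float(eigval.sum() / total_var)          # fraction of variance retained

rng = np.random.default_rng(1)
xi = rng.standard_normal(10)                      # 10 random inputs instead of 60
log_k = sample_log_k(-4.0, eigval, eigvec, xi)
print(energy, log_k.shape)
```

Here a field defined on 60 cells is driven by only 10 independent standard normal variables, and those variables are what a PCE surrogate would take as inputs.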
Surrogate assisted UQ is the first stage in the stochastic optimization framework for groundwater monitoring network design. To account for uncertainty in the optimization, scenario-based multi-objective evolutionary algorithms (MOEAs) have been used to implement robust optimization (e.g., Singh and Minsker, 2008, Sreekanth and Datta, 2011, Luo et al., 2014, Luo et al., 2016, Beh et al., 2015, Sreekanth et al., 2016, Yang et al., 2017, Sankary and Ostfeld, 2018). However, evaluating evolutionary individuals against all possible operational scenarios may be computationally impractical. Therefore, a random resample strategy is developed in the optimization framework: it selects a much smaller subset from a set of typical scenarios that can approximate the features of all possible scenarios. Sankary and Ostfeld (2018) developed a random resampling scheme that took advantage of statistical bootstrapping to obtain reliable estimates of the uncertainty on unseen samples in the scenario-based MOEA. Their results indicated that an MOEA coupled with the dynamic resample strategy could create robust solutions under unknown operational environments. The present study generates a typical scenario set by setting the resolution of the centroid coordinates of a large number of plume scenarios obtained from surrogate assisted scenario discovery, and uses the statistical bootstrapping technique to resample in every generation.
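One plausible reading of the centroid-resolution step is sketched below; the exact procedure is not given in this excerpt, so every detail here is an assumption. Plume centroids are computed as concentration-weighted mean coordinates, snapped to a coarse grid of a chosen resolution, and one representative scenario is kept per occupied grid cell. A bootstrap draw (sampling with replacement) from these representatives then supplies the scenarios for one generation. The synthetic Gaussian plumes, the 50 m resolution and the function names are hypothetical.

```python
import numpy as np

def plume_centroids(conc, xs, ys):
    """Concentration-weighted centroid (x, y) of each scenario; conc is (n, rows, cols)."""
    mass = conc.sum(axis=(1, 2))
    cx = (conc * xs[None, None, :]).sum(axis=(1, 2)) / mass
    cy = (conc * ys[None, :, None]).sum(axis=(1, 2)) / mass
    return np.column_stack([cx, cy])

def typical_scenarios(centroids, resolution):
    """Keep the first scenario falling in each occupied resolution-sized bin."""
    bins = np.floor(centroids / resolution).astype(int)
    _, first_idx = np.unique(bins, axis=0, return_index=True)
    return np.sort(first_idx)

def bootstrap_sample(indices, n_draw, rng):
    """Draw scenario indices with replacement, as done once per generation."""
    return rng.choice(indices, size=n_draw, replace=True)

rng = np.random.default_rng(7)
xs = np.arange(5.0, 600.0, 10.0)                  # 60 column centres
ys = np.arange(5.0, 400.0, 10.0)                  # 40 row centres
X, Y = np.meshgrid(xs, ys)                        # each of shape (40, 60)
centres = rng.uniform([100.0, 100.0], [500.0, 300.0], size=(500, 2))
# Synthetic Gaussian plumes standing in for surrogate-predicted concentrations
conc = np.exp(-((X[None] - centres[:, 0, None, None]) ** 2
                + (Y[None] - centres[:, 1, None, None]) ** 2) / (2 * 40.0 ** 2))

centroids = plume_centroids(conc, xs, ys)
typical = typical_scenarios(centroids, resolution=50.0)
draw = bootstrap_sample(typical, n_draw=10, rng=rng)
print(len(typical), draw)
```

A coarser resolution yields fewer typical scenarios and cheaper generations, at the price of a rougher approximation of the full scenario set.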
Groundwater monitoring network design often involves many objectives (typically more than three), which may give rise to the domination resistance phenomenon (Purshouse and Fleming, 2007, Hadka and Reed, 2013). In this phenomenon, the ability of Pareto sorting to distinguish between solutions weakens, leaving a large number of non-dominated solutions (i.e., solutions for which no objective value can be improved without degrading one or more of the others) in the population. Hadka and Reed (2013) developed an auto-adaptive MOEA framework, named Borg, which combines several promising techniques to improve the performance of the algorithm in multi-objective optimization. To enhance the local optimality of solutions, memetic algorithms, which combine natural evolution based on Darwinian principles with cultural evolution capable of local refinement, have been applied to speed up the convergence of MOEAs (Sindhya et al., 2011, Sindhya et al., 2013). In this study, we employ the ε-dominance concept and a modified auto-adaptive multi-operator recombination to alleviate domination resistance, and a bidirectional neighborhood mutation operator to enhance the local optimality of archived solutions. We then integrate these techniques and the fast non-dominated sorting procedure of NSGA-II (Deb et al., 2002) into the framework of NGA (Wu et al., 2006). The resulting epsilon multi-objective noisy memetic algorithm (ε-MONMA) is thus capable of highly effective global search in many-objective optimization under noisy environments. Moreover, ε-MONMA integrates the surrogate assisted UQ into a unified optimization framework so as to achieve accurate UQ at an acceptable computational cost. Finally, we demonstrate the applicability and reliability of the framework for uncertainty-based groundwater monitoring network design through a synthetic reactive transport aquifer system.
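The way ε-dominance limits the number of non-dominated solutions can be illustrated with a common ε-box archive rule for minimization problems. Whether ε-MONMA uses exactly this rule is not stated in this excerpt, so the sketch should be read as a generic illustration; the function names and the value ε = 0.1 are assumptions. Objective space is divided into boxes of side ε, dominance is checked between box indices, and at most one solution is kept per box (the one closest to the box's lower corner).

```python
import numpy as np

def dominates(a, b):
    """Pareto dominance for minimization: a is no worse everywhere and better somewhere."""
    return bool(np.all(a <= b) and np.any(a < b))

def eps_archive_update(archive, candidate, eps):
    """Return the archive after offering it one candidate objective vector."""
    cand = np.asarray(candidate, dtype=float)
    cand_box = np.floor(cand / eps)
    kept = []
    for member in archive:
        box = np.floor(member / eps)
        if dominates(box, cand_box):
            return archive                        # candidate is epsilon-dominated
        if np.array_equal(box, cand_box):
            corner = box * eps
            if np.linalg.norm(member - corner) <= np.linalg.norm(cand - corner):
                return archive                    # incumbent in the same box is closer
            continue                              # candidate replaces the incumbent
        if dominates(cand_box, box):
            continue                              # member is epsilon-dominated, drop it
        kept.append(member)
    kept.append(cand)
    return kept

archive = []
for point in [(0.15, 0.85), (0.18, 0.82), (0.55, 0.45), (0.95, 0.95)]:
    archive = eps_archive_update(archive, point, eps=0.1)
print(archive)
```

In this run, the second point falls in the same box as the first and is rejected, and the fourth is dominated at box level, so only two of the four candidates are archived. This is how the archive stays bounded even when many objectives make ordinary Pareto sorting indecisive.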
Methodology
Fig. 1 illustrates the general flowchart for implementing robust optimization. First, we construct a groundwater flow and reactive transport model and implement UQ with sparse PCE for the contaminant concentration at the monitoring wells. Then we exploit a scenario discovery strategy to generate a typical scenario set and perform stochastic multi-objective optimization with ε-MONMA. In this section, we only present the details of the aforementioned modules. As for the post-optimization analysis
Description of the synthetic aquifer system
The model domain of the hypothetical unconfined aquifer is 600 m in longitudinal extent and 400 m in transverse extent, with a thickness of 10 m. The aquifer is discretized into a uniform model grid of 40 rows and 60 columns. As shown in Fig. 3, the contaminant source comprises immobile LNAPL hydrocarbon compounds (benzene in this study). To account for the geochemical reaction processes driven by biodegradation, a site-specific reaction network was developed to explain the
Assessment of surrogate accuracy
To achieve the desired approximation accuracy of the sparse PCE models, 1,500 samples were generated to train the surrogate models and an additional 100 samples were randomly selected to test their performance. The root mean square error (RMSE) and correlation coefficient (R) are used to evaluate the prediction accuracy on the test dataset. Fig. 4 shows that the predicted concentrations at 112 monitoring locations are in good agreement with the forward model predictions. The plumes interpolated with
Conclusions
In this study, we have developed a two-stage stochastic optimization framework including UQ coupled with sparse PCE and the epsilon multi-objective noisy memetic algorithm (ε-MONMA) for optimal design of groundwater monitoring network under K-field uncertainty. The effectiveness of the proposed stochastic optimization framework was evaluated through a two-dimensional hypothetical monitoring design application. In the first stage, the sparse PCE models are trained to replace expensive reactive
Declaration of Competing Interest
The authors declare that they have no conflict of interest.
Acknowledgements
This study was financially supported by the National Key Research and Development Plan of China (2016YFC0402800), and the National Natural Science Foundation of China (41730856 and 41772254). The numerical calculations in this paper have been implemented on the IBM Blade cluster system in the High Performance Computing Center of Nanjing University, China. Also, the authors are grateful to the Managing Guest Editor Dr. Behzad Ataie-Ashtiani, the Reviewer Dr. J. Sreekanth and another anonymous
References (64)
- et al.
A robust and efficient stepwise regression method for building sparse polynomial chaos expansions
J. Comput. Phys.
(2017)
- et al.
Surrogate accelerated sampling of reservoir models with complex structures using sparse polynomial chaos expansion
Adv. Water Resour.
(2015)
- et al.
Sparse polynomial chaos expansions and adaptive stochastic finite elements using a regression approach
C. R. Méc.
(2008)
- et al.
An adaptive algorithm to build up sparse polynomial chaos expansions for stochastic finite element analysis
Probab. Eng. Mech.
(2010)
- et al.
Adaptive sparse polynomial chaos expansion based on Least Angle Regression
J. Comput. Phys.
(2011)
- et al.
Modelling of transport and biogeochemical processes in pollution plumes: literature review and model development
J. Hydrol.
(2002)
- et al.
Modelling the fate of styrene in a mixed petroleum hydrocarbon plume
J. Contam. Hydrol.
(2009)
- et al.
Gaussian process modelling for uncertainty quantification in convectively enhanced dissolution processes in porous media
Adv. Water Resour.
(2017)
- et al.
Comparing state-of-the-art evolutionary multiobjective algorithms for long-term groundwater monitoring design
Adv. Water Resour.
(2006)
- et al.
A computational scaling analysis of multiobjective evolutionary algorithms in long-term groundwater monitoring applications
Adv. Water Resour.
(2007)
Conditional simulations of flow in randomly heterogeneous porous media using a KL-based moment-equation approach
Adv. Water Resour.
Optimal design of groundwater remediation system using a probabilistic multi-objective fast harmony search algorithm under uncertainty
J. Hydrol.
Multi-objective optimization of long-term groundwater monitoring network design using a probabilistic Pareto genetic algorithm under uncertainty
J. Hydrol.
An efficient stochastic approach for flow in porous media via sparse polynomial chaos expansion constructed by feature selection
Adv. Water Resour.
Probabilistic collocation and lagrangian sampling for advective tracer transport in randomly heterogeneous porous media
Adv. Water Resour.
Polynomial chaos expansions for uncertainty propagation and moment independent sensitivity analysis of seawater intrusion simulations
J. Hydrol.
Visual analytics clarify the scalability and effectiveness of massively parallel many-objective optimization: a groundwater monitoring design example
Adv. Water Resour.
Pareto-based efficient stochastic simulation-optimization for robust and reliable groundwater management
J. Hydrol.
Cost effective sampling network design for contaminant plume monitoring under general hydrogeological conditions
J. Contam. Hydrol.
A comparative study of Monte Carlo simple genetic algorithm and noisy genetic algorithm for cost effective sampling network design under uncertainty
Adv. Water Resour.
Modeling uncertainty in steady state diffusion problems via generalized polynomial chaos
Comput. Meth. Appl. Mech. Eng.
An efficient, high-order perturbation approach for flow in random porous media via Karhunen-Loève and polynomial expansion
J. Comput. Phys.
Multiobjective design of aquifer monitoring networks for optimal spatial prediction and geostatistical parameter estimation
Water Resour. Res.
A review of surrogate models and their application to groundwater modeling
Water Resour. Res.
Some basic hypergeometric polynomials that generalize Jacobi polynomials
Optimization of high-reliability-based hydrological design problems by robust automatic sampling of critical model realizations
Water Resour. Res.
Adaptive, multiobjective optimal sequencing approach for urban water supply augmentation under deep uncertainty
Water Resour. Res.
A fast and elitist multi-objective genetic algorithm: NSGA-II
IEEE Trans. Evol. Comput.
Multi-scenario, multi-objective optimization using evolutionary algorithms: initial results
2015 IEEE Congress on Evolutionary Computation, Sendai, Japan
Geostatistical Software Library and User’s Guide (GSLIB)
Model-Based Geostatistics