
Journal of Computational Physics

Volume 368, 1 September 2018, Pages 154-178

Multilevel Sequential2 Monte Carlo for Bayesian inverse problems

https://doi.org/10.1016/j.jcp.2018.04.014

Highlights

  • The Multilevel SMC sampler approximates the solution of a Bayesian inverse problem.

  • The sampler can be used consistently with black box models and in the small noise limit.

  • The sampler performs well in high-dimensional sample spaces.

Abstract

The identification of parameters in mathematical models using noisy observations is a common task in uncertainty quantification. We employ the framework of Bayesian inversion: we combine observational data with prior information to estimate the posterior distribution of a parameter. Specifically, we are interested in the distribution of a diffusion coefficient of an elliptic PDE. In this setting, the sample space is high-dimensional, and each sample of the PDE solution is expensive. To address these issues we propose and analyse a novel Sequential Monte Carlo (SMC) sampler for the approximation of the posterior distribution. Classical, single-level SMC constructs a sequence of measures, starting with the prior distribution and finishing with the posterior distribution. The intermediate measures arise from a tempering of the likelihood, or, equivalently, a rescaling of the noise, while the resolution of the PDE discretisation is fixed. In contrast, our estimator employs a hierarchy of PDE discretisations to decrease the computational cost. We construct a sequence of intermediate measures by either decreasing the temperature or increasing the discretisation level. This idea builds on and generalises the multi-resolution sampler proposed by P.S. Koutsourelakis (2009) [33], where a bridging scheme is used to transfer samples from coarse to fine discretisation levels. Importantly, our choice between tempering and bridging is fully adaptive. We present numerical experiments in 2D space, comparing our estimator to single-level SMC and the multi-resolution sampler.

Introduction

In science and engineering we use mathematical models to simulate and understand physical processes. These models require input parameters. Once the parameters are specified, we can solve the so-called forward problem to obtain output quantities of interest. In this work we focus on models that involve partial differential equations (PDEs). To date, approximate forward solvers are available for many PDE-based models, and output quantities of interest can be approximated efficiently. In contrast, the identification of input parameters (the inverse problem) is more challenging. Often the physical process is only given implicitly by observations (data, measurements). These measurements are typically noisy and/or sparse; they do not contain sufficient information about the underlying parameter, or are disturbed in such a way that the true parameter cannot be recovered at all. The inverse problem is ill-posed.

A classical example is the simulation of steady-state groundwater flow to assess the safety of proposed long-term radioactive waste repositories. The quantity of interest is the travel time of radioactive particles to the boundary of a safety zone. The simulation requires the hydraulic conductivity of the ground; it can be observed implicitly by pumping tests, and by pressure measurements. The objective of the groundwater flow inverse problem is the identification of the conductivity. In this example, the mathematical model involves an elliptic PDE. The groundwater flow inverse problem is well known, see e.g. [13], [14], [36], [45], [47].

In contrast to deterministic regularisation techniques, the Bayesian approach to inverse problems uses the probabilistic framework of Bayesian inference. Bayesian inference is built on Bayes' Formula in the formulation given by Laplace [34, II.1]. We remark that other formulations are possible, see e.g. the work by Matthies et al. [38]. We make use of the mathematical framework for Bayesian Inverse Problems (BIPs) given by Stuart [48]. Under weak assumptions – which we will give below – one can show that the BIP is well-posed. The solution of the BIP is the conditional probability measure of the unknown parameter given the observations.

The Bayesian framework is very general and can handle different types of forward models. However, in this work we consider PDE-based forward models, and in particular an elliptic PDE. The exact solution of the associated BIP is often inaccessible for two reasons: (i) there is no closed-form expression for the posterior measure, and (ii) the underlying PDE cannot be solved analytically. We focus on (i), and study efficient approximations to the full posterior measure. Alternatively, one could approximate only the expectation of output quantities of interest with respect to the posterior measure, or estimate the model evidence, i.e. the normalisation constant of the posterior measure.

Typically, BIPs are approached with sampling-based methods, such as Markov Chain Monte Carlo (MCMC) or Importance Sampling. Classical MCMC samplers are the algorithms suggested by Metropolis et al. [39] and the generalisation by Hastings [26]. Advanced MCMC methods for BIP settings are Hamiltonian Monte Carlo [7] and preconditioned Crank–Nicolson MCMC [6], [11]. A disadvantage of MCMC samplers is that it is often difficult to assess their convergence after an initial burn-in phase. Importance Sampling [1], on the other hand, does not require burn-in. However, Importance Sampling is inefficient if the sampling density differs significantly from the target density. For these reasons we employ Sequential Monte Carlo (SMC) [10], [15], [41] to approximate the posterior measure. SMC was initially developed to approximate sequences of measures which arise from time-dependent estimation problems in data assimilation. In our setting, since the elliptic PDE models a steady-state process, the SMC sequences are constructed artificially such that, starting from the prior measure, they gradually approach the posterior measure. Artificial sequences of measures also arise in simulated annealing [15], the estimation of rare events [43], model selection [50], and bridging [20].
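To make the mechanics of such a sampler concrete, the following minimal sketch shows one generic SMC update between two consecutive intermediate measures: reweight by the (unnormalised) density ratio, resample, and then move the particles with an MCMC kernel that leaves the next measure invariant. The function names (log_ratio, mcmc_move) are illustrative placeholders, not the implementation of any of the samplers cited above.

    # Generic SMC update between consecutive intermediate measures (sketch).
    # log_ratio(theta): log of the unnormalised density ratio d(pi_next)/d(pi_current).
    # mcmc_move(theta, rng): one step of a pi_next-invariant Markov kernel.
    import numpy as np

    def smc_update(particles, log_ratio, mcmc_move, rng):
        # 1. Reweight: importance weights between the two measures.
        logw = np.array([log_ratio(theta) for theta in particles])
        logw -= logw.max()                  # stabilise the exponentials
        w = np.exp(logw)
        w /= w.sum()

        # 2. Resample (multinomial) to equalise the weights.
        n = len(particles)
        idx = rng.choice(n, size=n, p=w)
        resampled = [particles[i] for i in idx]

        # 3. Move: rejuvenate the particles with the MCMC kernel.
        return [mcmc_move(theta, rng) for theta in resampled]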

In some situations it is convenient to determine the artificial sequences “on the fly” during the execution of the algorithm. The associated method is termed adaptive SMC; see [19], [29] for a discussion, and [2] for a careful analysis. A well-known drawback is the fact that adaptive SMC returns a biased model evidence estimate; however, the model evidence is not the major focus of our work. The estimation of the model evidence with non-adaptive SMC is discussed in [20], [42].

The major advantage of SMC is its dimension-independent convergence, which is often observed in practice and which can be proved e.g. for uniformly bounded update densities [5]. Thus SMC can be used in high- and infinite-dimensional settings. See [44] for a discussion of this point. Similar results are also known for the Ensemble Kalman Filter (EnKF) with a finite number of particles applied to linear inverse problems [47]. The EnKF is a linearised version of SMC and has been applied to linear and nonlinear inverse problems (see [28]).

SMC has already been used to solve BIPs where the forward model is an elliptic [5] or Navier–Stokes equation [31]. The computational challenge is that PDE-based forward solves are in general very expensive. Thus every sample required by standard samplers such as MCMC or SMC is expensive. The total computational budget might allow only a few samples, and thus the sampling error can be considerable. We handle this problem by constructing a multilevel SMC sampler. To do this we assume that the PDE can be discretised with multiple levels of accuracy. In our work these levels are associated with different mesh sizes in a spatial domain. However, it is also possible to consider e.g. different time step sizes, or target accuracies of Newton's method.

Multilevel samplers enjoy considerable attention at the moment, and are available for various tasks in uncertainty quantification. Multilevel Monte Carlo is widely used in forward uncertainty quantification; see [24] for an overview. In the pioneering work by Giles [23] the multilevel idea is combined with standard Monte Carlo in a forward setting. However, it can also be used with other samplers such as MCMC, SMC, and the EnKF, for the estimation of rare events, for filtering problems in data assimilation, and to solve Bayesian Inverse Problems. For example, multilevel Ensemble Kalman Filters have been proposed in [9], [27]. The authors in [27] consider continuous-time data assimilation with multiple time-step discretisations. In contrast, the work in [9] considers data assimilation for spatially extended models, e.g. time-dependent stochastic partial differential equations. The multilevel estimation of rare events with an SMC-type method has been proposed in [49].

For Bayesian Inverse Problems a multilevel MCMC method has been introduced in [18]. Multilevel Sequential Monte Carlo is introduced in [4] and further discussed in [3], [17], [16]. Both the multilevel MCMC and multilevel SMC use coarse PDE discretisations for variance reduction with the help of a telescoping sum expansion. Moreover, these multilevel samplers are built to integrate output quantities of interest with respect to the posterior. In contrast, in our work we do not rely on a telescoping sum, and we construct approximations to the full posterior measure.

We build on and generalise the work by Koutsourelakis in [33]. As in [33] we combine SMC with tempering on a fixed PDE discretisation level, and bridging between two consecutive discretisation levels. The major novel contribution of our work is a fully adaptive algorithm to decide when to increase the discretisation accuracy (bridging) and when to proceed with SMC (tempering). In numerical experiments we show that our sampler – termed Multilevel Sequential2 Monte Carlo – gives an approximation accuracy similar to single-level SMC while decreasing the computational cost substantially. Our method also works consistently in the small noise limit. We note that a similar SMC-based multilevel method has been proposed in [8]; in that work the model error is given as part of the measurement noise and is updated iteratively.

The remainder of this paper is organised as follows. In §2 we formulate the Bayesian inverse problem and discuss its discretisation. Moreover, we review the basic idea of sequential Monte Carlo. In §3 we give an overview of classical constructions for the intermediate SMC measures, in particular, tempering, bridging, and multilevel bridging. In §3.4 we discuss adaptive SMC samplers where the inverse temperature is selected to bound the effective sample size or, equivalently, the coefficient of variation of the sample weights. The major contribution of this work is presented in §4 where we introduce the Multilevel Sequential2 Monte Carlo sampler. We discuss its computational cost, and suggest an adaptive update scheme for the combination of bridging and tempering. This update scheme is consistent with the adaptive bridging and tempering discussed in §3.4. Finally, in §5 we present numerical experiments for a test problem in 2D space. In §6 we give a summary and an outlook of our work.

Section snippets

Bayesian inverse problem

We consider engineering systems or models subject to an uncertain parameter $\theta \in X$. The parameter space $X$ is a separable Banach space. The forward response operator $\mathcal{G} : X \to Y$ maps the parameter to the (finite-dimensional) data space $Y := \mathbb{R}^{N_{\mathrm{obs}}}$. We are interested in models where $\mathcal{G} = \mathcal{O} \circ G$ is the composition of an observation operator $\mathcal{O}$ and a solution operator $G$ of a partial differential equation. In addition, we consider a quantity of interest $Q : X \to \mathbb{R}$ which depends on the parameter $\theta \in X$.

Observations $y \in Y$ are
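(The snippet above is truncated in this preview.) To illustrate how this setup is typically used, the following sketch assumes additive Gaussian observational noise, $y = \mathcal{G}(\theta) + \eta$ with $\eta \sim N(0, \Gamma)$, which is the standard choice in the framework of [48]; the unnormalised posterior density is then the prior density multiplied by $\exp(-\Phi(\theta; y))$ with the least-squares potential $\Phi$. All names below are illustrative.

    # Sketch of the standard Bayesian setup with additive Gaussian noise
    # (an assumption for illustration, not taken from this excerpt).
    import numpy as np

    def potential(theta, y, forward_map, noise_cov):
        """Phi(theta; y) = 0.5 * || Gamma^{-1/2} (y - G(theta)) ||^2."""
        residual = y - forward_map(theta)
        return 0.5 * residual @ np.linalg.solve(noise_cov, residual)

    def log_unnormalised_posterior(theta, y, forward_map, noise_cov, log_prior):
        # log posterior = log prior - potential, up to an additive constant.
        return log_prior(theta) - potential(theta, y, forward_map, noise_cov)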

Tempering

In BIPs the posterior measure is often concentrated in a small area of the high-dimensional parameter space $X$. Tempering (T) is a widely used method to approximate such measures. The fundamental idea – borrowed from Statistical Physics – is to adjust the temperature T in the Boltzmann distribution.
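In practice the tempering schedule is often chosen adaptively, as discussed in §3.4: the next inverse temperature is increased just far enough that the effective sample size (equivalently, the coefficient of variation of the sample weights) stays at a prescribed level. The following sketch shows one common way to do this by bisection; the variable names and the target fraction are illustrative, not the paper's specific choices.

    # Adaptive choice of the next inverse temperature (sketch).
    # potentials[i] = Phi(theta_i; y) for the current particle ensemble.
    import numpy as np

    def effective_sample_size(logw):
        w = np.exp(logw - logw.max())
        w /= w.sum()
        return 1.0 / np.sum(w ** 2)

    def next_temperature(beta, potentials, target_frac=0.5, tol=1e-8):
        potentials = np.asarray(potentials)
        n = len(potentials)
        # If jumping straight to beta = 1 keeps the ESS acceptable, do so.
        if effective_sample_size(-(1.0 - beta) * potentials) >= target_frac * n:
            return 1.0
        # Otherwise bisect on the new inverse temperature in (beta, 1).
        lo, hi = beta, 1.0
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            ess = effective_sample_size(-(mid - beta) * potentials)
            if ess < target_frac * n:
                hi = mid            # increment too large, reduce it
            else:
                lo = mid            # ESS still acceptable, be more aggressive
        return lo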

Multilevel sequential2 Monte Carlo

In this section we generalise Multilevel Bridging and propose the Multilevel Sequential2 Monte Carlo (MLS2MC) sampler. We explain the advantages of this generalisation in §4.3, but before we do so, we introduce the sampler formally in §4.1 and discuss its accuracy and computational cost in §4.2.

MLS2MC is a Sequential Monte Carlo method which combines Tempering and Multilevel Bridging. Sequential2 refers to two individual sequences in a Sequential Monte Carlo sampler, namely a sequence of
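(The snippet above is truncated in this preview.) Schematically, the sampler interleaves two kinds of SMC updates: tempering steps that raise the inverse temperature on the current discretisation level, and bridging steps that transfer the ensemble to the next finer level. The sketch below only illustrates this control flow; the adaptive decision rule between the two steps is developed in §4 and is represented here by a placeholder.

    # Schematic control flow of a combined tempering/bridging sampler (sketch).
    # temper_step, bridge_step and prefer_bridging are placeholders; the paper's
    # adaptive criterion is not reproduced here.
    def mls2mc(particles, beta, level, finest_level,
               temper_step, bridge_step, prefer_bridging):
        while beta < 1.0 or level < finest_level:
            if level < finest_level and prefer_bridging(particles, beta, level):
                # Bridging: move the ensemble from level l to level l + 1.
                particles, level = bridge_step(particles, beta, level)
            else:
                # Tempering: increase the inverse temperature on the current level.
                particles, beta = temper_step(particles, beta, level)
        return particles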

Numerical experiments

We consider a steady-state groundwater flow problem on the unit square domain $D = (0,1)^2$. The permeability $\kappa(\theta)$ and the hydrostatic pressure $p$ are coupled via the elliptic PDE $-\nabla \cdot \big(\kappa(\theta(x)) \nabla p(x)\big) = f(x)$ $(x \in D)$. The source term $f$ and the boundary conditions are specified below. We observe the pressure at $N_{\mathrm{obs}}$ points $(d_n : n = 1, \dots, N_{\mathrm{obs}})$ in the domain $D$. Thus the observation operator $\mathcal{O}$ maps $p \mapsto (p(d_n) : n = 1, \dots, N_{\mathrm{obs}})$.

This inverse problem is well studied in the literature, see e.g. [5], [13], [14], [36], [45], [47].
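For concreteness, the following is a minimal sketch of a single forward solve for this model on one discretisation level, using a 5-point finite-difference scheme with arithmetically averaged face permeabilities. The source term, boundary conditions and solver used in the paper are not given in this excerpt; the constant source $f \equiv 1$, homogeneous Dirichlet conditions and nearest-node observation below are assumptions for illustration only.

    # Illustrative forward solve for -div(kappa grad p) = f on D = (0,1)^2.
    # Assumptions (not from the paper): f = 1, p = 0 on the boundary, 5-point
    # finite differences, nearest-grid-node observation operator.
    import numpy as np
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    def solve_darcy(kappa, n=64, f_const=1.0):
        """Pressure on the n x n interior grid; kappa is a callable (X, Y) -> array."""
        h = 1.0 / (n + 1)
        x = np.linspace(h, 1.0 - h, n)
        X, Y = np.meshgrid(x, x, indexing="ij")
        K = kappa(X, Y)                              # nodal permeability values

        def node(i, j):                              # flat index of grid node (i, j)
            return i * n + j

        rows, cols, vals = [], [], []
        for i in range(n):
            for j in range(n):
                # Face permeabilities; boundary faces reuse the nodal value.
                kE = 0.5 * (K[i, j] + K[i + 1, j]) if i < n - 1 else K[i, j]
                kW = 0.5 * (K[i, j] + K[i - 1, j]) if i > 0 else K[i, j]
                kN = 0.5 * (K[i, j] + K[i, j + 1]) if j < n - 1 else K[i, j]
                kS = 0.5 * (K[i, j] + K[i, j - 1]) if j > 0 else K[i, j]
                rows.append(node(i, j)); cols.append(node(i, j)); vals.append(kE + kW + kN + kS)
                if i < n - 1:
                    rows.append(node(i, j)); cols.append(node(i + 1, j)); vals.append(-kE)
                if i > 0:
                    rows.append(node(i, j)); cols.append(node(i - 1, j)); vals.append(-kW)
                if j < n - 1:
                    rows.append(node(i, j)); cols.append(node(i, j + 1)); vals.append(-kN)
                if j > 0:
                    rows.append(node(i, j)); cols.append(node(i, j - 1)); vals.append(-kS)

        A = sp.csr_matrix((vals, (rows, cols)), shape=(n * n, n * n)) / h ** 2
        b = np.full(n * n, f_const)
        return x, spla.spsolve(A, b).reshape(n, n)

    def observe(x, p, obs_points):
        """Nearest-grid-node approximation of the observation operator O."""
        return np.array([p[np.abs(x - dx).argmin(), np.abs(x - dy).argmin()]
                         for (dx, dy) in obs_points])

In a multilevel hierarchy one would call such a solver with an increasing number of grid points per level (e.g. doubling n from level to level), always for the same permeability realisation κ(θ).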

Conclusion and outlook

We introduce a novel Sequential Monte Carlo method to approximate a posterior measure, the solution of a Bayesian inverse problem. The posterior measure is associated with the solution of a discretised PDE, and thus every Monte Carlo sample is expensive. We suggest an efficient, adaptive SMC sampler termed MLS2MC. The new sampler combines tempering on a fixed PDE discretisation as in single-level SMC, and a bridging scheme to transfer samples from coarse to fine discretisation levels. MLS2MC is

Acknowledgements

This work was supported by Deutsche Forschungsgemeinschaft (DFG) through the TUM International Graduate School of Science and Engineering (IGSSE) within the project 10.02 BAYES. The numerical experiments were performed on the Linux clusters of the Leibniz Rechenzentrum at the Bayerische Akademie der Wissenschaften.

References (50)

  • T. Bui-Thanh et al., Solving large-scale PDE-constrained Bayesian inverse problems with Riemann manifold Hamiltonian Monte Carlo, Inverse Probl. (2014)

  • D. Calvetti et al., Iterative updating of model error for Bayesian inversion, Inverse Probl. (2018)

  • A. Chernov et al., Multilevel ensemble Kalman filtering for spatially extended models

  • N. Chopin, A sequential particle filter method for static models, Biometrika (2002)

  • S.L. Cotter et al., MCMC methods for functions: modifying old algorithms to make them faster, Stat. Sci. (2013)

  • W.W. Daniel, Applied Nonparametric Statistics (1990)

  • M. Dashti et al., Uncertainty quantification and weak approximation of an elliptic inverse problem, SIAM J. Numer. Anal. (2011)

  • M. Dashti et al., The Bayesian approach to inverse problems

  • P. Del Moral et al., Sequential Monte Carlo samplers, J. R. Stat. Soc. B (2006)

  • P. Del Moral et al., Multilevel sequential Monte Carlo: mean square error bounds under verifiable conditions, Stoch. Anal. Appl. (2017)

  • P. Del Moral et al., Multilevel sequential Monte Carlo samplers for normalizing constants, ACM Trans. Model. Comput. Simul. (2017)

  • T.J. Dodwell et al., A hierarchical multilevel Markov chain Monte Carlo algorithm with applications to uncertainty quantification in subsurface flow, SIAM/ASA J. Uncertain. Quantificat. (2015)

  • A. Doucet et al., A tutorial on particle filtering and smoothing: fifteen years later, Handbook of Nonlinear Filtering (2011)

  • A. Gelman et al., Simulating normalizing constants: from importance sampling to bridge sampling to path sampling, Stat. Sci. (1998)

  • S. Ghosal et al., Fundamentals of Nonparametric Bayesian Inference (2017)