
Journal of Computational Physics

Volume 368, 1 September 2018, Pages 154-178

Multilevel Sequential2 Monte Carlo for Bayesian inverse problems

https://doi.org/10.1016/j.jcp.2018.04.014

Highlights

  • The Multilevel SMC sampler approximates the solution of a Bayesian inverse problem.

  • The sampler can be used consistently with black box models and in the small noise limit.

  • The sampler performs well in high-dimensional sample spaces.

Abstract

The identification of parameters in mathematical models using noisy observations is a common task in uncertainty quantification. We employ the framework of Bayesian inversion: we combine observational data with prior information to estimate the posterior distribution of a parameter. Specifically, we are interested in the distribution of a diffusion coefficient of an elliptic PDE. In this setting, the sample space is high-dimensional, and each sample of the PDE solution is expensive. To address these issues we propose and analyse a novel Sequential Monte Carlo (SMC) sampler for the approximation of the posterior distribution. Classical, single-level SMC constructs a sequence of measures, starting with the prior distribution and finishing with the posterior distribution. The intermediate measures arise from a tempering of the likelihood, or, equivalently, a rescaling of the noise, while the resolution of the PDE discretisation is fixed. In contrast, our estimator employs a hierarchy of PDE discretisations to decrease the computational cost. We construct a sequence of intermediate measures by either decreasing the temperature or increasing the discretisation level. This idea builds on and generalises the multi-resolution sampler proposed by P.S. Koutsourelakis (2009) [33], where a bridging scheme is used to transfer samples from coarse to fine discretisation levels. Importantly, our choice between tempering and bridging is fully adaptive. We present numerical experiments in 2D space, comparing our estimator to single-level SMC and the multi-resolution sampler.

Introduction

In science and engineering we use mathematical models to simulate and understand physical processes. These models require input parameters. Once the parameters are specified, we can solve the so-called forward problem to obtain output quantities of interest. In this work we focus on models that involve partial differential equations (PDEs). To date, approximate forward solvers are available for many PDE-based models, and output quantities of interest can be approximated efficiently. In contrast, the identification of input parameters (the inverse problem) is more challenging. Often the physical process is only given implicitly by observations (data, measurements). These measurements are typically noisy and/or sparse; they do not contain sufficient information about the underlying parameter, or are disturbed in such a way that the true parameter cannot be recovered at all. The inverse problem is ill-posed.

A classical example is the simulation of steady-state groundwater flow to assess the safety of proposed long-term radioactive waste repositories. The quantity of interest is the travel time of radioactive particles to the boundary of a safety zone. The simulation requires the hydraulic conductivity of the ground; it can be observed implicitly by pumping tests, and by pressure measurements. The objective of the groundwater flow inverse problem is the identification of the conductivity. In this example, the mathematical model involves an elliptic PDE. The groundwater flow inverse problem is well known, see e.g. [13], [14], [36], [45], [47].

In contrast to deterministic regularisation techniques, the Bayesian approach to inverse problems uses the probabilistic framework of Bayesian inference. Bayesian inference is built on Bayes' Formula in the formulation given by Laplace [34, II.1]. We remark that other formulations are possible, see e.g. the work by Matthies et al. [38]. We make use of the mathematical framework for Bayesian Inverse Problems (BIPs) given by Stuart [48]. Under weak assumptions – which we will give below – one can show that the BIP is well-posed. The solution of the BIP is the conditional probability measure of the unknown parameter given the observations.

The Bayesian framework is very general and can handle different types of forward models. However, in this work we consider PDE-based forward models, and in particular an elliptic PDE. The exact solution of the associated BIP is often inaccessible for two reasons: (i) there is no closed-form expression for the posterior measure, and (ii) the underlying PDE cannot be solved analytically. We focus on (i), and study efficient approximations to the full posterior measure. Alternatively, one could approximate only the expectation of output quantities of interest with respect to the posterior measure, or estimate the model evidence, i.e. the normalisation constant of the posterior measure.

Typically, BIPs are approached with sampling-based methods, such as Markov Chain Monte Carlo (MCMC) or Importance Sampling. Classical MCMC samplers are the algorithms suggested by Metropolis et al. [39] and the generalisation by Hastings [26]. Advanced MCMC methods for BIP settings are Hamiltonian Monte Carlo [7] and preconditioned Crank–Nicolson MCMC [6], [11]. A disadvantage of MCMC samplers is that it is often difficult to assess their convergence after an initial burn-in phase. Importance Sampling [1], on the other hand, does not require burn-in. However, Importance Sampling is inefficient if the sampling density differs significantly from the target density. For these reasons we employ Sequential Monte Carlo (SMC) [10], [15], [41] to approximate the posterior measure. SMC was initially developed to approximate sequences of measures which arise from time-dependent estimation problems in data assimilation. In our setting, since the elliptic PDE models a steady-state process, the SMC sequences are constructed artificially such that, starting from the prior measure, they gradually approach the posterior measure. Artificial sequences of measures also arise in simulated annealing [15], the estimation of rare events [43], model selection [50], and bridging [20].
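To make the mechanics of such a sampler concrete, the following minimal sketch shows one generic SMC update between two consecutive intermediate measures: reweight by the (unnormalised) density ratio, resample, and then move the particles with an MCMC kernel that leaves the next measure invariant. The function names (log_ratio, mcmc_move) are illustrative placeholders, not the implementation of any of the samplers cited above.

    # Generic SMC update between consecutive intermediate measures (sketch).
    # log_ratio(theta): log of the unnormalised density ratio d(pi_next)/d(pi_current).
    # mcmc_move(theta, rng): one step of a pi_next-invariant Markov kernel.
    import numpy as np

    def smc_update(particles, log_ratio, mcmc_move, rng):
        # 1. Reweight: importance weights between the two measures.
        logw = np.array([log_ratio(theta) for theta in particles])
        logw -= logw.max()                  # stabilise the exponentials
        w = np.exp(logw)
        w /= w.sum()

        # 2. Resample (multinomial) to equalise the weights.
        n = len(particles)
        idx = rng.choice(n, size=n, p=w)
        resampled = [particles[i] for i in idx]

        # 3. Move: rejuvenate the particles with the MCMC kernel.
        return [mcmc_move(theta, rng) for theta in resampled]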

In some situations it is convenient to determine the artificial sequences “on the fly” during the execution of the algorithm. The associated method is termed adaptive SMC; see [19], [29] for a discussion, and [2] for a careful analysis. A well-known drawback is the fact that adaptive SMC returns a biased model evidence estimate; however, the model evidence is not the major focus of our work. The estimation of the model evidence with non-adaptive SMC is discussed in [20], [42].

The major advantage of SMC is its dimension-independent convergence, which is often observed in practice and which can be proved e.g. for uniformly bounded update densities [5]. Thus SMC can be used in high- and infinite-dimensional settings. See [44] for a discussion of this point. Similar results are also known for the Ensemble Kalman Filter (EnKF) with a finite number of particles applied to linear inverse problems [47]. The EnKF is a linearised version of SMC and has been applied to linear and nonlinear inverse problems (see [28]).

SMC has already been used to solve BIPs where the forward model is an elliptic [5] or Navier–Stokes equation [31]. The computational challenge is that PDE-based forward solves are in general very expensive. Thus every sample required by standard samplers such as MCMC or SMC is expensive. The total computational budget might allow only a few samples, and thus the sampling error can be considerable. We handle this problem by constructing a multilevel SMC sampler. To do this we assume that the PDE can be discretised with multiple levels of accuracy. In our work these levels are associated with different mesh sizes in a spatial domain. However, it is also possible to consider e.g. different time step sizes, or target accuracies of Newton's method.

Multilevel samplers enjoy considerable attention at the moment, and are available for various tasks in uncertainty quantification. Multilevel Monte Carlo is widely used in forward uncertainty quantification; see [24] for an overview. In the pioneering work by Giles [23] the multilevel idea is combined with standard Monte Carlo in a forward setting. However, it can also be used with other samplers such as MCMC, SMC, and the EnKF, for the estimation of rare events, for filtering problems in data assimilation, and to solve Bayesian Inverse Problems. For example, multilevel Ensemble Kalman Filters have been proposed in [9], [27]. The authors in [27] consider continuous-time data assimilation with multiple time-step discretisations. In contrast, the work in [9] considers data assimilation for spatially extended models, e.g. time-dependent stochastic partial differential equations. The multilevel estimation of rare events with an SMC-type method has been proposed in [49].

For Bayesian Inverse Problems a multilevel MCMC method has been introduced in [18]. Multilevel Sequential Monte Carlo is introduced in [4] and further discussed in [3], [17], [16]. Both the multilevel MCMC and multilevel SMC use coarse PDE discretisations for variance reduction with the help of a telescoping sum expansion. Moreover, these multilevel samplers are built to integrate output quantities of interest with respect to the posterior. In contrast, in our work we do not rely on a telescoping sum, and we construct approximations to the full posterior measure.

We build on and generalise the work by Koutsourelakis in [33]. As in [33] we combine SMC with tempering on a fixed PDE discretisation level, and bridging between two consecutive discretisation levels. The major novel contribution of our work is a fully adaptive algorithm to decide when to increase the discretisation accuracy (bridging) and when to proceed with SMC (tempering). In numerical experiments we show that our sampler – termed Multilevel Sequential2 Monte Carlo – gives an approximation accuracy similar to single-level SMC while decreasing the computational cost substantially. Our method also works consistently in the small noise limit. We note that a similar SMC-based multilevel method has been proposed in [8]; in that work the model error is given as part of the measurement noise and is updated iteratively.

The remainder of this paper is organised as follows. In §2 we formulate the Bayesian inverse problem and discuss its discretisation. Moreover, we review the basic idea of sequential Monte Carlo. In §3 we give an overview of classical constructions for the intermediate SMC measures, in particular, tempering, bridging, and multilevel bridging. In §3.4 we discuss adaptive SMC samplers where the inverse temperature is selected to bound the effective sample size or, equivalently, the coefficient of variation of the sample weights. The major contribution of this work is presented in §4 where we introduce the Multilevel Sequential2 Monte Carlo sampler. We discuss its computational cost, and suggest an adaptive update scheme for the combination of bridging and tempering. This update scheme is consistent with the adaptive bridging and tempering discussed in §3.4. Finally, in §5 we present numerical experiments for a test problem in 2D space. In §6 we give a summary and an outlook of our work.

Section snippets

Bayesian inverse problem

We consider engineering systems or models subject to an uncertain parameter $\theta \in X$. The parameter space $X$ is a separable Banach space. The forward response operator $\mathcal{G} : X \to Y$ maps the parameter to the (finite-dimensional) data space $Y := \mathbb{R}^{N_{\mathrm{obs}}}$. We are interested in models where $\mathcal{G} = \mathcal{O} \circ G$ is the composition of an observation operator $\mathcal{O}$ and a solution operator $G$ of a partial differential equation. In addition, we consider a quantity of interest $Q : X \to \mathbb{R}$ which depends on the parameter $\theta \in X$.

Observations $y \in Y$ are
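(The snippet above is truncated in this preview.) To illustrate how this setup is typically used, the following sketch assumes additive Gaussian observational noise, $y = \mathcal{G}(\theta) + \eta$ with $\eta \sim N(0, \Gamma)$, which is the standard choice in the framework of [48]; the unnormalised posterior density is then the prior density multiplied by $\exp(-\Phi(\theta; y))$ with the least-squares potential $\Phi$. All names below are illustrative.

    # Sketch of the standard Bayesian setup with additive Gaussian noise
    # (an assumption for illustration, not taken from this excerpt).
    import numpy as np

    def potential(theta, y, forward_map, noise_cov):
        """Phi(theta; y) = 0.5 * || Gamma^{-1/2} (y - G(theta)) ||^2."""
        residual = y - forward_map(theta)
        return 0.5 * residual @ np.linalg.solve(noise_cov, residual)

    def log_unnormalised_posterior(theta, y, forward_map, noise_cov, log_prior):
        # log posterior = log prior - potential, up to an additive constant.
        return log_prior(theta) - potential(theta, y, forward_map, noise_cov)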

Tempering

In BIPs the posterior measure is often concentrated in a small area of the high-dimensional parameter space $X$. Tempering (T) is a widely used method to approximate such measures. The fundamental idea – borrowed from Statistical Physics – is to adjust the temperature T in the Boltzmann distribution.
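In practice the tempering schedule is often chosen adaptively, as discussed in §3.4: the next inverse temperature is increased just far enough that the effective sample size (equivalently, the coefficient of variation of the sample weights) stays at a prescribed level. The following sketch shows one common way to do this by bisection; the variable names and the target fraction are illustrative, not the paper's specific choices.

    # Adaptive choice of the next inverse temperature (sketch).
    # potentials[i] = Phi(theta_i; y) for the current particle ensemble.
    import numpy as np

    def effective_sample_size(logw):
        w = np.exp(logw - logw.max())
        w /= w.sum()
        return 1.0 / np.sum(w ** 2)

    def next_temperature(beta, potentials, target_frac=0.5, tol=1e-8):
        potentials = np.asarray(potentials)
        n = len(potentials)
        # If jumping straight to beta = 1 keeps the ESS acceptable, do so.
        if effective_sample_size(-(1.0 - beta) * potentials) >= target_frac * n:
            return 1.0
        # Otherwise bisect on the new inverse temperature in (beta, 1).
        lo, hi = beta, 1.0
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            ess = effective_sample_size(-(mid - beta) * potentials)
            if ess < target_frac * n:
                hi = mid            # increment too large, reduce it
            else:
                lo = mid            # ESS still acceptable, be more aggressive
        return lo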

Multilevel sequential2 Monte Carlo

In this section we generalise Multilevel Bridging and propose the Multilevel Sequential2 Monte Carlo (MLS2MC) sampler. We explain the advantages of this generalisation in §4.3, but before we do so, we introduce the sampler formally in §4.1 and discuss its accuracy and computational cost in §4.2.

MLS2MC is a Sequential Monte Carlo method which combines Tempering and Multilevel Bridging. Sequential2 refers to two individual sequences in a Sequential Monte Carlo sampler, namely a sequence of
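(The snippet above is truncated in this preview.) Schematically, the sampler interleaves two kinds of SMC updates: tempering steps that raise the inverse temperature on the current discretisation level, and bridging steps that transfer the ensemble to the next finer level. The sketch below only illustrates this control flow; the adaptive decision rule between the two steps is developed in §4 and is represented here by a placeholder.

    # Schematic control flow of a combined tempering/bridging sampler (sketch).
    # temper_step, bridge_step and prefer_bridging are placeholders; the paper's
    # adaptive criterion is not reproduced here.
    def mls2mc(particles, beta, level, finest_level,
               temper_step, bridge_step, prefer_bridging):
        while beta < 1.0 or level < finest_level:
            if level < finest_level and prefer_bridging(particles, beta, level):
                # Bridging: move the ensemble from level l to level l + 1.
                particles, level = bridge_step(particles, beta, level)
            else:
                # Tempering: increase the inverse temperature on the current level.
                particles, beta = temper_step(particles, beta, level)
        return particles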

Numerical experiments

We consider a steady-state groundwater flow problem on the unit square domain $D = (0,1)^2$. The permeability $\kappa(\theta)$ and the hydrostatic pressure $p$ are coupled via the elliptic PDE $-\nabla \cdot \big(\kappa(\theta(x)) \nabla p(x)\big) = f(x)$ $(x \in D)$. The source term $f$ and the boundary conditions are specified below. We observe the pressure at $N_{\mathrm{obs}}$ points $(d_n : n = 1, \dots, N_{\mathrm{obs}})$ in the domain $D$. Thus the observation operator $\mathcal{O}$ maps $p \mapsto (p(d_n) : n = 1, \dots, N_{\mathrm{obs}})$.

This inverse problem is well studied in the literature, see e.g. [5], [13], [14], [36], [45], [47].
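For concreteness, the following is a minimal sketch of a single forward solve for this model on one discretisation level, using a 5-point finite-difference scheme with arithmetically averaged face permeabilities. The source term, boundary conditions and solver used in the paper are not given in this excerpt; the constant source $f \equiv 1$, homogeneous Dirichlet conditions and nearest-node observation below are assumptions for illustration only.

    # Illustrative forward solve for -div(kappa grad p) = f on D = (0,1)^2.
    # Assumptions (not from the paper): f = 1, p = 0 on the boundary, 5-point
    # finite differences, nearest-grid-node observation operator.
    import numpy as np
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    def solve_darcy(kappa, n=64, f_const=1.0):
        """Pressure on the n x n interior grid; kappa is a callable (X, Y) -> array."""
        h = 1.0 / (n + 1)
        x = np.linspace(h, 1.0 - h, n)
        X, Y = np.meshgrid(x, x, indexing="ij")
        K = kappa(X, Y)                              # nodal permeability values

        def node(i, j):                              # flat index of grid node (i, j)
            return i * n + j

        rows, cols, vals = [], [], []
        for i in range(n):
            for j in range(n):
                # Face permeabilities; boundary faces reuse the nodal value.
                kE = 0.5 * (K[i, j] + K[i + 1, j]) if i < n - 1 else K[i, j]
                kW = 0.5 * (K[i, j] + K[i - 1, j]) if i > 0 else K[i, j]
                kN = 0.5 * (K[i, j] + K[i, j + 1]) if j < n - 1 else K[i, j]
                kS = 0.5 * (K[i, j] + K[i, j - 1]) if j > 0 else K[i, j]
                rows.append(node(i, j)); cols.append(node(i, j)); vals.append(kE + kW + kN + kS)
                if i < n - 1:
                    rows.append(node(i, j)); cols.append(node(i + 1, j)); vals.append(-kE)
                if i > 0:
                    rows.append(node(i, j)); cols.append(node(i - 1, j)); vals.append(-kW)
                if j < n - 1:
                    rows.append(node(i, j)); cols.append(node(i, j + 1)); vals.append(-kN)
                if j > 0:
                    rows.append(node(i, j)); cols.append(node(i, j - 1)); vals.append(-kS)

        A = sp.csr_matrix((vals, (rows, cols)), shape=(n * n, n * n)) / h ** 2
        b = np.full(n * n, f_const)
        return x, spla.spsolve(A, b).reshape(n, n)

    def observe(x, p, obs_points):
        """Nearest-grid-node approximation of the observation operator O."""
        return np.array([p[np.abs(x - dx).argmin(), np.abs(x - dy).argmin()]
                         for (dx, dy) in obs_points])

In a multilevel hierarchy one would call such a solver with an increasing number of grid points per level (e.g. doubling n from level to level), always for the same permeability realisation κ(θ).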

Conclusion and outlook

We introduce a novel Sequential Monte Carlo method to approximate a posterior measure, the solution of a Bayesian inverse problem. The posterior measure is associated with the solution of a discretised PDE, and thus every Monte Carlo sample is expensive. We suggest an efficient, adaptive SMC sampler termed MLS2MC. The new sampler combines tempering on a fixed PDE discretisation as in single-level SMC, and a bridging scheme to transfer samples from coarse to fine discretisation levels. MLS2MC is

Acknowledgements

This work was supported by Deutsche Forschungsgemeinschaft (DFG) through the TUM International Graduate School of Science and Engineering (IGSSE) within the project 10.02 BAYES. The numerical experiments were performed on the Linux clusters of the Leibniz Rechenzentrum at the Bayerische Akademie der Wissenschaften.

References (50)

  • T. Bui-Thanh et al., Solving large-scale PDE-constrained Bayesian inverse problems with Riemann manifold Hamiltonian Monte Carlo, Inverse Probl. (2014)

  • D. Calvetti et al., Iterative updating of model error for Bayesian inversion, Inverse Probl. (2018)

  • A. Chernov et al., Multilevel ensemble Kalman filtering for spatially extended models

  • N. Chopin, A sequential particle filter method for static models, Biometrika (2002)

  • S.L. Cotter et al., MCMC methods for functions: modifying old algorithms to make them faster, Stat. Sci. (2013)

  • W.W. Daniel, Applied Nonparametric Statistics (1990)

  • M. Dashti et al., Uncertainty quantification and weak approximation of an elliptic inverse problem, SIAM J. Numer. Anal. (2011)

  • M. Dashti et al., The Bayesian approach to inverse problems

  • P. Del Moral et al., Sequential Monte Carlo samplers, J. R. Stat. Soc. B (2006)

  • P. Del Moral et al., Multilevel sequential Monte Carlo: mean square error bounds under verifiable conditions, Stoch. Anal. Appl. (2017)

  • P. Del Moral et al., Multilevel sequential Monte Carlo samplers for normalizing constants, ACM Trans. Model. Comput. Simul. (2017)

  • T.J. Dodwell et al., A hierarchical multilevel Markov chain Monte Carlo algorithm with applications to uncertainty quantification in subsurface flow, SIAM/ASA J. Uncertain. Quantificat. (2015)

  • A. Doucet et al., A tutorial on particle filtering and smoothing: fifteen years later, Handbook of Nonlinear Filtering (2011)

  • A. Gelman et al., Simulating normalizing constants: from importance sampling to bridge sampling to path sampling, Stat. Sci. (1998)

  • S. Ghosal et al., Fundamentals of Nonparametric Bayesian Inference (2017)