A multi-point Metropolis scheme with generic weight functions

https://doi.org/10.1016/j.spl.2012.04.008

Abstract

The multi-point Metropolis algorithm is an advanced MCMC technique based on drawing several correlated samples at each step and choosing one of them according to some normalized weights. We propose a variation of this technique where the weight functions are not specified, i.e., the analytic form can be chosen arbitrarily. This has the advantage of greater flexibility in the design of high-performance MCMC samplers. We prove that our method fulfills the balance condition, and provide a numerical simulation. We also give new insight into the functionality of different MCMC algorithms, and the connections between them.

Introduction

Monte Carlo statistical methods are powerful tools for numerical inference and stochastic optimization (see Robert and Casella (2004), for instance). Markov chain Monte Carlo (MCMC) methods are classical Monte Carlo techniques that generate samples from a target probability density function (pdf) by drawing from a simpler proposal pdf, usually in order to approximate an integral that cannot be computed analytically (Liu, 2004, Liang et al., 2010). MCMC algorithms produce a Markov chain whose stationary distribution coincides with the target pdf.

The Metropolis–Hastings (MH) algorithm (Metropolis et al., 1953, Hastings, 1970) is the most famous MCMC technique. It can be applied to almost any target distribution. In practice, however, finding a “good” proposal pdf can be difficult. In some applications, the Markov chain generated by the MH algorithm can remain trapped almost indefinitely in a local mode, meaning that, in practice, convergence may not be reached.

The Multiple-Try Metropolis (MTM) method of Liu et al. (2000) is an extension of the MH algorithm in which the next state of the chain is selected among a set of independent and identically distributed (i.i.d.) samples. This enables the MCMC sampler to make large step-size jumps without lowering the acceptance rate; thus, MTM can explore a larger portion of the sample space in fewer iterations.
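The MTM step with i.i.d. candidates can be sketched as follows. This is a minimal illustration, not the paper's scheme: the Gaussian proposal, the function names, and the choice of weights proportional to the target (valid for a symmetric proposal) are our own illustrative assumptions.

```python
import numpy as np

def mtm_step(x, log_p, N=10, sigma=2.0, rng=None):
    """One Multiple-Try Metropolis step with i.i.d. candidates from a
    symmetric Gaussian proposal and weights w(y, x) = p(y) (the special
    case of weights proportional to the target). Illustrative sketch."""
    rng = np.random.default_rng() if rng is None else rng
    # Draw N i.i.d. candidates from the symmetric proposal q(.|x).
    ys = x + sigma * rng.standard_normal(N)
    wy = np.exp(log_p(ys))
    # Select one candidate with probability proportional to its weight.
    y = rng.choice(ys, p=wy / wy.sum())
    # Draw N-1 reference points from q(.|y); the N-th one is the current x.
    xs = np.append(y + sigma * rng.standard_normal(N - 1), x)
    wx = np.exp(log_p(xs))
    # Generalized MH test: accept y with probability min(1, sum w_y / sum w_x).
    return y if rng.uniform() < min(1.0, wy.sum() / wx.sum()) else x

# Usage on a bimodal target p(x) proportional to exp{-(x^2 - 4)^2 / 4}.
log_p = lambda x: -(x**2 - 4.0) ** 2 / 4.0
rng = np.random.default_rng(0)
chain = [0.0]
for _ in range(5000):
    chain.append(mtm_step(chain[-1], log_p, rng=rng))
```

With several candidates per step, the chain can jump between well-separated modes far more often than a single-proposal MH sampler with the same step size.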

An interesting special case of the MTM, well known in the molecular simulation field, is the orientational-bias Monte Carlo, described in Chapter 13 of Frenkel and Smit (1996) and Chapter 5 of Liu (2004), where i.i.d. candidates are drawn from a symmetric proposal pdf and one of them is chosen according to weights directly proportional to the target pdf. Here, however, the analytic form of the weight functions is fixed and unalterable.

Casarin et al. (in press) introduced an MTM scheme using different proposal pdfs. In this case, the samples produced are independent but not identically distributed. In Qin and Liu (2001), another generalization of the MTM (called the multi-point Metropolis method) is proposed, using correlated candidates at each step. Clearly, the proposal pdfs are also different in this case.

Moreover, in Pandolfi et al. (2010) an extension of the classical MTM technique is introduced where the analytic form of the weights is not specified. In Pandolfi et al. (2010), the same proposal pdf is used to draw samples, so that the candidates generated in each step of the algorithm are i.i.d. Further interesting and related considerations about the use of auxiliary variables for building acceptance probabilities within an MH approach can be found in Storvik (2011).

In this paper, we draw from the two approaches (Qin and Liu, 2001, Pandolfi et al., 2010) to create a novel algorithm that selects a new state of the chain among correlated samples using generic weight functions, i.e., the analytic form of the weights can be chosen arbitrarily. Furthermore, we formulate the algorithm and the acceptance rule in order to fulfill the detailed balance condition.

Our method allows more flexibility in the design of efficient MCMC samplers with larger coverage and faster exploration of the sample space. In fact, we can choose any bounded and positive weight functions to either improve performance or reduce computational complexity, independently of the chosen proposal pdf. Moreover, since in our approach the proposal pdfs are different, adaptive or interacting techniques can be applied, such as those introduced by Andrieu and Moulines (2006) and Casarin et al. (in press). An important advantage of our procedure is that each new candidate is drawn from a conditional pdf depending on the samples generated earlier during the same time step, so that the scheme automatically builds an improved proposal from the information contained in those samples.

The rest of the paper is organized as follows. In Section 2 we recall the standard multi-point Metropolis algorithm. In Section 3 we introduce our novel scheme with generic weight functions and correlated samples. Section 4 provides a rigorous proof that the novel scheme satisfies the detailed balance condition. A numerical simulation is provided in Section 5 and finally, in Section 6, we discuss the advantages of our proposed technique and provide insight into the relationships among different MTM schemes in the literature.

Section snippets

Multi-point Metropolis algorithm

In the classical MH algorithm, a new possible state is drawn from the proposal pdf and the movement is accepted with a suitable decision rule. In the multi-point approach, several correlated samples are generated and, from these, a “good” one is chosen.

Specifically, consider a target pdf p_o(x) known up to a normalizing constant (hence, we can evaluate p(x) ∝ p_o(x)). Given a current state x ∈ ℝ (we assume scalar values only for simplicity in the treatment), at each step we draw N correlated samples from a sequence
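The sequence of conditional proposals can take many forms; one minimal illustrative choice (our own assumption, not the paper's specification) is a random-walk path where each candidate is proposed from the previous one:

```python
import numpy as np

def draw_correlated_candidates(x, N=4, sigma=1.0, rng=None):
    """Draw N correlated candidates: y_1 from pi_1(.|x) and each subsequent
    y_j from pi_j(.|x, y_{1:j-1}). Here the conditionals form a Gaussian
    random walk started at the current state x (an illustrative choice)."""
    rng = np.random.default_rng() if rng is None else rng
    ys = np.empty(N)
    prev = x
    for j in range(N):
        ys[j] = prev + sigma * rng.standard_normal()  # y_j given y_{j-1}
        prev = ys[j]
    return ys
```

Because each y_j depends on the points drawn before it, the candidates trace a path away from the current state rather than forming an i.i.d. cloud around it.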

Extension with generic weight functions

Now, we consider generic weight functions ω_j(z_1, …, z_{j+1}): ℝ^{j+1} → ℝ_+, which have to be (a) bounded and (b) positive. In this case, the algorithm can be described as follows.

  • 1.

    Draw N samples y_{1:N} = [y_1, y_2, …, y_N] from the joint pdf q_N(y_{1:N}|x) = π_1(y_1|x) ∏_{j=2}^{N} π_j(y_j|x, y_{1:j−1}); namely, draw y_j from π_j(·|x, y_{1:j−1}), with j = 1, …, N.

  • 2.

    Choose some suitable (bounded and positive) weight functions. Then, calculate each weight ω_j(y_{j:1}, x), and normalize them to obtain ω̄_j, j = 1, …, N.

  • 3.

    Draw y = y_k ∈ {y_1, …, y_N} according to ω̄_1, …, ω̄_N,
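The three steps above can be sketched as follows. This is a minimal sketch under our own assumptions: the random-walk conditionals π_j and the particular bounded weight function are illustrative choices (the scheme permits any bounded, positive weights), and the acceptance test that completes the step is omitted here, as in the excerpt above.

```python
import numpy as np

def select_candidate(x, log_p, N=4, sigma=1.0, rng=None):
    """Steps 1-3 of the scheme: draw N correlated candidates, weight them
    with a generic bounded positive weight function, and select one."""
    rng = np.random.default_rng() if rng is None else rng
    # Step 1: y_1 from pi_1(.|x), then y_j from pi_j(.|x, y_{1:j-1}) --
    # here a random walk along the path of previously drawn candidates.
    ys = np.empty(N)
    prev = x
    for j in range(N):
        ys[j] = prev + sigma * rng.standard_normal()
        prev = ys[j]
    # Step 2: generic weights omega_j(y_{j:1}, x), bounded in (0, 1) and
    # positive (a logistic transform of the log-target), then normalized.
    w = 1.0 / (1.0 + np.exp(-log_p(ys)))
    w_bar = w / w.sum()
    # Step 3: choose y = y_k according to the normalized weights.
    k = rng.choice(N, p=w_bar)
    return ys[k], ys, w_bar

# Usage on the bimodal target of the paper's toy example.
log_p = lambda x: -(x**2 - 4.0) ** 2 / 4.0
y, ys, w_bar = select_candidate(0.0, log_p, rng=np.random.default_rng(2))
```

Note that the weight function here is decoupled from the proposal: any other bounded positive choice could be substituted without touching the sampling of the candidates, which is exactly the flexibility the generic-weight formulation provides.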

Proof of the detailed balance condition

To guarantee that a Markov chain generated by an MCMC method converges to the target distribution p(x) ∝ p_o(x), the kernel A(y|x) of the corresponding algorithm fulfills the following detailed balance condition: p(x)A(y|x) = p(y)A(x|y). First of all, we have to
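To make the condition concrete, here is a small numerical check of detailed balance for an ordinary discrete-state Metropolis–Hastings kernel (an illustrative stand-in of our own; the proof for the multi-point kernel itself is the subject of this section):

```python
import numpy as np

# Target pmf on 5 states and a symmetric (uniform) proposal.
p = np.array([0.10, 0.30, 0.20, 0.25, 0.15])
n = len(p)
q = np.full((n, n), 1.0 / n)

# Build the MH kernel A[y, x] = Pr(move to y | current state x).
A = np.zeros((n, n))
for x in range(n):
    for y in range(n):
        if y != x:
            A[y, x] = q[y, x] * min(1.0, p[y] / p[x])
    A[x, x] = 1.0 - A[:, x].sum()   # rejected mass stays at x

# Detailed balance: p(x) A(y|x) = p(y) A(x|y) for every pair (x, y),
# i.e. the probability-flux matrix is symmetric.
flux = p[None, :] * A               # flux[y, x] = p(x) A(y|x)
assert np.allclose(flux, flux.T)
# Consequence: p is a stationary distribution of the kernel, A p = p.
assert np.allclose(A @ p, p)
```

The last assertion illustrates why detailed balance suffices: summing the balance condition over x immediately gives stationarity of the target under the kernel.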

Toy example

Now we provide a simple numerical simulation to show an example of the multi-point scheme with generic weight functions and to compare it with the technique in Pandolfi et al. (2010). Let X ∈ ℝ be a random variable with bimodal pdf p_o(x) ∝ p(x) = exp{−(x² − 4)²/4}. Our goal is to draw samples from p_o(x) using our proposed multi-point technique.

We
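The bimodal shape of the toy target p(x) = exp{−(x² − 4)²/4} can be verified directly; a minimal sketch locating its modes on a grid (the grid resolution is our own choice):

```python
import numpy as np

# Toy target: p(x) proportional to exp{-(x^2 - 4)^2 / 4}, modes at x = +/-2.
p = lambda x: np.exp(-(x**2 - 4.0) ** 2 / 4.0)

xs = np.linspace(-4.0, 4.0, 8001)   # grid with step 0.001
px = p(xs)
# Interior grid points larger than both neighbours are local maxima.
is_max = (px[1:-1] > px[:-2]) & (px[1:-1] > px[2:])
modes = xs[1:-1][is_max]
print(np.round(modes, 3))           # the two modes, near -2 and 2
```

The deep valley between the two modes (p(0) = e⁻⁴ ≈ 0.018) is what makes this target a useful stress test: a single-proposal sampler with small steps rarely crosses it.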

Discussion

In this work, we have introduced a Metropolis scheme with multiple correlated points where the weight functions are not defined specifically, i.e., the analytic form can be chosen arbitrarily. We proved that our novel scheme satisfies the detailed balance condition.

Our approach draws from two different approaches (Pandolfi et al., 2010, Qin and Liu, 2001) to form a novel efficient and flexible multi-point scheme.

The multi-point approach with correlated samples provides different advantages over

Acknowledgments

We would like to thank the Reviewer, whose comments have helped us to improve the first version of the manuscript. Moreover, this work has been partially supported by the Ministerio de Ciencia e Innovación of Spain (project MONIN, ref. TEC-2006-13514-C02-01/TCM, Program Consolider-Ingenio 2010, ref. CSD2008-00010 COMONSENS, and Distributed Learning, Communication and Information Processing (DEIPRO), ref. TEC2009-14504-C02-01) and the Comunidad Autónoma de Madrid (project PROMULTIDIS-CM, ref.

References (12)

  • C. Andrieu et al.

    On the ergodicity properties of some adaptive MCMC algorithms

    The Annals of Applied Probability

    (2006)
  • Casarin, R., Craiu, R., Leisen, F., 2011. Interacting multiple try algorithms with different proposal distributions....
  • D. Frenkel et al.

    Understanding Molecular Simulation: From Algorithms to Applications

    (1996)
  • W.K. Hastings

    Monte Carlo sampling methods using Markov chains and their applications

    Biometrika

    (1970)
  • Liang, F., Liu, C., Carroll, R., 2010. Advanced Markov Chain Monte Carlo Methods: Learning From Past Samples. In: Wiley...
  • J.S. Liu

    Monte Carlo Strategies in Scientific Computing

    (2004)
