Moment-preserving and mesh-adaptive reweighting method for rare-event sampling in Monte-Carlo algorithms

https://doi.org/10.1016/j.cpc.2021.108041Get rights and content

Abstract

We present novel roulette schemes for rare-event sampling that are both structure-preserving and unbiased. The boundaries where Monte Carlo markers are split and deleted are placed automatically and adapted during runtime. Extending existing codes with the new schemes is possible without severe changes because the equation of motion for the markers is not altered. These schemes can also be applied in the presence of nonlinear and nonlocal coupling between markers. As an illustrative application, we have implemented this method in the ASCOT-RFOF code, used to simulate fast-ion generation by radio-frequency waves in fusion plasmas. In this application, with this method the Monte-Carlo noise level for typical fast-ion energies can be reduced at least of one order of magnitude without increasing the computational effort.

Introduction

Monte Carlo techniques can be efficient for solving a variety of problems [1], [2], [3], [4], [5], [6]. One example is plasma physics and the heating of plasmas with ion cyclotron resonance heating (ICRH) [7]. The particle distribution function f in a plasma is governed by a Boltzmann equation in Fokker-Planck approximationft=azf+z(zbb2f). z is the phase space coordinate, D=(bb)/2 is the diffusion tensor, and a the advection velocity. The single arrow denotes a vector, the double arrow a tensor. D and a together are referred to as Fokker-Planck Coefficients (FPC). f can be computed by simulating the corresponding Langevin equation [8, p. 294ff]dz=adt+bξ(t)dt for test particles, which we call markers. ξ are Gaussian random numbers with zero mean and variance 1. The markers are distributed according to f, which can be reconstructed for example by using a histogram:fhistogram bin(t)=binf(z,t)dzbindzi1bin(zi(t))Wi(t)bindz, where we approximated the integral over the distribution function with the sum over the markers inside a bin of the histogram. 1bin(z) is 1 if z lies in the bin of the histogram, and 0 otherwise. Wi is the weight of the i-th marker, which is constant unless reweighting is employed.1 The markers are randomly initialized such that equation (3) is valid for the initial condition for f.

For ICRH the FPC have two contributions: collisions, for which a Maxwellian background plasma with which the particles collide can be assumed [10], and radio frequency (RF) interactions. The RF FPC depend on the amplitude of the wave field Awave. The wave amplitude is set by the injected power:Prf=AwaveDg(f(z),D(z),v(z))dz, where Awaveg is the density of the absorbed power, and D the domain of the Fokker-Planck equation. When modeling an experiment one has to adjust the FPC such that the absorbed power is close to the power injected in the experiment. Equations (1) and (2) are then nonlocal, complicating the numerical solution of equation (1). The Langevin equation (2) can still be solved for fusion plasmas by orbit following codes such as ASCOT [10]. The code library RFOF [11] is used here to include the RF interaction.

A direct numerical solution of equation (2) quickly becomes computationally expensive when the solution of the diffusion-advection equation varies by several orders of magnitude in the domain of the solution. As f is proportional to the probability distribution of marker positions, only few markers are in areas where f is small. Those areas are therefore only poorly resolved. Plasmas heated by ICRH exhibit a population of highly energetic ions, which are few compared to thermal ions but of prime interest [12], [13], [14], [15].

From importance sampling we know how to reduce the error of the simulation results in such regions of interest: We have to place many markers there. Equation (3) tells us that this corresponds to lower marker weights in those regions compared to regions of less interest. As the simulation progresses, the markers will mix due to the stochastic term in equation (2), i.e. small markers will leave the region of interest and large markers will enter it. We need to adjust the weights of markers entering and leaving the region to keep the weight of markers in the region of interest smaller than the average weight, which is called reweighting. There exist different reweighting techniques in the literature which however suffer from drawbacks. The classic splitting and Russian Roulette techniques [16, p. 99f] lead to growing fluctuations of the marker number and total weight because the Russian Roulette does not conserve the conserved quantities of the underlying Fokker-Planck equation. For long times all markers will eventually be deleted (cf. the Bienaymé-Galton-Watson process [17, 5.4], which we introduce in Section 2.3). There are other methods, based on merging markers [18], which however bias the result. The mean result is then no longer the exact result of the underlying Fokker-Planck equation, although the extent of this effect can be mitigated by selecting well the markers that are merged (Octree binning) [18].

There exist other methods for resolving regions of phase space with low densities, often collectively referred to as rare event sampling. Here we however choose splitting and Russian Roulette techniques because

  • They can cope with the nonlinear and nonlocal property introduced by the fixed RF power (equation (4)).

  • One can avoid additional bias compared to the direct solution of the Langevin equation (2) with the new splitting and roulette techniques presented in this paper. We show this analytically for a special case in Appendix A, and verify it numerically [19].

  • The new reweighting schemes are structure preserving because no errors in the moments of the distribution function are introduced that grow as the simulation progresses.

  • It is possible to extend existing codes that directly simulate the Langevin equation (2) without big changes because the equation of motion for the markers is not altered.

This paper is structured as follows: In Sections 2.2 and 2.3 we introduce the traditional splitting and roulette methods as tools for variance reduction. In Sections 2.3.1 - 2.5 we present our new reweighting schemes, whose parameters can be chosen as described in Section 2.6. The schemes are applied in Section 3. In Section 3.3 we show how the new schemes can reduce by orders of magnitude the number of markers needed to obtain a certain noise amplitude for simulations with the orbit-following code ASCOT [10].

Section snippets

Reweighting techniques

Diffusion-advection equations are often solved by markers evolving according to the associated Langevin equation (2). This can be inefficient when the distribution function f varies by orders of magnitude in its domain. When we calculate f from the marker positions by using a histogram, we can use bin statistics2

Characterization and performance of novel schemes

We characterize the schemes by considering simple diffusion-advection equations in sections 3.1 and 3.2. We let the simulations reach steady state and use bins to calculate the phase space density for many, i.e. 106, points in time. From this set of measurements for the same phase space densities one can then calculate the variance of the density measurement.

After splitting, the markers decorrelate on the length scale of the bin size. To analyze the decorrelation process it is therefore

Conclusion

When using splitting and roulette schemes one has to define regions in the simulation domain and boundaries between them. Every region is associated with the weight of the markers therein. We presented a method to automatically define these regions, and even adapt them during the simulation to account for changing requirements due to the evolving distribution function. We developed two new roulette schemes because the traditional Russian Roulette [16, p. 99] leads to fluctuations that

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

We would like to thank Jakob Ameres, Roman Hatzky and Omar Maj for fruitful discussions. This work has been carried out within the framework of the EUROfusion Consortium and has received funding from the Euratom research and training programme 2014-2018 and 2019-2020 under grant agreement No 633053. The views and opinions expressed herein do not necessarily reflect those of the European Commission.

References (23)

  • U.H. Hansmann et al.

    Curr. Opin. Struct. Biol.

    (1999)
  • M.J. Saxton

    Biophys. J.

    (1994)
  • G. Bal et al.

    J. Comput. Phys.

    (2011)
  • E. Hirvijoki et al.

    Comput. Phys. Commun.

    (2014)
  • R.S. Martin et al.

    J. Comput. Phys.

    (2016)
  • M. Vranic et al.

    Comput. Phys. Commun.

    (2015)
  • R. Dum et al.

    Phys. Rev. A

    (1992)
  • S. Lemaire et al.

    Phys. Rev. C

    (2005)
  • P. Brandimarte

    Handbook in Monte Carlo Simulation: Applications in Financial Engineering, Risk Management, and Economics

    (2014)
  • T.H. Stix

    Waves in Plasmas

    (1992)
  • S. Ichimaru

    Basic Principles of Plasma Physics: A Statistical Approach

    (2018)
  • Cited by (1)

    The review of this paper was arranged by Prof. David W. Walker.

    View full text