1 Introduction

Event generators are indispensable simulation tools for understanding high energy collisions in particle physics. A core component of these event generators is the parton shower, in which hard scattering events, calculated perturbatively, are iteratively dressed with more and more partons, typically starting from some high energy scale Q and evolving down towards smaller and smaller scales until an infrared cutoff \(Q_0\) is reached.

Branchings within the parton shower evolution occur with a rate P(q, z, x), which determines the probability density of the dynamic variables associated with the branching. These are the scale q, additional splitting variables z, and additional variables x on which a certain type of branching depends. The latter can be collections of discrete and continuous values. Branchings are ordered in decreasing scale q in the evolution interval from Q down to \(Q_0\). The probability for a certain splitting not to occur between the scales Q and q is given by the so-called Sudakov form factor \(\varDelta _P(q|Q,x)\),

$$\begin{aligned} \varDelta _P(q|Q,x) = \exp \left( -\int _q^Q \mathrm{d}k\int \mathrm{d}z\ P(k,z,x) \right) . \end{aligned}$$

The density of the dynamic variables describing an individual branching, which is to occur below the scale Q of a previous branching (or of an initial condition set by the hard process), is then given by

$$\begin{aligned}&\frac{\mathrm{d}S_P (q|Q,z,x)}{\mathrm{d}q\ \mathrm{d}z} = \varDelta _P(Q_0|Q,x)\delta (q-Q_0)\delta (z-z_0) \nonumber \\&\quad + \varDelta _P(q|Q,x)P(q,z,x)\theta (Q-q)\theta (q-Q_0), \end{aligned}$$
(1)

where \(\varDelta _P(Q_0|Q,x)\delta (q-Q_0)\delta (z-z_0)\) represents events which did not radiate until they reached the cut-off scale \(Q_0\) and \(z_0\) is an irrelevant parameter associating fixed branching variables to the cutoff scale, where no actual branching will occur. Different branching types \(x_1,\ldots , x_n\), with a total rate given by \(\sum _{i=1}^n P(q,z,x_i)\), can be drawn algorithmically using the competition algorithm (see e.g. [1, 2]), which we will not discuss in more detail in this letter.

An individual splitting kernel P(q, z, x) cannot generally be integrated to an invertible function, which would allow one to draw from \(\mathrm{d}S_P\) using sampling by inversion. What is typically done instead is to find an overestimate R(q, z, x) (such that \(R(q,z,x) \ge P(q,z,x)\) for all q, z, x) which can be integrated to an invertible function. Equating the normalized integrated splitting kernel to a random number \(r \in (0,1)\) and solving for q then generates a new scale q distributed according to R(q, z, x), and it can be proved that accepting the proposed branching variables with probability P(q, z, x)/R(q, z, x) generates the desired density P(q, z, x) instead [1,2,3,4,5].

In practice, in the case of a positive definite splitting kernel P(q, z, x) with a known overestimate R(q, z, x), the algorithm to draw from \(\mathrm{d}S_P\) is then given by Algorithm 1.

Algorithm 1 (figure)
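For orientation, the following is a minimal Python sketch of the unweighted Sudakov veto algorithm described above; it is not the paper's Algorithm 1 verbatim. The callables `propose_scale`, `propose_z`, `P` and `R` are placeholder names for routines the user would supply: `propose_scale` inverts the Sudakov form factor of the overestimate R, and `propose_z` draws z from R at fixed scale.

```python
import random

def sudakov_veto(Q, Q0, propose_scale, propose_z, P, R, x):
    """Minimal sketch of the (unweighted) Sudakov veto algorithm.

    propose_scale(q, Q0, x): draws a new scale below q from the Sudakov
        form factor of the overestimate R (by inversion); returns a value
        <= Q0 if no emission is generated above the cutoff.
    propose_z(q, x): draws z from R(q, z, x), normalised over z at fixed q.
    P(q, z, x), R(q, z, x): true and overestimated splitting kernels,
        with R >= P everywhere.
    All function names are placeholders, not part of any shower code.
    """
    q = Q
    while True:
        q = propose_scale(q, Q0, x)
        if q <= Q0:
            return Q0, None          # no emission above the cutoff: terminate
        z = propose_z(q, x)
        if random.random() < P(q, z, x) / R(q, z, x):
            return q, z              # accepted branching
        # otherwise: veto, keep q as the new starting scale and continue
```

Restarting the proposal from the vetoed scale is what produces the correct Sudakov suppression with the kernel P rather than R.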

This algorithm is statistically perfect in the sense that it produces the correct distribution using only (positive) unit weights. In the case of splitting kernels of indefinite sign, or in an attempt to include variations of the splitting rate P attached to a fixed sequence of branching variables, weights different from unity are unavoidable; see [2, 5,6,7,8,9,10] for some of these applications.

Especially when extending parton showers to include corrections beyond leading order [11,12,13] as well as sub-leading \(N_c\) effects and in attempts to perform the evolution at the amplitude level [10, 14,15,16,17,18,19,20], negative contributions to the splitting rates arise. These contributions require the weighted Sudakov veto algorithm, Algorithm 2, in order to be included in the simulation.

Algorithm 2 (figure)
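A minimal sketch of a weighted Sudakov veto algorithm in this spirit is given below, assuming a constant acceptance probability \(\epsilon\) as used in the benchmarks of Sect. 4; the details of Algorithm 2 as given in the original figure may differ. The weight factors \(P/(\epsilon R)\) on acceptance and \((1-P/R)/(1-\epsilon)\) on rejection are the standard choices of the weighted veto algorithm [2, 5], and allow P to be negative or to exceed R.

```python
import random

def weighted_sudakov_veto(Q, Q0, propose_scale, propose_z, P, R, x, eps=0.5):
    """Sketch of a weighted Sudakov veto algorithm: proposals are accepted
    with a fixed probability eps and the event weight is corrected
    accordingly, so that P may be negative or exceed R.  Function names
    as in the sketch of Algorithm 1 above (placeholders)."""
    q, w = Q, 1.0
    while True:
        q = propose_scale(q, Q0, x)
        if q <= Q0:
            return Q0, None, w                   # terminated with weight w
        z = propose_z(q, x)
        ratio = P(q, z, x) / R(q, z, x)
        if random.random() < eps:
            w *= ratio / eps                     # accepted: factor P/(eps R)
            return q, z, w
        w *= (1.0 - ratio) / (1.0 - eps)         # vetoed: factor (1-P/R)/(1-eps)
```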

While the weighted Sudakov veto algorithm can be shown to correctly account for negative contributions, an issue with this algorithm, and with weighting procedures in general, is the large spread of the weight distribution, which builds up multiplicatively during the simulation of an event, especially in the presence of the competition algorithm. As a consequence, the Monte Carlo variance of the algorithm increases approximately geometrically with the number of weight changes caused by each emission attempt. In [10] the weight degeneration was partly reduced by only keeping event weights down to the winning scale in the competition step. This greatly reduced the weight spreading, but did not fully resolve the issue, in the sense that incorporating yet more emissions with negative weights would have produced unmanageable weights.

Approaching the parton shower precision era, we expect the issue of negative or varying weights to be a severe obstacle that has to be overcome, preferably by methods more general than monitoring weights and adjusting weighting algorithms case by case. We therefore suggest to utilize the method of resampling [21, 22] in the context of Monte Carlo event generators. We introduce the new method in this letter as follows: in Sect. 2 we introduce the basics of the resampling method, and in Sect. 3 we discuss how the Monte Carlo variance can be estimated in the presence of resampling. A benchmark example, illustrating the power of resampling by applying the proposed approach to a simplified parton shower, is given in Sect. 4. We also show the efficiency of resampling by taking a more realistic parton shower [23] with unit weights, destroying its statistical convergence beyond recognition using the weighted Sudakov algorithm, Algorithm 2, and recovering good statistical convergence using the resampling method. Finally we give an outlook and discuss improvements in Sect. 5.

2 Resampling

The key idea behind resampling is, rather than to reweightFootnote 1 the N weighted events, to select n events among the N weighted ones in proportion to their weights. This is done in an interleaved procedure which is supposed to keep the weight distribution as narrow as possible at each intermediate step. After each weight change (in our case induced by the shower), all selected events are assigned the weight N/n and are further evolved separately.Footnote 2 It is important to mention that this method will introduce correlated events; some of them will be duplicated or have partly identical evolution histories. The number n of selected events may well be chosen equal to the number N of events, but can also be chosen differently.
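As an illustration, a minimal sketch of this selection step could look as follows; the common weight assigned after selection is chosen here such that the total weight is preserved, consistent with the average-weight prescription discussed below.

```python
import numpy as np

def resample(events, weights, n=None, rng=None):
    """Select n events among the N weighted ones in proportion to their
    weights (multinomial resampling); all selected events are then
    assigned a common weight, chosen such that the total weight is
    preserved."""
    rng = np.random.default_rng() if rng is None else rng
    w = np.asarray(weights, dtype=float)
    N = len(events)
    n = N if n is None else n
    idx = rng.choice(N, size=n, replace=True, p=w / w.sum())
    common_weight = w.sum() / n      # equals the average weight when n = N
    return [events[i] for i in idx], np.full(n, common_weight)
```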

Note that the resampling procedure will not alter the flexibility of the simulation when it comes to predicting arbitrarily many differential cross sections in one run of the program; in fact, the appearance of correlated events is also known in approaches using Markov Chain Monte Carlo (MCMC) methods as explored for the hard process cross sections in [24].

In the more commonly used reweighting procedure, events with small weights are selected with a low probability; however, if selected, they are assigned the same nominal weight (typically 1) as other events. With resampling, the large weights are also kept under control. Instead of being allowed to increase without limit, all event weights can be kept at unity through duplication of events with large weights. This greatly improves the statistical convergence properties, and mitigates the risk of obtaining distributions that appear statistically wrong for a small number of events.

In an implementation of a parton shower, a natural way of implementing the resampling procedure is to resample after each emission step, i.e. to let each event radiate according to whatever versions of Sudakov factors and splitting kernels are used, and then, among the say \(N'\) evolved events, to pick \(N'\) events in proportion to their weights \(w_i\), meaning that some events will be duplicated and some will be forgotten. This can even be extended to apply to intermediate proposals within the veto and competition algorithm itself; we will explore both options in Sect. 4. Either way, the resampling mechanism guides the gradual evolution of the events by imposing a selection pressure for fitness as measured by the weights.

Events which have reached the cut-off scale \(Q_0\) can be ignored in the resampling procedure, since these are not evolved and will hence not acquire any further weights in the shower evolution. Rather, resampling among events which already have unit weights can only impair the statistical convergence, since some events will be “forgotten”, whereas others will be duplicated. After resampling, all the resampled events are assigned the same weight; this weight is given by the average weight of the evolved events among which the resampled events are drawn. As a consequence, resampling will not introduce any statistical bias, in the sense that distributions of observables before and after resampling are statistically equivalent; we refer to Appendix A for further mathematical details. Since the repeated resampling operations continuously uniformise the weights of the evolving events, they serve to numerically stabilise Algorithm 2, leading to a dramatic reduction of variance.
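A sketch of such an interleaved evolution, with terminated events set aside, could look as follows. The routines `evolve_one_step` (performing one weighted emission step and returning the evolved event together with a multiplicative weight factor) and `has_terminated` are placeholders for whatever shower implementation is used; positive weights are assumed here, and the treatment of signed weights is discussed at the end of this section.

```python
import copy
import numpy as np

def shower_with_resampling(initial_events, evolve_one_step, has_terminated,
                           rng=None):
    """Interleave shower evolution and resampling: evolve all active events
    by one emission step, set terminated events aside, and resample the
    remaining (weighted) events among themselves, assigning them their
    average weight."""
    rng = np.random.default_rng() if rng is None else rng
    active = [(ev, 1.0) for ev in initial_events]    # (event, weight) pairs
    finished = []
    while active:
        stepped = []
        for ev, w in active:
            new_ev, factor = evolve_one_step(ev)     # one (weighted) emission step
            stepped.append((new_ev, w * factor))
        finished += [(ev, w) for ev, w in stepped if has_terminated(ev)]
        active = [(ev, w) for ev, w in stepped if not has_terminated(ev)]
        if active:
            w = np.array([wi for _, wi in active])
            idx = rng.choice(len(active), size=len(active), p=w / w.sum())
            # duplicated events are deep-copied so they can evolve independently
            active = [(copy.deepcopy(active[i][0]), w.mean()) for i in idx]
    return finished                                   # list of (event, weight) pairs
```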

In the broader context of Monte Carlo simulation, resampling was first introduced in [21, 25] as a means for transforming Monte Carlo samples of weighted simulations into uniformly weighted samples. Later, [22] proposed resampling as a tool for controlling weight degeneracy in sequential importance sampling methods [26]. In sequential importance sampling, weighted Monte Carlo samples targeting sequences of probability distributions are formed by means of recursive sampling and multiplicative weight updating operations. The observation in [22], that recurrent resampling guarantees the numerical stability of sequential importance sampling estimators over large time horizons—a finding that was later confirmed theoretically in [27]—led to an explosive interest in such sequential Monte Carlo methods during the last decades. Today sequential Monte Carlo methods constitute a standard tool in the statistician’s tool box and are applied in a variety of scientific and engineering disciplines such as computer vision, robotics, machine learning, automatic control, image/signal processing, optimisation, and finance; see e.g. the collection [28].

We emphasise that, when implemented properly, the resampling procedure does not involve any significant time penalty. The selection of n events among N in proportion to their weights \(w_i\) is equivalent to drawing n independent and identically distributed indices \(i_1, \ldots , i_n\) among \(\{1, \ldots , N\}\) such that the probability that \(i_\ell = j\) is \(w_j / \sum _{m = 1}^N w_m \). Naively, one may, using the “table-look-up” method, generate each such index \(i_\ell \) by simulating a uniform (i.e., a uniformly distributed number) \(u_\ell \) over (0, 1) and finding k such that

$$\begin{aligned} \sum _{i = 1}^{k - 1} \frac{w_i}{\sum _{j = 1}^N w_j} < u_\ell \le \sum _{i = 1}^k \frac{w_i}{\sum _{j = 1}^N w_j}, \end{aligned}$$

and letting \(i_\ell = k\). Assume that \(n = N\) for simplicity; then, since determining k with a simple binary search requires, on average, \(\log _2 N\) comparisons, the overall complexity of generating \(\{ i_\ell \}\) is \(N \log _2 N\). However, it is easily seen that if the uniforms \(\{ u_\ell \}\) are ordered, \(u_{(1)} \le u_{(2)} \le \cdots \le u_{(N)}\), generating the N indices \(\{ i_\ell \}\) using the table-look-up method requires only N comparisons in total. Hence, provided that a sample \(\{ u_{(\ell )} \}\) of N ordered uniforms can be simulated at a cost increasing only linearly in N, the overall complexity of the resampling procedure is linear in N.

In the following we review one such simulation approach, which is based on the observation that the distribution of the order statistics \((u_{(1)}, \ldots , u_{(n)})\) of n independent uniforms coincides with the distribution of

$$\begin{aligned} \left( \prod _{i = 1}^n \root i \of {v_i}, \prod _{i = 2}^n \root i \of {v_i}, \ldots , \root n - 1 \of {v_{n - 1}} \root n \of {v_n}, \root n \of {v_n} \right) , \end{aligned}$$

where the n random variables \(\{ v_i \}\) are again independent and uniformly distributed over (0, 1); see [29, Chapter 5]. As a consequence, a strategy for simulating \(\{ u_{(\ell )} \}\) is given by Algorithm 3, which indeed has a complexity that is linear in n.

Algorithm 3 (figure)
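A sketch of Algorithm 3, together with the resulting linear-time table look-up, could read as follows; only standard uniform random numbers are used, and the index pointer in the look-up only ever moves forward, so the total number of comparisons is linear in the sample size.

```python
import numpy as np

def ordered_uniforms(n, rng=None):
    """Sketch of Algorithm 3: generate the order statistics of n independent
    uniforms in O(n), using u_(n) = v_n**(1/n) and, going backwards,
    u_(k) = u_(k+1) * v_k**(1/k), with the v_k uniform on (0, 1)."""
    rng = np.random.default_rng() if rng is None else rng
    v = rng.random(n)
    u = np.empty(n)
    u[n - 1] = v[n - 1] ** (1.0 / n)
    for k in range(n - 2, -1, -1):
        u[k] = u[k + 1] * v[k] ** (1.0 / (k + 1))
    return u

def resample_indices(weights, n=None, rng=None):
    """Linear-time table look-up: walk once through the cumulative weights
    and the ordered uniforms simultaneously."""
    rng = np.random.default_rng() if rng is None else rng
    w = np.asarray(weights, dtype=float)
    n = len(w) if n is None else n
    cdf = np.cumsum(w) / w.sum()
    u = ordered_uniforms(n, rng)
    indices = np.empty(n, dtype=int)
    k = 0
    for ell in range(n):
        while k < len(cdf) - 1 and u[ell] > cdf[k]:
            k += 1
        indices[ell] = k
    return indices
```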

Alternatively, one may apply the method of uniform spacings; see again [29, Chapter 5] for details. A caveat of the resampling procedure that one must be aware of is that it typically leads to event path depletion for long simulations. Indeed, when the number of iterations is large the ancestral paths of the events will typically, if resampling is performed systematically, coincide before a certain random point; in other words, the event paths will form an ancestral tree with a common ancestor. For \(n = N\) one may establish a bound on the expected height of the “crown” of this ancestral tree, i.e., the expected time distance from the last generation back to the most recent common ancestor, which is proportional to \(N \ln N\); see [30]. Thus, when the number of iterations increases, the ratio of the height of the “crown” to that of the “trunk” tends to zero, implying high variance of path space Monte Carlo estimators.

The most trivial remedies for the path depletion phenomenon are to increase the sample size N or to reduce the number of resampling operations. In [31] it is suggested that resampling should be triggered adaptively and only when the coefficient of variation

$$\begin{aligned} C_V^2 &= N \sum _{i = 1}^N \left( \frac{w_i}{\sum _{\ell = 1}^N w_\ell } - \frac{1}{N} \right) ^2 \nonumber \\ &= N \sum _{i = 1}^N \left( \frac{w_i}{\sum _{\ell = 1}^N w_\ell } \right) ^2 - 1 \end{aligned}$$
(4)

exceeds a given threshold. The coefficient of variation detects weight skewness in the sense that \(C_V^2\) is zero if all weights are equal. On the other hand, if the total weight is carried by a single draw (corresponding to the situation of maximal weight skewness), then \(C_V^2\) is maximal and equal to \(N - 1\). A closely related measure of weight skewness is the effective sample size

$$\begin{aligned} \texttt {ESS}= \frac{N}{1 + C_V^2}. \end{aligned}$$

Heuristically, \(\texttt {ESS}\) measures, as its name suggests, the number of samples that contribute effectively to the approximation. It takes on its maximal value N when all the weights are equal and its minimal value 1 when all weights are zero except for a single one. Thus, a possible adaptive resampling scheme could be obtained by executing resampling only when \(\texttt {ESS}\) falls below, say, \(50\%\) of the sample size N.Footnote 3 We refer to [32] for a discussion and a theoretical analysis of these measures in the context of sequential Monte Carlo sampling.
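In code, an adaptive trigger of this kind amounts to only a few lines; `resample` below stands for the selection step sketched at the beginning of this section (a placeholder name), and the 50% threshold is the value quoted above.

```python
import numpy as np

def effective_sample_size(weights):
    """ESS = N / (1 + C_V^2) = (sum w)^2 / sum w^2, cf. Eq. (4)."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / np.dot(w, w)

def maybe_resample(events, weights, resample, threshold=0.5):
    """Trigger resampling only when the ESS drops below a fraction of the
    sample size (here 50%); otherwise return the sample unchanged."""
    if effective_sample_size(weights) < threshold * len(weights):
        return resample(events, weights)
    return events, weights
```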

With respect to negative contributions to the radiation probability, showing up as negatively weighted events, we note that events may well keep a negative weight. In this case, resampling is simply carried out in proportion to \(|w_i|\), whereas events with \(w_i<0\) contribute negatively when added to histograms of observables. Negative contributions, for example from subleading color or NLO corrections, thus do not pose an additional problem from the resampling perspective.
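A sketch of the resampling step for signed weights is given below. It assumes that each selected event keeps the sign of the weight it was drawn with, while the common magnitude is set to the average of the \(|w_i|\); this particular weight assignment is our illustrative choice and is not spelled out in this form in the text.

```python
import numpy as np

def resample_signed(events, weights, rng=None):
    """Resample in proportion to |w_i|; selected events keep the sign of
    the weight they were drawn with, with a common magnitude equal to the
    average |w_i|, so that the signed total weight is preserved on average."""
    rng = np.random.default_rng() if rng is None else rng
    w = np.asarray(weights, dtype=float)
    a = np.abs(w)
    idx = rng.choice(len(events), size=len(events), p=a / a.sum())
    new_events = [events[i] for i in idx]
    new_weights = np.sign(w[idx]) * a.mean()
    return new_events, new_weights
```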

3 A view to variance estimation

Since the resampling operation induces complex statistical dependencies between the generated events, statistical analysis of the final sample is non-trivial. Still, as we will argue in the following, such an analysis is possible in the light of recent results on variance estimation in sequential Monte Carlo methods [33, 34]. As explained in Sect. 2, subjecting continuously evolving events to resampling induces genealogies among these events. In [33] the authors propose estimating the variance of sequential Monte Carlo methods by stratifying the final samples with respect to the distinct time-zero ancestors traced by these genealogies. As the families of events formed in this way are only weakly dependent (via resampling), this approach allows the Monte Carlo variance to be estimated on the basis of a statistic that is reminiscent of the standard sample variance for independent data. More precisely, let \(\{(\xi _{i}, w_{i})\}_{i = 1}^N\) be a sample of events with associated weights generated by means of n resampling operations and let \(\varepsilon _{n}^{i}\) denote the index of the time-zero ancestor of the event \(\xi _{i}\);Footnote 4 then the estimator of an observable h may be expressed as

$$\begin{aligned} \frac{1}{N} \sum _{i = 1}^N w_{i} h(\xi _{i}) = \frac{1}{N} \sum _{i = 1}^N \upsilon _i {\mathop {=}\limits ^{\tiny {\mathrm {not.}}}} {\bar{\upsilon }}, \end{aligned}$$
(5)

where \(\upsilon _i = \sum _{\ell : \varepsilon _{n}^{\ell } = i} w_{\ell } h(\xi _{\ell })\) is the contribution of the family of events originating from the time-zero ancestor i. Note that the time-zero ancestors can be easily kept track of as the sample propagates; indeed, given the time-zero ancestor indices \(\{ \varepsilon _{n}^{i} \}_{i = 1}^N\) after n resampling operations, the indices \(\{ \varepsilon _{n + 1}^{i} \}_{i = 1}^N\) after \(n + 1\) operations are, for all i, given by

$$\begin{aligned} \varepsilon _{n + 1}^{i} = \varepsilon _{n}^{\ell _i}, \end{aligned}$$

where \(\{ \ell _i \}_{i = 1}^N\) are the ancestor indices drawn in the \((n+1)\)-th resampling operation. Since the variables \(\{ \upsilon _i \}_{i = 1}^N\) are derived from disjoint sets of event paths, there is only a weak statistical dependence between them. Recall that in the case of truly independent and identically distributed draws \(\{ y_i \}_{i = 1}^N\), the classical estimator

$$\begin{aligned} s^2&= \frac{1}{N(N - 1)} \sum _{i = 1}^N (y_i - {\bar{y}})^2 \\&= {\bar{y}}^2 \left( 1 - \frac{N}{N - 1} \right) + \left( \frac{N}{N - 1} \right) \frac{1}{N^2} \sum _{i = 1}^N y_i^2 \end{aligned}$$

of the variance of the sample mean \({\bar{y}}\) is unbiased for all N. For sequential importance sampling with resampling, the authors of [34] propose (as a refinement of the estimator suggested by [33]) the generalisation

$$\begin{aligned} {\bar{\upsilon }}^2 \left[ 1 - \left( \frac{N}{N - 1} \right) ^{n + 1} \right] +\left( \frac{N}{N - 1} \right) ^{n + 1} \frac{1}{N^2} \sum _{i = 1}^N \upsilon _i^2 \end{aligned}$$
(6)

(where, again, n is the number of resampling operations) of \(s^2\). Interestingly, in the case where the full Monte Carlo sample is involved in every resampling operation, it is possible to show that (6) is indeed an unbiased estimator of the variance of \({\bar{\upsilon }}\) for all N; see [34, Theorem 4]. Even though we proceed somewhat differently and exclude non-evolving events from the resampling, we expect the same approach to apply to our estimator. More specifically, since resampling among non-evolving events can only deteriorate statistical convergence, the variance estimate in (6) from the all-evolving case must serve as an upper estimate for our partially resampled sets. However, a full study of the theoretical and empirical properties of an estimator of type (6) in our context is the topic of ongoing research and hence beyond the scope of the present paper.
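For concreteness, a sketch of the ancestor bookkeeping and of the estimator (6) could look as follows; the data layout (flat arrays of weights, observable values and time-zero ancestor labels) is an assumption of the sketch and not prescribed by [33, 34].

```python
import numpy as np

def update_ancestors(ancestors, drawn_indices):
    """Propagate the time-zero ancestor labels through one resampling
    operation: the i-th event after resampling inherits the ancestor of
    the event it was drawn from, i.e. eps_{n+1}^i = eps_n^{l_i}."""
    return [ancestors[j] for j in drawn_indices]

def variance_estimate(weights, values, ancestors, n_resamplings, n_initial):
    """Sketch of the estimator (6): stratify the contributions w_i h(xi_i)
    by time-zero ancestor, form the family sums upsilon_i, and combine
    them with the (N/(N-1))^(n+1) factor."""
    N = n_initial
    upsilon = np.zeros(N)
    for w, h, a in zip(weights, values, ancestors):
        upsilon[a] += w * h
    mean_u = upsilon.sum() / N                 # = (1/N) sum_i w_i h(xi_i), cf. (5)
    c = (N / (N - 1.0)) ** (n_resamplings + 1)
    return mean_u ** 2 * (1.0 - c) + c * np.dot(upsilon, upsilon) / N ** 2
```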

4 Benchmark examples

In this section we illustrate the impact of the resampling procedure both in a simplified setup and within a full parton shower algorithm. As a toy model we consider a prototype shower algorithm with infrared cutoff \(Q_0\) and splitting kernels

$$\begin{aligned} \mathrm{d}P(q,z,x) = a\, \frac{\mathrm{d}q}{q}\, \frac{(1+z^2)\,\mathrm{d}z}{1-z}\, \theta (1-Q_0/q-z)\,\theta (z-x), \end{aligned}$$
(7)

with \(a>0\) being some parameter (similar to a coupling constant) and x an external parameter in the form of an initial “momentum fraction” (first arbitrarily set to 0.1 and later changing in the shower with each splitting).

Starting from a scale Q (initially we make the arbitrary choice \(Q=1\), and pick a cut-off value of \(Q_0=0.01\)) we obtain a new lower scale q and a momentum fraction z (of the previous momentum), sampled according to the Sudakov-type density \(\mathrm{d}S_P(q|Q,z,x)\) associated with the splitting kernel above.

The momentum fraction parameter x will be increased to x/z after the emission, such that for an emission characterized by \((q_i,z_i,x_i)\) the variables for the next emission will be drawn according to \(\mathrm{d}S_P(q_{i+1}|q_i,z_{i+1},x_i/z_i=x_{i+1})\). This algorithm resembles “backward evolution” to some extent, but in this case we rather want to define a model which features both the infrared cutoff (upper bound, \(1-\frac{Q_0}{Q}\), on z and lower bound on evolution scale \(q>Q_0\)) and the dynamically evolving phase space for later emissions (lower bound on z), while still being a realistic example of a parton shower, with double and single logarithmic enhancement in Q/q.

Fig. 1

Illustration of the different algorithms we consider as benchmark algorithms; we depict the algorithm as executed in each shower instance. Each \(\mathrm{d}S_{R_i}\) block corresponds to a proposal density for one of n competing channels, the \(\epsilon \) diamonds depict the acceptance/rejection step with the weighting procedure of the Sudakov veto algorithm, and the selection of the highest scale within the competition algorithm. Red dots indicate where the algorithm in each shower instance is interrupted and resampling between the different showers is performed. Depending on the resampling strategy, events which have reached the cutoff can be included in the resampling, or put aside as is done in our benchmark studies. The left flow diagram corresponds to the benchmark example discussed, while the right flow diagram corresponds to the implementation in the Python dipole shower. Re-entering after a veto step (the ‘veto’ branches) will start the proposal from the scale just vetoed. Note that in the first case (left), resampling is performed only after each emission (or shower termination), whereas in the second case it is performed after each (possibly rejected) emission attempt

Proposals for the dynamic variables q and z are generated using the overestimate kernel \(R(q,z,x)=\frac{a}{q}\frac{2}{1-z}\) in place of \(P(q,z,x)=\frac{a}{q}\frac{1+z^2}{1-z}\), and with the simplified phase space boundaries \(0<z <1-Q_0/q\). We consider the distributions of the intermediate variables q and z, as well as the generated overall momentum fraction x, for up to eight emissions.
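As an illustration of how such proposals can be generated, the overestimate R can be integrated and inverted analytically: integrating R over \(0<z<1-Q_0/q\) gives \(2a\ln(q/Q_0)\), so the corresponding Sudakov factor between q and Q is \(\exp[-a(\ln^2(Q/Q_0)-\ln^2(q/Q_0))]\), which can be solved for q in closed form. The sketch below implements one proposal/veto cycle for a single channel with the plain (unweighted) veto step; the benchmark in the text instead uses the weighted veto step with \(\epsilon=0.5\) and competition over several channels.

```python
import math
import random

def propose_emission(Q, Q0, a, x):
    """One proposal/veto cycle for the toy model of Eq. (7), single channel.

    The overestimate R(q,z,x) = (a/q) * 2/(1-z) on 0 < z < 1 - Q0/q
    integrates to 2 a ln(q/Q0) in z; the resulting R-Sudakov between q and Q,
    exp[-a (ln^2(Q/Q0) - ln^2(q/Q0))], is inverted analytically below.
    """
    q = Q
    while True:
        # invert the R-Sudakov: Delta_R(q'|q) = r for a uniform r
        arg = math.log(q / Q0) ** 2 + math.log(random.random()) / a
        if arg <= 0.0:
            return Q0, None                    # no emission above the cutoff
        q = Q0 * math.exp(math.sqrt(arg))
        # draw z from 1/(1-z) on (0, 1 - Q0/q): 1 - z = (Q0/q)**u
        z = 1.0 - (Q0 / q) ** random.random()
        # accept with P/R = (1+z^2)/2 and the true phase space constraint z > x
        if z > x and random.random() < 0.5 * (1.0 + z * z):
            return q, z
        # otherwise veto and continue the evolution from the vetoed scale q
```

After an accepted emission, the momentum fraction would be updated to x/z before the next call, as described above.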

To consider a realistic scenario, similar to what is encountered in actual parton shower algorithms, where many emission channels typically compete, we generate benchmark distributions by splitting the parameter a into a sum of different values (arbitrarily [0.01, 0.002, 0.003, 0.001, 0.01, 0.03, 0.002, 0.002, 0.02, 0.02]) corresponding to competing channels. The usage of weighted shower algorithms, amplified by competition, is a major source of weight degeneration, as the weights from the individually competing channels need to be multiplied together.
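The competition between channels then amounts to letting each channel propose its own scale and keeping the highest one. A minimal sketch for the unweighted case, with `propose_emission` being the single-channel routine sketched above, is given below; in the weighted variant the weight factors of all channels would additionally be multiplied into the event weight.

```python
def competition(channel_couplings, propose_emission, Q, Q0, x):
    """Competition step: each channel (labelled here by its coupling a_i)
    proposes an emission scale below Q; the channel with the highest
    proposed scale wins."""
    best = (Q0, None, None)                    # (scale, z, winning coupling)
    for a in channel_couplings:
        q, z = propose_emission(Q, Q0, a, x)
        if q > best[0]:
            best = (q, z, a)
    return best
```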

We perform resampling after the weighted veto algorithm (choosing a constant \(\epsilon =0.5\) in this example) and the competition algorithm have proposed a new transition, as schematically indicated in the left panel of Fig. 1. In Fig. 2 we illustrate the improvement obtained with the resampling algorithm.

We also try out the resampling strategy in a more realistic parton shower setting, using the Python dipole parton shower from [23]. In this case, we run 10,000 events in parallel, and each event is first assigned a new emission scale (or the termination scale \(Q_0\)) using the standard overestimate Sudakov veto algorithm combined with the competition algorithm. For each shower separately, we then use one veto step as in Algorithm 2, such that some emissions are rejected. In the ratio P/R, P contains the actual splitting kernel and coupling constant, whereas R contains an overestimate with the singular part of the splitting kernel and an overestimate of the strong coupling constant; see [23] for details. The constant \(\epsilon \) in Algorithm 2 is set to 0.5. This results in weighted shower events, and the resampling step, where 10,000 showers among the 10,000 are randomly kept in proportion to their weights, is then performed, again resulting in unweighted showers. This procedure is illustrated in the right panel of Fig. 1, and we remark that the resampling step can be combined with the shower in many different ways, for example, as in the first example above (to the left in the figure), after each actual emission (or reaching of \(Q_0\)), or, as in the second example (to the right), also after rejected emission attempts.

Fig. 2

Distribution of the scale of the 4th emission, sampled with the competition algorithm after splitting a into a sum over 10 different competing channels; red shows the direct algorithm, blue the weighted Sudakov algorithm and orange the resampled, weighted Sudakov algorithm. The bands in the distribution reflect the minimum and maximum estimate encountered in 300 runs with different random seeds and thus give a rough estimate of the width of the distribution of estimates

The result for the Durham \(y_{45}\) jet observable [35] is shown in Fig. 3. The smooth red curve gives the unaltered result; the jagged blue curve represents the same parton shower with equally many events, but where the weighted Sudakov algorithm, Algorithm 2, has been used. The resulting distribution is so jagged and distorted that the two appear incompatible at first sight. Only when interleaved resampling is added in between the emission steps (orange) does it become clear that the curves actually represent the same statistical distribution. This beautifully illustrates the power of the resampling method. To avoid oversampling, the resampling is turned off for events for which the shower has terminated.

We stress that, while the usage of the weighted Sudakov algorithm in the above examples is completely unmotivated from a statistical perspective, since the default starting curve already has optimal statistics, the resampling procedure can equally well be applied in scenarios with real weight problems, arising for example from applying the weighted Sudakov algorithm in a sub-leading \(N_c\) parton shower.

Fig. 3

The Durham \(y_{45}\) observable as implemented in [23] (default). The same observable when sampled with Algorithm 2 without resampling (no resampling), and when sampled with interleaved resampling after each emission or rejection step in Algorithm 2 (resampling). In the lower panel we show the range of variation as in Fig. 2

5 Conclusion and outlook

In this note we advocate the use of the resampling method to drastically improve the statistical convergence of parton shower algorithms in which weighted algorithms are used. We also stress that the resampling procedure can always be applied separately to only those weights which have been generated by the weighted shower algorithm, leaving weights from the hard process aside. Inclusive quantities are therefore expected to be reproduced just as would be the case for unweighted shower evolution.

The implementation tests, performed in the most simplistic way, beautifully illustrate the power of the basic resampling method, and demonstrate its usefulness for parton showers. Nevertheless, we argue that further improvement could be achievable by various extensions of the algorithm to mitigate the ancestor depletion problem. One possible approach is to furnish the proposed resampling-based algorithm with an additional so-called backward simulation pass, generating randomly new, non-collapsed ancestral paths on the stochastic grid formed by the events generated by the algorithm [36, 37]. Another direction of improvement is the parallelisation of the algorithm, which is essential in scenarios with large sample sizes. Parallelisation of sequential Monte Carlo methods is non-trivial due to the “global” sample interaction imposed by the resampling operation. Nevertheless, a possibility is to divide the full sample into a number of batches, or islands, handled by different processors, and to subject—“locally”—each island to sequential weighting and resampling.

In such a distributed system, statistical convergence may, following [38, 39], be improved further by interleaving the evolution of the distinct islands with island-level resampling operations, in which the islands are resampled according to their average weights.

At the more conceptual level, let us remark that the method we suggest represents a first step in a change of paradigm, where parton showers are viewed as tools for generating correct statistical distributions, rather than tools for simulating independent events. We believe that the most advanced and precise high energy physics simulations, if they are to run efficiently, will have no chance of avoiding weighted algorithms, and as such will need to rely heavily on methods like the resampling method outlined in this note. We will further investigate the method, including the possibility of making cancellations in higher order corrections explicit already in intermediate steps of such a Monte Carlo algorithm, rather than by adding large weights at the very end. We will also address the structural challenges in implementing these methods in existing event generators, which we hope will help to make design decisions, keeping in mind the necessity of methods like the one presented here.