Proceeding Paper

Simulation-Based Inference of Bayesian Hierarchical Models While Checking for Model Misspecification †

CNRS & Sorbonne Université, UMR 7095, Institut d’Astrophysique de Paris, 98 bis boulevard Arago, F-75014 Paris, France
Presented at the 41st International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering, Paris, France, 18–22 July 2022.
Phys. Sci. Forum 2022, 5(1), 4; https://doi.org/10.3390/psf2022005004
Published: 2 November 2022

Abstract

This paper presents recent methodological advances for performing simulation-based inference (SBI) of a general class of Bayesian hierarchical models (BHMs) while checking for model misspecification. Our approach is based on a two-step framework. First, the latent function that appears as a second layer of the BHM is inferred and used to diagnose possible model misspecification. Second, target parameters of the trusted model are inferred via SBI. Simulations used in the first step are recycled for score compression, which is necessary for the second step. As a proof of concept, we apply our framework to a prey–predator model built upon the Lotka–Volterra equations and involving complex observational processes.

1. Introduction

Model misspecification is a long-standing problem for Bayesian inference: when the model differs from the actual data-generating process, posteriors tend to be biased and/or overly concentrated. In this paper, we are interested in the problem of model misspecification for a particular, but common, class of Bayesian hierarchical models (BHMs): those that involve a latent function, such as the primordial power spectrum in cosmology (e.g., [1]) or the population model in genetics (e.g., [2]).
Simulation-based inference (SBI) only provides the posterior of top-level target parameters and marginalizes over all other latent variables of the BHM. Alone, it is therefore unable to diagnose whether the model is misspecified. Key insights regarding the issue of model misspecification can usually be obtained from the posterior distribution of the latent function, as there often exists an independent theoretical understanding of its values. An approximate posterior for the latent function (a much higher-dimensional quantity than the target vector of parameters) can be obtained using SELFI (simulator expansion for likelihood-free inference, [1]), an approach based on the likelihood of an alternative parametric model, constructed by linearizing model predictions around an expansion point.
This paper presents a framework that combines SELFI and SBI while recycling the necessary simulations. The simulator is first linearized to obtain the SELFI posterior of the latent function. Next, the same simulations are used for data compression to the score function (the gradient of the log-likelihood with respect to the parameters), and the final SBI posterior of target parameters is obtained.

2. Method

2.1. Bayesian Hierarchical Models with a Latent Function

In this paper, we assume a given BHM consisting of the following variables: $\boldsymbol{\omega} \in \mathbb{R}^N$ (vector of $N$ target parameters), $\boldsymbol{\theta} \in \mathbb{R}^S$ (vector containing the values of the latent function $\theta$ at $S$ support points), $\boldsymbol{\Phi} \in \mathbb{R}^P$ (data vector of $P$ components), and $\tilde{\boldsymbol{\omega}} \in \mathbb{R}^N$ (compressed data vector of size $N$). We typically expect $N \sim \mathcal{O}(5\text{--}10)$ target parameters and $S \sim \mathcal{O}(10^2\text{--}10^3)$ support points; $P$ can be any number and as large as $\mathcal{O}(10^7)$ for complex data models. We further assume that $\boldsymbol{\omega}$ and $\boldsymbol{\theta}$ are linked by a deterministic function $\mathcal{T}$, usually theoretically well-understood and numerically cheap. Therefore, the expensive and potentially misspecified part of the BHM is the probabilistic simulator linking the latent function $\boldsymbol{\theta}$ to the data $\boldsymbol{\Phi}$, $P(\boldsymbol{\Phi}|\boldsymbol{\theta})$. The deterministic compression step $\mathcal{C}$ linking $\boldsymbol{\Phi}$ to $\tilde{\boldsymbol{\omega}}$ is discussed in Section 2.4.

2.2. Latent Function Inference with SELFI

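The first part of the framework proposed in this paper is to infer the latent function $\boldsymbol{\theta}$ conditional on observed data $\boldsymbol{\Phi}_\mathrm{O}$. This is an inference problem in high dimension ($S$, the number of support points for the latent function $\theta$), which means that usual SBI frameworks, allowing a general exploration of parameter space, will fail and that stronger assumptions are required. SELFI [1] relies upon the simplification of the inference problem around an expansion point $\boldsymbol{\theta}_0$.
The first assumption is a Taylor expansion (linearization) of the mean data model around $\boldsymbol{\theta}_0$. Namely, if $\hat{\boldsymbol{\Phi}}_{\boldsymbol{\theta}} \equiv \mathrm{E}\left[\boldsymbol{\Phi}_{\boldsymbol{\theta}}\right]$ is the expectation value of $\boldsymbol{\Phi}_{\boldsymbol{\theta}}$, where $\boldsymbol{\Phi}_{\boldsymbol{\theta}}$ are simulations of $\boldsymbol{\Phi}$ given $\boldsymbol{\theta}$ (i.e., $\boldsymbol{\Phi}_{\boldsymbol{\theta}} \sim P(\boldsymbol{\Phi}|\boldsymbol{\theta})$), we assume that
$$ \hat{\boldsymbol{\Phi}}_{\boldsymbol{\theta}} \approx \mathbf{f}_0 + \nabla\mathbf{f}_0 \cdot (\boldsymbol{\theta} - \boldsymbol{\theta}_0) \equiv \mathbf{f}(\boldsymbol{\theta}), \tag{1} $$
where $\mathbf{f}_0 \equiv \hat{\boldsymbol{\Phi}}_{\boldsymbol{\theta}_0}$ is the mean data model at the expansion point $\boldsymbol{\theta}_0$, and $\nabla\mathbf{f}_0$ is the gradient of the mean data model at the expansion point (for simplicity, we write $\nabla\mathbf{f}_0 = \nabla_{\boldsymbol{\theta}}\mathbf{f}_0$, where the gradient is taken with respect to $\boldsymbol{\theta}$). The second assumption is that the (true) implicit likelihood of the problem is replaced by a Gaussian effective likelihood: $P(\boldsymbol{\Phi}_\mathrm{O}|\boldsymbol{\theta}) \equiv \exp\left[\hat{\ell}_{\boldsymbol{\theta}}(\boldsymbol{\theta})\right]$ with
$$ -2\hat{\ell}_{\boldsymbol{\theta}}(\boldsymbol{\theta}) \equiv \log\left|2\pi\mathbf{C}_0\right| + \left[\boldsymbol{\Phi}_\mathrm{O} - \mathbf{f}(\boldsymbol{\theta})\right]^\intercal \mathbf{C}_0^{-1} \left[\boldsymbol{\Phi}_\mathrm{O} - \mathbf{f}(\boldsymbol{\theta})\right], \tag{2} $$
where $\mathbf{C}_0$ is the data covariance matrix at the expansion point $\boldsymbol{\theta}_0$.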
The SELFI framework is fully characterized by $\mathbf{f}_0$, $\mathbf{C}_0$, and $\nabla\mathbf{f}_0$, which, if unknown, can be evaluated through forward simulations only. The numerical computation requires $N_0$ simulations at the expansion point (to evaluate the empirical mean $\mathbf{f}_0$ and empirical covariance matrix $\mathbf{C}_0$), and $N_s$ simulations in each direction of parameter space (to evaluate the empirical gradient $\nabla\mathbf{f}_0$ via first-order forward finite differences). The total is $N_0 + N_s \times S$ simulations; $N_0$ and $N_s$ should be of the order of the dimensionality of the data space $P$, giving a total cost of $\mathcal{O}(P(S+1))$ model evaluations.
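The simulation budget described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the pyselfi implementation: `toy_simulator`, `TOY_A`, the step size `h`, and all numerical values are hypothetical stand-ins for an expensive black-box simulator.

```python
import numpy as np

# Hypothetical 4x3 mixing matrix used only by the toy simulator below.
TOY_A = np.array([[1.0, 0.5, 0.0],
                  [0.0, 1.0, 0.5],
                  [0.5, 0.0, 1.0],
                  [1.0, 1.0, 1.0]])

def toy_simulator(theta, rng):
    """Hypothetical stand-in for a draw from P(Phi | theta)."""
    mean = TOY_A @ theta + 0.1 * (TOY_A @ theta) ** 2
    return mean + 0.05 * rng.standard_normal(mean.shape)

def selfi_expansion(simulator, theta_0, N_0, N_s, h, rng):
    """Estimate f_0, C_0 and grad f_0 using N_0 + N_s * S forward simulations."""
    S = theta_0.size
    # N_0 simulations at the expansion point: empirical mean and covariance.
    sims_0 = np.array([simulator(theta_0, rng) for _ in range(N_0)])
    f_0 = sims_0.mean(axis=0)
    C_0 = np.cov(sims_0, rowvar=False)
    # N_s simulations per direction: first-order forward finite differences.
    grad_f_0 = np.empty((f_0.size, S))
    for s in range(S):
        theta_s = theta_0.copy()
        theta_s[s] += h
        sims_s = np.array([simulator(theta_s, rng) for _ in range(N_s)])
        grad_f_0[:, s] = (sims_s.mean(axis=0) - f_0) / h
    return f_0, C_0, grad_f_0

rng = np.random.default_rng(0)
theta_0 = np.array([1.0, 2.0, 3.0])
f_0, C_0, grad_f_0 = selfi_expansion(toy_simulator, theta_0,
                                     N_0=200, N_s=200, h=0.1, rng=rng)
```

The step size trades finite-difference bias against Monte Carlo noise in the averaged simulations; the value used here is arbitrary.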
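To fully characterize the Bayesian problem, one requires a prior on $\boldsymbol{\theta}$, $P(\boldsymbol{\theta})$. Any prior can be used if one is ready to use numerical techniques to explore the posterior (such as standard Markov Chain Monte Carlo), using the linearized data model and Gaussian effective likelihood. However, a remarkable analytic result with SELFI is that, if the prior is Gaussian with a mean equal to the expansion point $\boldsymbol{\theta}_0$, i.e.,
$$ -2\log P(\boldsymbol{\theta}) \equiv \log\left|2\pi\mathbf{S}\right| + (\boldsymbol{\theta} - \boldsymbol{\theta}_0)^\intercal \mathbf{S}^{-1} (\boldsymbol{\theta} - \boldsymbol{\theta}_0), \tag{3} $$
then the effective posterior is also Gaussian:
$$ -2\log P(\boldsymbol{\theta}|\boldsymbol{\Phi}_\mathrm{O}) \equiv \log\left|2\pi\boldsymbol{\Gamma}\right| + (\boldsymbol{\theta} - \boldsymbol{\gamma})^\intercal \boldsymbol{\Gamma}^{-1} (\boldsymbol{\theta} - \boldsymbol{\gamma}). \tag{4} $$
The posterior mean and covariance matrix are given by
$$ \boldsymbol{\gamma} \equiv \boldsymbol{\theta}_0 + \boldsymbol{\Gamma}\, (\nabla\mathbf{f}_0)^\intercal \mathbf{C}_0^{-1} (\boldsymbol{\Phi}_\mathrm{O} - \mathbf{f}_0), \tag{5} $$
$$ \boldsymbol{\Gamma} \equiv \left[ (\nabla\mathbf{f}_0)^\intercal \mathbf{C}_0^{-1} \nabla\mathbf{f}_0 + \mathbf{S}^{-1} \right]^{-1} \tag{6} $$
(see [1], Appendix B, for a derivation). They are fully characterized by the expansion variables $\boldsymbol{\theta}_0$, $\mathbf{f}_0$, $\mathbf{C}_0$, and $\nabla\mathbf{f}_0$, as well as the prior covariance matrix $\mathbf{S}$.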
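As an illustration, the posterior mean and covariance of Equations (5) and (6) reduce to a few linear-algebra calls. The sketch below assumes the standard linear-Gaussian form (posterior precision equal to the linearized likelihood precision plus the prior precision); the function name `selfi_posterior` and the one-dimensional demo values are hypothetical.

```python
import numpy as np

def selfi_posterior(theta_0, f_0, C_0, grad_f_0, S, phi_obs):
    """Gaussian effective posterior (gamma, Gamma) of the linearized model."""
    # C_0^{-1} grad f_0, computed with a solve instead of an explicit inverse.
    Cinv_grad = np.linalg.solve(C_0, grad_f_0)
    # Posterior precision: (grad f_0)^T C_0^{-1} grad f_0 + S^{-1}.
    precision = grad_f_0.T @ Cinv_grad + np.linalg.inv(S)
    Gamma = np.linalg.inv(precision)
    # Posterior mean; C_0 is symmetric, so Cinv_grad.T = (grad f_0)^T C_0^{-1}.
    gamma = theta_0 + Gamma @ (Cinv_grad.T @ (phi_obs - f_0))
    return gamma, Gamma

# One-dimensional check with unit prior variance, unit noise variance and
# unit gradient: the posterior variance is 1/2 and the mean moves halfway
# from the prior mean (0) to the observation (2).
gamma_demo, Gamma_demo = selfi_posterior(
    theta_0=np.array([0.0]), f_0=np.array([0.0]),
    C_0=np.array([[1.0]]), grad_f_0=np.array([[1.0]]),
    S=np.array([[1.0]]), phi_obs=np.array([2.0]))
# gamma_demo -> [1.0], Gamma_demo -> [[0.5]]
```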

2.3. Check for Model Misspecification

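The SELFI posterior can be used as a check for model misspecification. Visually checking the reconstructed $\boldsymbol{\gamma}$ and $\boldsymbol{\Gamma}$ can yield interesting insights, especially if the latent function has some properties (such as an expected shape, periodicity, etc.) to which the data model may be sensitive if misspecified (see Section 4.2).
If a quantitative check for model misspecification is desired, we propose using the Mahalanobis distance between the reconstruction $\boldsymbol{\gamma}$ and the prior distribution $P(\boldsymbol{\theta})$, defined formally by
$$ d_\mathrm{M}(\boldsymbol{\theta}, \boldsymbol{\theta}_0 | \mathbf{S}) \equiv \sqrt{(\boldsymbol{\theta} - \boldsymbol{\theta}_0)^\intercal \mathbf{S}^{-1} (\boldsymbol{\theta} - \boldsymbol{\theta}_0)}. \tag{7} $$
The value of $d_\mathrm{M}(\boldsymbol{\gamma}, \boldsymbol{\theta}_0 | \mathbf{S})$ for the SELFI posterior mean $\boldsymbol{\gamma}$ can be compared to an ensemble of values of $d_\mathrm{M}(\boldsymbol{\theta}_{\boldsymbol{\omega}}, \boldsymbol{\theta}_0 | \mathbf{S})$ for simulated latent functions $\boldsymbol{\theta}_{\boldsymbol{\omega}} = \mathcal{T}(\boldsymbol{\omega})$, where samples $\boldsymbol{\omega}$ are drawn from the prior $P(\boldsymbol{\omega})$.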
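The quantitative check can be sketched as follows, assuming the usual Mahalanobis distance (with a square root). The map `toy_T`, the prior sampler, the prior covariance, and the "reconstruction" `gamma` are made-up values for illustration only.

```python
import numpy as np

def mahalanobis(theta, theta_0, S):
    """Mahalanobis distance of theta from a Gaussian with mean theta_0, cov S."""
    d = theta - theta_0
    return float(np.sqrt(d @ np.linalg.solve(S, d)))

def prior_distance_ensemble(T, prior_sampler, theta_0, S, n_samples, rng):
    """Distances for latent functions T(omega), with omega drawn from the prior."""
    return np.array([mahalanobis(T(prior_sampler(rng)), theta_0, S)
                     for _ in range(n_samples)])

def toy_T(omega):
    """Hypothetical cheap deterministic map from parameters to latent function."""
    return np.array([omega[0], omega[1], omega[0] + omega[1]])

rng = np.random.default_rng(0)
omega_0 = np.array([0.5, 0.2])
theta_0 = toy_T(omega_0)
S = 0.01 * np.eye(3)  # hypothetical prior covariance on theta

def prior_sampler(rng):
    return omega_0 + 0.03 * rng.standard_normal(2)

ensemble = prior_distance_ensemble(toy_T, prior_sampler, theta_0, S, 500, rng)
gamma = theta_0 + np.array([0.05, -0.02, 0.01])  # made-up reconstruction
d_obs = mahalanobis(gamma, theta_0, S)
# Fraction of prior simulations lying further from theta_0 than gamma does.
fraction_larger = float(np.mean(ensemble > d_obs))
```

A reconstruction whose distance is far in the tail of the ensemble would be a warning sign; where to place the cut is a judgment call that the text leaves open.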

2.4. Score Compression and Simulation-Based Inference

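Having checked the BHM for model misspecification, we now address the second part of the framework, aimed at inferring top-level parameters $\boldsymbol{\omega}$ given observations. SBI is known to be difficult when the dimensionality of the data space $P$ is high. For this reason, data compression is usually necessary. Data compression can be thought of as an additional layer at the bottom of the BHM, made of a deterministic function $\mathcal{C}$ acting on $\boldsymbol{\Phi}$. In practical scenarios, data compression should preserve as much information about $\boldsymbol{\omega}$ as possible, meaning that compressed summaries $\mathcal{C}(\boldsymbol{\Phi})$ should be as close as possible to sufficient summary statistics of $\boldsymbol{\Phi}$, i.e., $P(\boldsymbol{\omega}|\mathcal{C}(\boldsymbol{\Phi})) = P(\boldsymbol{\omega}|\boldsymbol{\Phi})$.
Here, we propose to use score compression [3]. We make the assumption (for compression only, not for later inference) that $P(\boldsymbol{\Phi}|\boldsymbol{\omega})$ is Gaussian distributed: $P(\boldsymbol{\Phi}_\mathrm{O}|\boldsymbol{\omega}) \equiv \exp\left[\hat{\ell}_{\boldsymbol{\omega}}(\boldsymbol{\omega})\right]$, where $\hat{\ell}_{\boldsymbol{\omega}}(\boldsymbol{\omega}) = \hat{\ell}_{\boldsymbol{\theta}}(\mathcal{T}(\boldsymbol{\omega}))$ (see Equation (2)). The score function $\nabla_{\boldsymbol{\omega}}\hat{\ell}_{\boldsymbol{\omega}_0}$ is the gradient of this log-likelihood with respect to the parameters $\boldsymbol{\omega}$ at a fiducial point $\boldsymbol{\omega}_0$ in parameter space. Using as fiducial point the values that generate the SELFI expansion point (i.e., $\boldsymbol{\omega}_0$ such that $\boldsymbol{\theta}_0 = \mathcal{T}(\boldsymbol{\omega}_0)$), a quasi maximum-likelihood estimator for the parameters is $\tilde{\boldsymbol{\omega}}_\mathrm{O} \equiv \boldsymbol{\omega}_0 + \mathbf{F}_0^{-1} \nabla_{\boldsymbol{\omega}}\hat{\ell}_{\boldsymbol{\omega}_0}$, where the Fisher matrix $\mathbf{F}_0$ and the gradient of the log-likelihood are evaluated at $\boldsymbol{\omega}_0$. Compression of $\boldsymbol{\Phi}_\mathrm{O}$ to $\tilde{\boldsymbol{\omega}}_\mathrm{O}$ yields $N$ compressed statistics that are optimal in the sense that they preserve the Fisher information content of the data [3].
In our case, the covariance matrix $\mathbf{C}_0$ is assumed not to depend on parameters ($\nabla_{\boldsymbol{\omega}}\mathbf{C}_0 = \mathbf{0}$), and the expression for $\mathcal{C}(\boldsymbol{\Phi})$ is therefore
$$ \mathcal{C}(\boldsymbol{\Phi}) = \tilde{\boldsymbol{\omega}} \equiv \boldsymbol{\omega}_0 + \mathbf{F}_0^{-1} (\nabla_{\boldsymbol{\omega}}\mathbf{f}_0)^\intercal \mathbf{C}_0^{-1} (\boldsymbol{\Phi} - \mathbf{f}_0). \tag{8} $$
The Fisher matrix of the problem further takes a simple form:
$$ \mathbf{F}_0 \equiv -\mathrm{E}\left[ \nabla_{\boldsymbol{\omega}}\nabla_{\boldsymbol{\omega}} \hat{\ell}_{\boldsymbol{\omega}_0} \right] = (\nabla_{\boldsymbol{\omega}}\mathbf{f}_0)^\intercal \mathbf{C}_0^{-1} \nabla_{\boldsymbol{\omega}}\mathbf{f}_0. \tag{9} $$
We therefore need to evaluate
$$ \nabla_{\boldsymbol{\omega}}\mathbf{f}_0 = \nabla\mathbf{f}_0 \cdot \left. \frac{\partial \mathcal{T}(\boldsymbol{\omega})}{\partial \boldsymbol{\omega}} \right|_{\boldsymbol{\omega} = \boldsymbol{\omega}_0}. \tag{10} $$
Importantly, in Equations (8)–(10), $\mathbf{C}_0$ and $\nabla\mathbf{f}_0$ have already been computed for latent function inference with SELFI. The only missing quantity is the second matrix in the right-hand side of Equation (10), that is, $\nabla_{\boldsymbol{\omega}}\mathcal{T}_0$, the gradient of $\mathcal{T}$ evaluated at $\boldsymbol{\omega}_0$. If unknown, its computation (e.g., via finite differences) does not require any more simulation of $\boldsymbol{\Phi}$. It is usually easy, as there are only $N$ directions in parameter space and $\mathcal{T}$ is the numerically cheap part of the BHM. We note that, because we have to calculate $\mathbf{F}_0$, we can easily get the Fisher–Rao distance between any simulated summaries $\tilde{\boldsymbol{\omega}}$ and the observed summaries $\tilde{\boldsymbol{\omega}}_\mathrm{O}$,
$$ d_\mathrm{FR}(\tilde{\boldsymbol{\omega}}, \tilde{\boldsymbol{\omega}}_\mathrm{O}) \equiv \sqrt{(\tilde{\boldsymbol{\omega}} - \tilde{\boldsymbol{\omega}}_\mathrm{O})^\intercal \mathbf{F}_0 (\tilde{\boldsymbol{\omega}} - \tilde{\boldsymbol{\omega}}_\mathrm{O})}, \tag{11} $$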
which can be used by any non-parametric SBI method.
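The compression step and the associated distance can be sketched as below. This is a minimal sketch assuming a parameter-independent covariance and a Fisher–Rao distance defined with a square root; `score_compression`, `fisher_rao`, and the random demo matrices are hypothetical and only serve to check the algebra on exactly linear, noiseless data.

```python
import numpy as np

def score_compression(grad_f_0, grad_T_0, C_0, f_0, omega_0):
    """Build the compressor C(Phi) and the Fisher matrix F_0 at omega_0."""
    grad_omega_f_0 = grad_f_0 @ grad_T_0           # chain rule, shape (P, N)
    Cinv_g = np.linalg.solve(C_0, grad_omega_f_0)  # C_0^{-1} grad_omega f_0
    F_0 = grad_omega_f_0.T @ Cinv_g                # Fisher matrix, shape (N, N)

    def compress(phi):
        score = Cinv_g.T @ (phi - f_0)             # gradient of the log-likelihood
        return omega_0 + np.linalg.solve(F_0, score)

    return compress, F_0

def fisher_rao(omega_tilde, omega_tilde_obs, F_0):
    """Distance between two compressed summaries, weighted by F_0."""
    d = omega_tilde - omega_tilde_obs
    return float(np.sqrt(d @ F_0 @ d))

# Demo with hypothetical random matrices (P = 6 data points, S = 4, N = 2).
rng = np.random.default_rng(1)
grad_f_0 = rng.standard_normal((6, 4))
grad_T_0 = rng.standard_normal((4, 2))
C_0 = 0.5 * np.eye(6)
f_0 = rng.standard_normal(6)
omega_0 = np.array([0.5, 0.2])
compress, F_0 = score_compression(grad_f_0, grad_T_0, C_0, f_0, omega_0)

# For noiseless data that are exactly linear in omega, compression
# recovers the generating parameters.
omega_true = omega_0 + np.array([0.01, -0.02])
phi = f_0 + grad_f_0 @ grad_T_0 @ (omega_true - omega_0)
omega_tilde = compress(phi)
```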
We specify a prior $P(\boldsymbol{\omega})$ (typically peaking at or centered on $\boldsymbol{\omega}_0$, for consistency with the assumptions made for data compression). Having defined $\mathcal{C}$, we now have a full BHM that maps $\boldsymbol{\omega}$ (of dimension $N$) to compressed summaries $\tilde{\boldsymbol{\omega}}$ (of size $N$) and has been checked for model misspecification for the part linking $\boldsymbol{\theta}$ to $\boldsymbol{\Phi}$. We can then proceed with SBI via usual techniques. These can include likelihood-free rejection sampling, but also more sophisticated techniques such as DELFI (e.g., [4,5]) or BOLFI (e.g., [6,7,8]).

3. Lotka–Volterra BHM

3.1. Lotka–Volterra Solver

The Lotka–Volterra equations describe the dynamics of an ecological system in which two species interact, as a pair of first-order non-linear differential equations:
$$ \frac{\mathrm{d}x}{\mathrm{d}t} = \alpha x - \beta x y, \tag{12} $$
$$ \frac{\mathrm{d}y}{\mathrm{d}t} = \delta x y - \gamma y, \tag{13} $$
where $x(t)$ is the number of prey at time $t$, and $y(t)$ is the number of predators at time $t$. The model is characterized by $\boldsymbol{\omega} = (\alpha, \beta, \gamma, \delta)$, a vector of four real parameters describing the interaction of the two species.
The initial conditions of the problem, $[x(0), y(0)] = (x_0, y_0)$, are assumed to be exactly known. Throughout the paper, timestepping and number of timesteps are fixed: $t_i = i\,\Delta t$ for $0 \le i < S/2$.
The function $\mathcal{T}$ is an algorithm that numerically solves the ordinary differential equations. For simplicity, we choose an explicit Euler method: for all $0 \le i < S/2 - 1$,
$$ x(t_{i+1}) = x(t_i) \times \left\{ 1 + \left[ \alpha - \beta\, y(t_i) \right] \times \Delta t \right\}, \tag{14} $$
$$ y(t_{i+1}) = y(t_i) \times \left\{ 1 + \left[ \delta\, x(t_i) - \gamma \right] \times \Delta t \right\}. \tag{15} $$
The latent function $\theta(t)$ is a concatenation of $x(t)$ and $y(t)$ evaluated at the timesteps of the problem. The corresponding vector is $\boldsymbol{\theta} \equiv \left( \{x(t_i)\}_{0 \le i < S/2}, \{y(t_i)\}_{0 \le i < S/2} \right)$, of size $S$.
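The explicit Euler solver described above fits in a short function. The sketch below uses the ground-truth parameters and initial conditions quoted later in the paper; the timestep `dt` and the number of steps per species are assumptions made for illustration, since the text does not state them.

```python
import numpy as np

def lotka_volterra_euler(omega, x0, y0, dt, n_steps):
    """Explicit Euler solution of the Lotka-Volterra equations.

    Returns theta: the concatenation of x(t_i) and y(t_i), of size 2 * n_steps.
    """
    alpha, beta, gamma, delta = omega
    x = np.empty(n_steps)
    y = np.empty(n_steps)
    x[0], y[0] = x0, y0
    for i in range(n_steps - 1):
        # Both updates use the populations at time t_i (explicit scheme).
        x[i + 1] = x[i] * (1.0 + (alpha - beta * y[i]) * dt)
        y[i + 1] = y[i] * (1.0 + (delta * x[i] - gamma) * dt)
    return np.concatenate([x, y])

omega_gt = (0.55, 0.2, 0.2, 0.05)  # ground truth quoted in Section 4
theta = lotka_volterra_euler(omega_gt, x0=10.0, y0=5.0,
                             dt=0.1,       # assumed timestep
                             n_steps=50)   # assumed S/2
```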

3.2. Lotka–Volterra Observer

3.2.1. Full Data Model

To go from θ to Φ , we assume a complex, probabilistic observational process of prey and predator populations, later referred to as “model A” and defined as follows.
Signal. The (unobserved) signal $s_z$ is a delayed and non-linearly perturbed observation of the true population function for species $z \in \{x, y\}$, modulated by some seasonal efficiency $e_z(t)$. Formally, $s_x(0) = x_0$, $s_y(0) = y_0$, and for $0 \le i < S/2 - 1$,
$$ s_x(t_{i+1}) = e_x(t_i) \left[ x(t_i) - p\, x(t_i)\, y(t_i) + q\, x(t_i)^2 \right], \tag{16} $$
$$ s_y(t_{i+1}) = e_y(t_i) \left[ y(t_i) + p\, x(t_i)\, y(t_i) - q\, y(t_i)^2 \right]. \tag{17} $$
These equations involve two parameters: $p$ accounts for hunts between $t_i$ and $t_{i+1}$ (temporarily making prey more likely to hide and predators more likely to be visible), and $q$ accounts for the gregariousness of prey and independence of predators. The free functions $e_x(t)$ and $e_y(t)$, valued in $[0, 1]$, describe how prey and predators are likely to be detectable at any time, accounting, for example, for seasonal variation (hibernation, migration).
Noise. The signal $s_z$ is subject to additive noise, giving a noisy signal $u_z(t) = s_z(t) + n_z^\mathrm{D}(t) + n_z^\mathrm{O}(t)$, where the noise has two components:
  • Demographic Gaussian noise with zero mean and variance proportional to the true underlying population, i.e., $n_x^\mathrm{D}(t) \sim \mathcal{G}\left[0, r\,x(t)\right]$ and $n_y^\mathrm{D}(t) \sim \mathcal{G}\left[0, r\,y(t)\right]$. The parameter $r$ gives the strength of demographic noise.
  • Observational Gaussian noise that accounts for observer efficiency, coupling prey and predators such that
    $$ \begin{bmatrix} n_x^\mathrm{O}(t) \\ n_y^\mathrm{O}(t) \end{bmatrix} \sim \mathcal{G}\left[ \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \; s \begin{pmatrix} y(t) & t\sqrt{x(t)\,y(t)} \\ t\sqrt{x(t)\,y(t)} & x(t) \end{pmatrix} \right]. \tag{18} $$
    The parameter $s$ gives the overall amplitude of observational noise, and the parameter $t$ controls the strength of the non-diagonal component (it should be chosen such that the covariance matrix appearing in Equation (18) is positive semi-definite).
Censoring. Finally, observed data are a censored and thresholded version of the noisy signal: for each timestep $t_i$, $\Phi_z(t_i) = m_z(t_i) \times \min\left[u_z(t_i), M_z\right]$, where $M_z$ is the maximum number of prey or predators that can be detected by the observer, and $m_z$ is a mask (taking either the value 0 or 1). Masked data points are discarded. The data vector is $\boldsymbol{\Phi} = \left( \{\Phi_x(t_i)\}, \{\Phi_y(t_i)\} \right)$. It contains $P \le S$ elements, depending on the number of masked timesteps for each species $z$ (formally, $P = \sum_{i=0}^{S/2-1} \left[ \delta_\mathrm{K}^{m_x(t_i),1} + \delta_\mathrm{K}^{m_y(t_i),1} \right]$, where $\delta_\mathrm{K}$ is a Kronecker delta symbol).
All of the free parameters ($p$, $q$, $r$, $s$, $t$, $M_x$, $M_y$) and free functions ($e_x(t)$, $e_y(t)$, $m_x(t)$, $m_y(t)$) appearing in the Lotka–Volterra observer data model described in this section are assumed known and fixed throughout the paper. Parameters used are $x_0 = 10$, $y_0 = 5$, $p = 0.05$, $q = 0.01$, $r = 0.15$, $s = 0.05$, $t = 0.2$.
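The censoring step is the simplest part of the observer to illustrate: threshold the noisy signal, then drop masked timesteps. The sketch below is only an illustration; the noisy signals, masks, and thresholds are made-up values, since the text does not list the ones used.

```python
import numpy as np

def censor(u, mask, M):
    """Threshold the noisy signal at M, then discard masked points (mask == 0)."""
    thresholded = np.minimum(u, M)
    return thresholded[mask.astype(bool)]

# Hypothetical noisy signals, masks and detection thresholds.
u_x = np.array([3.0, 12.0, 7.0, 20.0])
u_y = np.array([1.0, 2.0, 3.0, 4.0])
m_x = np.array([1, 1, 0, 1])
m_y = np.array([0, 1, 1, 1])
M_x, M_y = 10.0, 3.0

# Data vector: surviving prey observations followed by predator observations.
phi = np.concatenate([censor(u_x, m_x, M_x), censor(u_y, m_y, M_y)])
P = int(m_x.sum() + m_y.sum())  # number of unmasked data points
```

Because masked points are removed, not zeroed, the length of the data vector equals the number of unmasked timesteps summed over both species.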

3.2.2. Simplified Data Model

In this section, we introduce “model B”, a simplified (misspecified) data model linking $\boldsymbol{\theta}$ to $\boldsymbol{\Phi}$. Model B assumes that underlying functions are directly observed, i.e., $s_z(t) = z(t)$. It omits observational noise, such that $u_z(t) = s_z(t) + n_z^\mathrm{D}(t)$. In model B, parameters $p$, $q$, $s$, and $t$ are not involved, and the value of $r$ (strength of demographic noise) can be incorrect (we used $r = 0.105$). Finally, model B fails to account for the thresholds: $\Phi_z(t) = m_z(t)\, u_z(t)$.

4. Results

In this section, we apply the two-step inference method described in Section 2 to the Lotka–Volterra BHM introduced in Section 3. We generate mock data $\boldsymbol{\Phi}_\mathrm{O}$ from model A, using ground truth parameters $\boldsymbol{\omega}_\mathrm{gt} = (\alpha_\mathrm{gt}, \beta_\mathrm{gt}, \gamma_\mathrm{gt}, \delta_\mathrm{gt}) = (0.55, 0.2, 0.2, 0.05)$. We assume that ground truth parameters are known a priori with a precision of approximately 3%. Consistently, we choose a Gaussian prior $P(\boldsymbol{\omega})$ with mean $\boldsymbol{\omega}_0 = (0.5768, 0.1963, 0.1968, 0.0484)$ and diagonal covariance matrix $\mathrm{diag}(0.0173^2, 0.0059^2, 0.0059^2, 0.0015^2)$.

4.1. Inference of Population Functions with SELFI

We first seek to reconstruct the latent population functions $x(t)$ and $y(t)$, conditional on the data $\boldsymbol{\Phi}_\mathrm{O}$, using SELFI. We choose as an expansion point the population functions simulated from the mean of the prior on $\boldsymbol{\omega}$, i.e., $\boldsymbol{\theta}_0 = \mathcal{T}(\boldsymbol{\omega}_0)$. We use $N_0 = 150$ and $N_s = 100$; the computational workload is therefore a fixed number of 10,150 simulations for each model. It is known a priori and perfectly parallel.
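As a quick consistency check, and assuming the simulation count $N_0 + N_s \times S$ given in Section 2.2 applies unchanged here, the quoted workload fixes the number of support points:

$$ N_0 + N_s \times S = 150 + 100 \times S = 10{,}150 \quad\Longrightarrow\quad S = 100, $$

that is, $S/2 = 50$ timesteps for each of the two population functions. This value of $S$ is inferred from the quoted numbers; the text does not state it explicitly.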
We adopt a Gaussian prior $P(\boldsymbol{\theta})$ and combine it with the effective likelihood to obtain the SELFI effective posterior $P(\boldsymbol{\theta}|\boldsymbol{\Phi}_\mathrm{O})$. Figure 1 (left panels) shows the inferred population functions $\boldsymbol{\gamma}$ in comparison with the prior mean and expansion point $\boldsymbol{\theta}_0$ and the ground truth $\boldsymbol{\theta}_\mathrm{gt}$. The figure shows $2\sigma$ credible regions for the prior and the posterior (i.e., $2\sqrt{\mathrm{diag}(\mathbf{S})}$ and $2\sqrt{\mathrm{diag}(\boldsymbol{\Gamma})}$, respectively). The full posterior covariance matrix $\boldsymbol{\Gamma}$ for each model is shown in the rightmost column of Figure 1.

4.2. Check for Model Misspecification

The inferred population functions allow us to check for model misspecification. From Figure 1, it is clear that model B fails to produce a plausible reconstruction of population functions: model B breaks the (pseudo-)periodicity of the predator population function $y(t)$, which is a property required by the model. In the bottom left-hand panels, the red lines differ in shape from fiducial functions $\mathcal{T}(\boldsymbol{\omega})$ (grey lines), and the credible intervals exclude the expansion point. On the contrary, with model A, the reconstructed population functions are consistent with the expansion point. The inference is unbiased, as the ground truth typically lies within the $2\sigma$ credible region of the reconstruction.
As a quantitative check, we compute the Mahalanobis distance between $\boldsymbol{\gamma}$ and $P(\boldsymbol{\theta})$ (Equation (7)) for each model. We find that $d_\mathrm{M}(\boldsymbol{\gamma}, \boldsymbol{\theta}_0 | \mathbf{S})$ is much smaller for model A than for model B (5.35 versus 12.54). The numbers can be compared to the empirical mean among our set of fiducial population functions, $\langle d_\mathrm{M}(\mathcal{T}(\boldsymbol{\omega}_n), \boldsymbol{\theta}_0 | \mathbf{S}) \rangle = 9.43$.
At this stage, we therefore consider that model B is excluded, and we proceed further with model A.

4.3. Score Compression

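As $\mathcal{T}$ is numerically cheap, we get $\nabla_{\boldsymbol{\omega}}\mathcal{T}_0$ via sixth-order central finite differences around $\boldsymbol{\omega}_0$, then obtain $\nabla_{\boldsymbol{\omega}}\mathbf{f}_0$ using Equation (10). This does not require any further evaluation of the data model $P(\boldsymbol{\Phi}|\boldsymbol{\theta})$, as $\nabla\mathbf{f}_0$ has already been computed.
Using Equations (8) and (9), we compress $\boldsymbol{\Phi}_\mathrm{O}$ and obtain $\tilde{\boldsymbol{\omega}}_\mathrm{O} = (0.7050, 0.2287, 0.1471, 0.0415)$.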
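For reference, a sixth-order central finite-difference Jacobian can be written with the standard seven-point stencil. The sketch below is one possible implementation, not necessarily the one used for the paper; the relative step size and the test function `toy_T` are assumptions.

```python
import numpy as np

# Standard sixth-order central stencil for a first derivative,
# at offsets -3h, -2h, -h, 0, +h, +2h, +3h.
OFFSETS = np.arange(-3, 4)
COEFFS = np.array([-1 / 60, 3 / 20, -3 / 4, 0.0, 3 / 4, -3 / 20, 1 / 60])

def jacobian_central6(T, omega_0, rel_step=1e-2):
    """Jacobian of T at omega_0, shape (S, N), one column per parameter."""
    omega_0 = np.asarray(omega_0, dtype=float)
    columns = []
    for n in range(omega_0.size):
        h = rel_step * abs(omega_0[n])  # assumes omega_0[n] is non-zero
        column = 0.0
        for k, c in zip(OFFSETS, COEFFS):
            if c == 0.0:
                continue  # the central point does not contribute
            omega = omega_0.copy()
            omega[n] += k * h
            column = column + c * T(omega)
        columns.append(column / h)
    return np.stack(columns, axis=1)

def toy_T(omega):
    """Hypothetical smooth map with a known analytic Jacobian."""
    return np.array([omega[0] ** 2, omega[0] * omega[1], np.sin(omega[1])])

J = jacobian_central6(toy_T, [1.0, 2.0])
J_exact = np.array([[2.0, 0.0],
                    [2.0, 1.0],
                    [0.0, np.cos(2.0)]])
```

Each parameter direction costs six evaluations of the cheap map, so the whole Jacobian needs $6N$ evaluations and no additional simulation of the data.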

4.4. Inference of Parameters Using Likelihood-Free Rejection Sampling

As a last step, we infer top-level parameters $\boldsymbol{\omega}$ given compressed summaries $\tilde{\boldsymbol{\omega}}_\mathrm{O}$. As the problem studied in this paper is sufficiently simple, we rely on the simplest solution for SBI, namely likelihood-free rejection sampling (sometimes also known as approximate Bayesian computation, e.g., [9]). To do so, we use the Fisher–Rao distance between simulated $\tilde{\boldsymbol{\omega}}$ and observed $\tilde{\boldsymbol{\omega}}_\mathrm{O}$, which comes naturally from score compression (see Equation (11)), and we set a threshold $\varepsilon = 2$. We draw samples from the prior $P(\boldsymbol{\omega})$, simulate $\tilde{\boldsymbol{\omega}}$, then accept $\boldsymbol{\omega}$ as a sample of $P(\boldsymbol{\omega}|\tilde{\boldsymbol{\omega}}_\mathrm{O})$ if $d_\mathrm{FR}(\tilde{\boldsymbol{\omega}}, \tilde{\boldsymbol{\omega}}_\mathrm{O}) < \varepsilon$, and reject it otherwise.
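The accept/reject loop just described can be sketched as follows. The prior mean and standard deviations are those quoted in Section 4; everything else (the summary simulator `toy_simulate_summary`, the diagonal Fisher matrix, and the observed summaries) is a hypothetical stand-in for the full chain of solver, observer, and score compression.

```python
import numpy as np

def rejection_sampling(prior_sampler, simulate_summary, omega_tilde_obs,
                       F_0, eps, n_draws, rng):
    """Likelihood-free rejection sampling with a Fisher-Rao distance threshold."""
    accepted = []
    for _ in range(n_draws):
        omega = prior_sampler(rng)
        omega_tilde = simulate_summary(omega, rng)
        d = omega_tilde - omega_tilde_obs
        if np.sqrt(d @ F_0 @ d) < eps:  # accept if d_FR < eps
            accepted.append(omega)
    return np.array(accepted)

omega_0 = np.array([0.5768, 0.1963, 0.1968, 0.0484])  # prior mean (Section 4)
sigma = np.array([0.0173, 0.0059, 0.0059, 0.0015])    # prior std (Section 4)

def prior_sampler(rng):
    return omega_0 + sigma * rng.standard_normal(4)

def toy_simulate_summary(omega, rng):
    """Hypothetical stand-in: compressed summaries scatter around omega."""
    return omega + sigma * rng.standard_normal(4)

F_0 = np.diag(1.0 / sigma ** 2)  # hypothetical Fisher matrix
omega_tilde_obs = omega_0.copy() # hypothetical observed summaries

rng = np.random.default_rng(0)
samples = rejection_sampling(prior_sampler, toy_simulate_summary,
                             omega_tilde_obs, F_0, eps=2.0,
                             n_draws=2000, rng=rng)
```

Lowering the threshold tightens the approximation to the posterior at the cost of a lower acceptance rate.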
In Figure 2, we find that the inference of top-level parameters is unbiased, with the ground truth $\boldsymbol{\omega}_\mathrm{gt}$ (dashed lines) lying within the $2\sigma$ credible region of the posterior. We observe that the data correctly drive some features that are not built into the prior, for instance, the degeneracy between $\alpha$ and $\gamma$, respectively, the reproduction rate of prey and the mortality rate of predators.

5. Conclusions

One of the biggest challenges in statistical data analysis is checking data models for misspecification, so as to obtain meaningful parameter inferences. In this work, we described a novel two-step simulation-based Bayesian approach, combining SELFI and SBI, which can be used to tackle this issue for a large class of models. BHMs to which the approach can be applied involve a latent function depending on parameters and observed through a complex probabilistic process. They are ubiquitous, e.g., in astrophysics and ecology.
In this paper, we introduced a prey–predator model, consisting of a numerical solver of the Lotka–Volterra system of equations and of a complex observational process of population functions. As a proof of concept, we applied our technique to this model and to a simplified (misspecified) version of it. We demonstrated successful identification of the misspecified model and unbiased inference of the parameters of the correct model.
In conclusion, the method developed constitutes a computationally efficient and easily applicable framework to perform SBI of BHMs while checking for model misspecification. It allows one to infer the latent function as an intermediate product, then to perform score compression at no additional simulation cost. This study opens up a new avenue to increase the robustness and reliability of Bayesian data analysis using fully non-linear, simulator-based models.

Funding

This work was done within the Aquila Consortium (https://aquila-consortium.org, accessed on 31 October 2022).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The code and data underlying this paper, as well as additional plots, have been made publicly available as part of the pyselfi code at https://pyselfi.florent-leclercq.eu (accessed on 31 October 2022).

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Leclercq, F.; Enzi, W.; Jasche, J.; Heavens, A. Primordial power spectrum and cosmology from black-box galaxy surveys. Mon. Not. R. Astron. Soc. 2019, 490, 4237–4253.
  2. Rousset, F. Inferences from Spatial Population Genetics. In Handbook of Statistical Genetics; John Wiley & Sons, Ltd.: London, UK, 2007; Chapter 28; pp. 945–979.
  3. Alsing, J.; Wandelt, B. Generalized massive optimal data compression. Mon. Not. R. Astron. Soc. Lett. 2018, 476, L60–L64.
  4. Papamakarios, G.; Murray, I. Fast ϵ-free Inference of Simulation Models with Bayesian Conditional Density Estimation. In Advances in Neural Information Processing Systems 29: Proceedings of the 30th International Conference on Neural Information Processing Systems, 5–10 December 2016, Barcelona, Spain; Curran Associates Inc.: Red Hook, NY, USA, 2016; pp. 1036–1044.
  5. Alsing, J.; Wandelt, B.; Feeney, S. Massive optimal data compression and density estimation for scalable, likelihood-free inference in cosmology. Mon. Not. R. Astron. Soc. 2018, 477, 2874–2885.
  6. Gutmann, M.U.; Corander, J. Bayesian Optimization for Likelihood-Free Inference of Simulator-Based Statistical Models. J. Mach. Learn. Res. 2016, 17, 1–47.
  7. Leclercq, F. Bayesian optimization for likelihood-free cosmological inference. Phys. Rev. D 2018, 98, 063511.
  8. Thomas, O.; Pesonen, H.; Sá-Leão, R.; de Lencastre, H.; Kaski, S.; Corander, J. Split-BOLFI for misspecification-robust likelihood free inference in high dimensions. arXiv, 2020; arXiv:2002.09377v1.
  9. Beaumont, M.A. Approximate Bayesian Computation. Annu. Rev. Stat. Its Appl. 2019, 6, 379–403.
Figure 1. SELFI inference of the population function $\boldsymbol{\theta}$ given the observed data $\boldsymbol{\Phi}_\mathrm{O}$, used as a check for model misspecification. Left panels: the prior mean and expansion point $\boldsymbol{\theta}_0$ and the effective posterior mean $\boldsymbol{\gamma}$ are represented as yellow and green/red lines, respectively, with their $2\sigma$ credible intervals. For comparison, simulations $\mathcal{T}(\boldsymbol{\omega})$ with $\boldsymbol{\omega} \sim P(\boldsymbol{\omega})$, and the ground truth $\boldsymbol{\theta}_\mathrm{gt}$ are shown in grey and blue, respectively. Middle and right panels: the prior covariance matrix $\mathbf{S}$ and the posterior covariance matrix $\boldsymbol{\Gamma}$, respectively. The first row corresponds to model A (see Section 3.2.1) and the second row to model B (see Section 3.2.2).
Figure 2. Simulation-based inference of the Lotka–Volterra parameters $\boldsymbol{\omega} = (\alpha, \beta, \gamma, \delta)$ given the compressed observed data $\tilde{\boldsymbol{\omega}}_\mathrm{O}$. Plots in the lower corner show two-dimensional marginals of the prior $P(\boldsymbol{\omega})$ (yellow contours) and of the SBI posterior $P(\boldsymbol{\omega}|\tilde{\boldsymbol{\omega}}_\mathrm{O})$ (green contours), using a threshold $\varepsilon = 2$ on the Fisher–Rao distance between simulated $\tilde{\boldsymbol{\omega}}$ and observed $\tilde{\boldsymbol{\omega}}_\mathrm{O}$, $d_\mathrm{FR}(\tilde{\boldsymbol{\omega}}, \tilde{\boldsymbol{\omega}}_\mathrm{O})$. Contours show 1, 2, and 3$\sigma$ credible regions. Plots on the diagonal show one-dimensional marginal distributions of the parameters, using the same color scheme. Dotted and dashed lines denote the position of the fiducial point for score compression $\boldsymbol{\omega}_0$ and of the ground truth parameters $\boldsymbol{\omega}_\mathrm{gt}$, respectively. The scatter plots in the upper corner illustrate score compression for pairs of parameters. There, red dots represent some simulated samples. Larger dots show some accepted samples (i.e., for which $d_\mathrm{FR}(\tilde{\boldsymbol{\omega}}, \tilde{\boldsymbol{\omega}}_\mathrm{O}) < \varepsilon$), with a color map corresponding to the value of one component of $\tilde{\boldsymbol{\omega}}$. In the color bars, pink lines denote the mean and $1\sigma$ scatter among accepted samples of the component of $\tilde{\boldsymbol{\omega}}$, and the orange line denotes its value in $\tilde{\boldsymbol{\omega}}_\mathrm{O}$.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Leclercq, F. Simulation-Based Inference of Bayesian Hierarchical Models While Checking for Model Misspecification. Phys. Sci. Forum 2022, 5, 4. https://doi.org/10.3390/psf2022005004
