A limit result for the prior predictive applied to checking for prior-data conflict
Introduction
The relevance of the results of a statistical analysis depends upon the inputs chosen by the analyst. If the inputs are deemed not to be appropriate, then one has reason to doubt any conclusions drawn. For a Bayesian statistical analysis, these inputs comprise the sampling model for the data (considered here as a collection of probability measures, one of which is supposed to have generated the observed data), the prior on the model parameter, and perhaps a loss function. One way to assess the relevance of these inputs is to see whether or not they make sense in light of the data collected. From this point of view, we have a model failure when the data observed is surprising for every probability distribution in the model. In this paper, we are concerned with assessing whether or not there is a prior-data conflict.
Intuitively, prior-data conflict arises when the likelihood is relatively high in a region where the prior is relatively low. While this seems easy to assess via a graph when dealing with a one-dimensional parameter, more formal methods seem necessary in general. Various approaches have been proposed for assessing prior-data conflict; see, for example, Young and Pettit (1996), Evans and Moshonov (2006), and Marshall and Spiegelhalter (2007). Some Bayesian model checking methods, such as those discussed in Box (1980) and Gelman et al. (1996), could also be considered as assessments of the prior, although this is confounded with checking the model. Separating the assessment of the prior from that of the model gives greater information concerning which of these choices might be in conflict with the data. Box (1980) proposed the tail probability $M(m(x) \le m(x_0))$, where $x_0$ is the observed data and $m(x) = \int_\Theta f_\theta(x)\pi(\theta)\,d\theta$ is the density of the data associated with the prior predictive measure $M$ on the sample space $\mathcal{X}$, where $f_\theta$ is the sampling density and the prior $\Pi$ has density $\pi$. We show in Example 1 that this is not appropriate for checking the prior.
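In this paper, we prove a consistency result for the check for prior-data conflict discussed in Evans and Moshonov (2006, 2007). Suppose that $T$ is a minimal sufficient statistic for the model, with prior predictive density $m_T$. The tail probability
$$M_T(m_T(t) \le m_T(T(x_0))) \qquad (1)$$
was proposed for checking for prior-data conflict, where $M_T$ is the prior predictive distribution of $T$. The following example motivates why (1) is suitable for checking for prior-data conflict.
Example 1 Location Normal
Suppose that $x = (x_1, \ldots, x_n)$ is a sample from a $N(\theta, 1)$ distribution where $\theta$ is unknown. Then a minimal sufficient statistic is given by $T(x) = \bar{x}$, and $\bar{x}$ converges almost surely to the true value $\theta_0$ as $n \to \infty$. Suppose we put a $N(\mu_0, \tau_0^2)$ prior on $\theta$. Then $M_T$ is the $N(\mu_0, \tau_0^2 + 1/n)$ distribution and this converges in distribution to the $N(\mu_0, \tau_0^2)$ distribution. Also $m_T(t)$ converges to the prior density $\pi(t)$ uniformly for $t$ in a compact set. A simple computation then shows that (1) converges almost surely to $\Pi(\pi(\theta) \le \pi(\theta_0)) = 2(1 - \Phi(|\theta_0 - \mu_0|/\tau_0))$, which assesses how far out in the tails of the prior $\theta_0$ lies.
Now consider the Box (1980) tail probability for this problem. We have that $M$ is the $N_n(\mu_0 1_n, I_n + \tau_0^2 1_n 1_n')$ distribution, where $I_n$ is the identity matrix and $1_n$ is a vector of ones, and so $M(m(x) \le m(x_0)) = 1 - G_n(q(x_0))$ with $q(x) = (x - \mu_0 1_n)'(I_n + \tau_0^2 1_n 1_n')^{-1}(x - \mu_0 1_n)$, where $G_n$ is the chi-squared$(n)$ distribution function. The quadratic form can be decomposed as $q(x_0) = (n-1)s^2 + n(\bar{x}_0 - \mu_0)^2/(1 + n\tau_0^2)$, where $(n-1)s^2 = \sum_{i=1}^n (x_{0i} - \bar{x}_0)^2$ and, conditionally given $\theta$, $(n-1)s^2 \sim \chi^2(n-1)$ and $(n-1)s^2$ and $\bar{x}_0$ are independent. Now $n(\bar{x}_0 - \mu_0)^2/(1 + n\tau_0^2) \to (\theta_0 - \mu_0)^2/\tau_0^2$ almost surely and $G_n(y) - \Phi((y - n)/\sqrt{2n}) \to 0$ uniformly in $y$, by Theorem XVI.4.1 in Feller (1971). Hence $M(m(x) \le m(x_0)) - \{1 - \Phi(((n-1)s^2 - n)/\sqrt{2n})\} \to 0$ almost surely, where we have used the uniform continuity of $\Phi$. Since $((n-1)s^2 - n)/\sqrt{2n}$ converges in distribution to $N(0, 1)$, we have that $M(m(x) \le m(x_0))$ converges in distribution to the uniform distribution on $(0, 1)$. This limit is independent of the prior and of whether or not $\theta_0$ is in the tails of the prior. Therefore, this tail probability is not useful for checking for prior-data conflict.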
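The closed form in the location normal example can be checked numerically. The following is a minimal sketch, assuming unit sampling variance and a normal prior whose mean and standard deviation are labeled `mu0` and `tau0` here (these labels, and the function names, are illustrative and not taken from the paper). Under those assumptions the prior predictive of the sample mean is $N(\mu_0, \tau_0^2 + 1/n)$, so (1) reduces to a two-sided normal tail probability.

```python
import math
import random


def norm_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prior_predictive_check(xbar, n, mu0, tau0):
    """Tail probability (1) for the location normal model.

    Assumes x_1..x_n ~ N(theta, 1) and a N(mu0, tau0^2) prior (assumed labels).
    The prior predictive of the sample mean is N(mu0, tau0^2 + 1/n), and its
    density is at most its value at xbar exactly when |t - mu0| >= |xbar - mu0|.
    """
    sd = math.sqrt(tau0 ** 2 + 1.0 / n)
    return 2.0 * (1.0 - norm_cdf(abs(xbar - mu0) / sd))


def limiting_check(theta0, mu0, tau0):
    """Prior probability that the prior density is no larger than at theta0."""
    return 2.0 * (1.0 - norm_cdf(abs(theta0 - mu0) / tau0))


# Illustration: the true value sits three prior standard deviations out.
rng = random.Random(0)
theta0, mu0, tau0 = 3.0, 0.0, 1.0
for n in (10, 100, 10000):
    xbar = sum(rng.gauss(theta0, 1.0) for _ in range(n)) / n
    print(n, prior_predictive_check(xbar, n, mu0, tau0),
          limiting_check(theta0, mu0, tau0))
```

As $n$ grows, the first printed value should settle near the second, which is small here because the assumed true value lies far out in the tails of the assumed prior.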
While the potential ill effects of a prior-data conflict have long been recognized, it is not clear what one should do when one concludes that a conflict exists. One can note, however, that we have learned something of relevance and it seems only fair that an analyst report this. Also, we note that the situation is similar with model checking: it is not clear what we should do when we have a failure, and this does not excuse us from these checks. The typical response to model failure is that we must modify the model in some way, perhaps by enlarging the family of distributions. Similarly, when a prior-data conflict exists, our response can be to use a new prior that is less informative in the sense that we can expect fewer prior-data conflicts. We discuss this in Section 4.
A criticism of (1) is that, in the case of continuous models, (1) is not invariant under smooth transformations. For suppose that $W$ is 1-1 and smooth and let $J_W(T(x))$ be the reciprocal of the Jacobian determinant of $W$ evaluated at $T(x)$. Then $W(T)$ is also minimal sufficient and (1) applied to $W(T)$ gives the tail probability $M_T(m_T(t)J_W(t) \le m_T(T(x_0))J_W(T(x_0)))$, which is generally different from (1). This issue is avoided if we use the approach discussed in Evans and Jang (2010a) to get an invariant tail probability, in which $m_T$ is adjusted by a factor $J_T$ computed from the differential $dT$ of $T$. The factor $J_T$ corrects for volume distortions due to the transformation $T$. Note that whenever $T$ is linear, $J_T$ is constant and the invariant tail probability is the same as (1). This is the case for all but one of our examples. We state a relevant convergence result for this tail probability in Section 2.
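The lack of invariance can be seen in a small worked case. The sketch below assumes that the prior predictive of a statistic is $N(\mu_0, \sigma^2)$ and applies the transformation $\exp(\cdot)$; both the normal form and the choice of transformation are illustrative assumptions, not taken from the paper, and the invariant version of the check is not implemented here. The density of the transformed statistic at $\exp(t)$ is the original density times $\exp(-t)$, which is proportional to a normal density centred at $\mu_0 - \sigma^2$, so the two tail probabilities have closed forms.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def check_original(t0, mu0, sd):
    """Tail probability (1) for a statistic with N(mu0, sd^2) prior predictive."""
    return 2.0 * (1.0 - norm_cdf(abs(t0 - mu0) / sd))


def check_after_exp(t0, mu0, sd):
    """Tail probability (1) applied to exp(T) (an illustrative transformation).

    As a function of t, the density of exp(T) at exp(t) is proportional to a
    normal density centred at c = mu0 - sd^2. It is at most its value at t0
    exactly when |t - c| >= |t0 - c|, where t is still N(mu0, sd^2).
    """
    c = mu0 - sd ** 2
    d = abs(t0 - c)
    inside = norm_cdf((c + d - mu0) / sd) - norm_cdf((c - d - mu0) / sd)
    return 1.0 - inside


# Same observed value, two different answers.
print(check_original(0.0, 0.0, 1.0))
print(check_after_exp(0.0, 0.0, 1.0))
```

In this sketch an observed value at the centre of the prior predictive gives the largest possible tail probability before transformation and a noticeably smaller one after it, which is the kind of discrepancy the invariant tail probability is designed to remove.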
In Section 2, we provide theorems, with proofs in the Appendix, for the convergence of (1) to the tail probability $\Pi(\pi(\theta) \le \pi(\theta_0))$, where $\theta_0$ is the true value of the parameter; that is, (1) is a consistent assessment of whether or not the true value of the parameter is in the tails of the prior. In Section 3, we provide some applications. In Section 4, we discuss what one can do when a prior-data conflict is encountered.
Consistency of the check
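We consider the behavior of (1) as the amount of data grows. We have the following generalization of Example 1.
Theorem 1 Suppose $\Theta$ is open and (i) $T_n \to \theta$ a.s. $P_\theta$ for every $\theta \in \Theta$, (ii) $m_{T_n}(t) \to \pi(t)$ uniformly on compact subsets of $\Theta$, (iii) $\pi$ is continuous and the prior distribution of $\pi(\theta)$ has no atoms. Then $M_{T_n}(m_{T_n}(t) \le m_{T_n}(T_n(x_0))) \to \Pi(\pi(\theta) \le \pi(\theta_0))$ a.s. $P_{\theta_0}$.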
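The convergence can be illustrated by simulation in a case where the prior predictive has no simple closed form. The sketch below assumes a $N(\theta, 1)$ sampling model with the sample mean as the statistic and a standard Laplace prior; the prior, the quadrature grid, and all function names are illustrative assumptions, not taken from the paper. The prior predictive density is computed by numerical integration, the tail probability (1) is estimated by Monte Carlo, and the limiting value under this prior is $\exp(-|\theta_0|)$.

```python
import math
import random

# Quadrature nodes z_j on [-6, 6]; each weight is the trapezoid weight
# multiplied by the N(0, 1) density at the node.
_STEP = 0.1
_NODES = [-6.0 + _STEP * j for j in range(121)]
_WEIGHTS = [
    _STEP * (0.5 if j in (0, 120) else 1.0)
    * math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    for j, z in enumerate(_NODES)
]


def laplace_density(theta):
    """Assumed prior density: standard Laplace (double exponential)."""
    return 0.5 * math.exp(-abs(theta))


def prior_predictive_density(t, n):
    """m_T(t) = integral of phi(z) * pi(t - z / sqrt(n)) dz, T = sample mean."""
    scale = 1.0 / math.sqrt(n)
    return sum(w * laplace_density(t - z * scale)
               for w, z in zip(_WEIGHTS, _NODES))


def mc_check(t0, n, draws=4000, seed=0):
    """Monte Carlo estimate of the tail probability (1) at observed value t0."""
    rng = random.Random(seed)
    m0 = prior_predictive_density(t0, n)
    hits = 0
    for _ in range(draws):
        # theta ~ Laplace(0, 1): an exponential magnitude with a random sign.
        theta = rng.expovariate(1.0) * rng.choice((-1.0, 1.0))
        t = rng.gauss(theta, 1.0 / math.sqrt(n))  # sample mean given theta
        if prior_predictive_density(t, n) <= m0:
            hits += 1
    return hits / draws


def limit_value(theta0):
    """Prior probability that the Laplace density is at most its value at theta0."""
    return math.exp(-abs(theta0))
```

Calling `mc_check(2.0, 400)` and `limit_value(2.0)` should give similar values, up to Monte Carlo error, since with 400 observations the sample mean is treated here as sitting at the assumed true value 2.0.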
Examples
For these examples the details associated with establishing Theorem 1(ii) are similar to the proof of Theorem 2 and can be found in Evans and Jang (2010b).
Example 2 Scale-Gamma
Let $x = (x_1, \ldots, x_n)$ be a sample from a Gamma distribution where the scale parameter $\theta$ is unknown. Then a suitably scaled sample mean $T_n$ is minimal sufficient and $T_n \to \theta$ a.s. When the prior density $\pi$ satisfies Theorem 1(iii), then Theorem 1(ii) holds and Theorem 1 applies.
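A scale-Gamma check of this kind can be sketched as follows. The sketch assumes a known shape parameter, takes the statistic to be the sum of the observations divided by `k` (the sample size times the shape), and uses an inverse gamma prior with parameters `alpha` and `beta`; the prior family and all names are illustrative assumptions, chosen because the prior predictive density then has a closed form. The function returns Monte Carlo estimates of both (1) and the limiting prior tail probability, with the observed value `t0` standing in for the true scale in the latter.

```python
import math
import random


def log_prior(theta, alpha, beta):
    """Log density of the assumed inverse gamma(alpha, beta) prior on the scale."""
    return (alpha * math.log(beta) - math.lgamma(alpha)
            - (alpha + 1.0) * math.log(theta) - beta / theta)


def log_prior_predictive(t, k, alpha, beta):
    """Log of m_T(t) for T = (sum of x_i) / k, with k = n * shape.

    Given theta the sum is Gamma(k, scale theta); integrating theta out
    against the inverse gamma prior gives this closed form.
    """
    s = k * t
    return (math.log(k) + (k - 1.0) * math.log(s) + alpha * math.log(beta)
            + math.lgamma(k + alpha) - math.lgamma(k) - math.lgamma(alpha)
            - (k + alpha) * math.log(s + beta))


def mc_check_and_limit(t0, k, alpha, beta, draws=5000, seed=0):
    """Monte Carlo estimates of (1) and of the prior tail probability at t0."""
    rng = random.Random(seed)
    lm0 = log_prior_predictive(t0, k, alpha, beta)
    lp0 = log_prior(t0, alpha, beta)
    check_hits = 0
    limit_hits = 0
    for _ in range(draws):
        # 1/theta ~ Gamma(shape alpha, scale 1/beta), so theta is inverse gamma.
        theta = 1.0 / rng.gammavariate(alpha, 1.0 / beta)
        t = rng.gammavariate(k, theta) / k  # statistic given theta
        if log_prior_predictive(t, k, alpha, beta) <= lm0:
            check_hits += 1
        if log_prior(theta, alpha, beta) <= lp0:
            limit_hits += 1
    return check_hits / draws, limit_hits / draws
```

For example, `mc_check_and_limit(2.5, 400, 3.0, 2.0)` returns a pair of estimates that should be close to each other when `k` is large, in line with Theorem 1.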
The following example uses Example 2 in a problem of considerable
Resolving a prior-data conflict
There are several possible courses of action when we find that a given prior is in conflict with the data. First we note that, as we increase the amount of data it is typical that the effect of the prior disappears. So even though a prior-data conflict may exist, it may be that we can ignore it as the prior has little effect on the analysis. Diagnostics for assessing this are discussed in Evans and Moshonov (2006) and these involve comparing posterior inferences under the prior with those under
Acknowledgements
The authors thank the Editor and referees for some helpful comments.
References (13)
- et al. On the development of the reference prior method.
- et al. The formal definition of reference priors. Ann. Statist. (2009).
- Sampling and Bayes’ inference in scientific modelling and robustness. J. R. Stat. Soc. Ser. A (1980).
- Evans, M., Jang, G.H., 2009. Weak informativity and the information in one prior relative to another. Tech. Rep. No....
- et al. Invariant P-values for model checking. Ann. Statist. (2010).
- Evans, M., Jang, G.H., 2010b. A limit result for the prior predictive. Tech. Rep. No. 1004. Department of Statistics,...