doi:10.1016/j.csda.2005.05.008
Copyright © 2006 Elsevier B.V. All rights reserved.
Flexible distributions for triple-goal estimates in two-stage hierarchical models
aRAND Corporation, 1776 Main Street, Santa Monica, CA 90401, USA
bDepartment of Biostatistics, Johns Hopkins University, 615 N. Wolfe Street, Baltimore, MD 21205, USA
Received 7 February 2005;
revised 25 May 2005;
accepted 26 May 2005.
Available online 29 June 2005.
Abstract
Performance evaluations often aim to achieve goals such as obtaining estimates of unit-specific means, ranks, and the distribution of unit-specific parameters. The Bayesian approach provides a powerful way to structure models for achieving these goals. While no single estimate can be optimal for achieving all three inferential goals, the communication and credibility of results will be enhanced by reporting a single estimate that performs well for all three. Triple-goal estimates [Shen and Louis, 1998. Triple-goal estimates in two-stage hierarchical models. J. Roy. Statist. Soc. Ser. B 60, 455–471] have this property and are appealing for performance evaluations. Because triple-goal estimates rely more heavily on the entire distribution than do posterior means, they are more sensitive to misspecification of the population distribution. We present various strategies to robustify triple-goal estimates by using nonparametric distributions. We evaluate performance based on the correctness and efficiency of the robustified estimates under several scenarios and compare empirical Bayes and fully Bayesian approaches to modeling the population distribution. We find that when data are quite informative, conclusions are robust to model misspecification. However, with less information in the data, conclusions can be quite sensitive to the choice of population distribution. Generally, use of a nonparametric distribution costs very little in efficiency when a parametric population distribution is valid, but successfully protects against model misspecification.
Keywords: Bayesian statistics; League tables; Nonparametrics; Percentiles; Ranking; Robustness
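The triple-goal (GR) estimates named in the abstract can be illustrated with a minimal sketch. It follows the construction as commonly described for Shen and Louis (1998): estimate G by averaging the unit-specific posterior distributions, rank the units by their posterior expected ranks, and assign unit k the (2R_k − 1)/(2K) quantile of the estimated G. The sketch assumes posterior draws are already available as an S × K array; the function name and the use of pooled draws to approximate the estimated G are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def triple_goal_estimates(draws):
    """Sketch of GR estimates from posterior draws (hypothetical helper).

    draws: array of shape (S, K), S posterior draws for each of K units.
    Returns (gr_estimates, integer_ranks, expected_ranks).
    """
    draws = np.asarray(draws, dtype=float)
    n_draws, n_units = draws.shape

    # Rank the units within each draw (1 = smallest), then average over
    # draws to approximate the posterior expected rank of each unit.
    ranks_per_draw = draws.argsort(axis=1).argsort(axis=1) + 1
    expected_ranks = ranks_per_draw.mean(axis=0)

    # Integer ranks are obtained by ranking the expected ranks.
    integer_ranks = expected_ranks.argsort().argsort() + 1

    # Pooling all draws approximates the average of the K posterior
    # distributions; read off its quantile at (2R - 1) / (2K).
    probs = (2 * integer_ranks - 1) / (2 * n_units)
    gr_estimates = np.quantile(draws.ravel(), probs)
    return gr_estimates, integer_ranks, expected_ranks
```

Because the GR estimates are quantiles of the estimated G rather than shrunken means, their spread is typically wider than that of the posterior means, which is why they depend more heavily on the assumed population distribution.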
Fig. 1. Means and 95% posterior probability intervals of the ratio of posterior predictive probabilities placed on the EDF versus G0 under (a) DP-1 and (b) DP-2, given various data-generating scenarios, which are denoted beneath each boxplot: the true distribution (Gaussian, T5, or a bimodal mixture), (denoted by gm), and rls. Note: figures drawn on different scales.
Fig. 2. Scaled θ-EDF estimates using PM, ML, and GR when the data-analytic and data-generating distributions, G, are Gaussian.
Fig. 3. Scaled EDF estimates when true G is a mixture of two Gaussians. First row: gm=0.1, rls=1. Second row: gm=1, rls=1. Third row: gm=1, rls=25. Each column corresponds to the assumed model (column 1: Gaussian; column 2: T5; column 3: DP-1; column 4: DP-2; column 5: SBR).
Fig. 4. GR estimates derived under DP versus Gaussian models for G for (a) the full sample and (b) the subset of majority students only.
Fig. 5. Empirical distribution of (a) observed school-level average math achievement scores; (b) GR estimates derived under a Gaussian distribution for θj; (c) GR estimates derived under a Dirichlet process model for G for the full sample. (d)–(f) are the analogous figures for the analysis of the subset of nonminority cases.
Table 1.
Comparison of ML, PM, and GR for estimating θ's and G when the data-generating and data-analytic distributions, G, agree

Part (a) reports 10 000×SEL for the ML estimate of the θk's; the SELs for PM and GR are expressed as a percentage of the ML SEL. Part (b) reports ISEL for the ML estimate of G; the ISELs for PM and GR are expressed as a percentage of the ML ISEL.
Table 2.
GR estimates derived under various data-analytic population distributions for (a) θs, (b) G, and (c) 10th and (d) 25th percentiles of G when the data-generating distribution is a standard Gaussian (which is asterisked)
a The upper quantiles equal the lower quantiles by symmetry.
Table 3.
GR estimates derived under various data-analytic population distributions for (a) θs, (b) G, and (c) and (d) percentiles of G when the data-generating distribution is a T5 (which is asterisked)
a The upper quantiles equal the lower quantiles by symmetry.
Table 4.
GR estimates derived under various data analysis choices for G for (a) θ under squared-error loss (SEL) and (b) G under integrated squared-error loss (ISEL), when the data-generating distribution is a bimodal mixture of two Gaussians

Table 5.
Percentile estimates of G using GR estimates derived under various data analysis choices when the data-generating distribution is a bimodal mixture of two Gaussians
