Abstract
In this paper we describe a statistical analysis of the inter-laboratory data summarized in Rosati et al. (2008) to assess the performance of an analytical method to detect the presence of dust from the collapse of the World Trade Center (WTC) on September 11, 2001. The focus of the inter-lab study was the measurement of the concentration of slag wool fibers in dust, which was considered to be an indicator of WTC dust. Eight labs were each provided with two blinded samples from each of three batches of dust that varied in slag wool concentration. Analysis of the data revealed that three of the labs, which did not meet measurement quality objectives set forth prior to the experimental work, were statistically distinguishable from the five labs that did meet the quality objectives. The five labs, as a group, demonstrated better measurement capability, although their ability to distinguish between the batches was somewhat mixed. This work provides important insights for the planning and implementation of future studies involving examination of dust samples for physical contaminants. It demonstrates (a) the importance of controlling the amount of dust analyzed, (b) the need to take additional replicates to improve count estimates, and (c) the need to address issues related to the execution of the analytical methodology to ensure all labs meet the measurement quality objectives.
References
Caulcutt, R., & Boddy, R. (1983). Statistics for Analytical Chemists. London: Chapman and Hall.
International Standard. (2002). ISO 14966: Ambient Air—Determination of Numerical Concentration of Inorganic Fibrous Particles—Scanning Electron Microscopy Method. Geneva: International Organization for Standardization.
Meeker, G., Bern, A., Lowers, H., & Brownfield, I. (2005). Determination of a diagnostic signature for World Trade Center dust using scanning electron microscopy point counting techniques. U.S. Geological Survey Open File Report 2005–1031. http://pubs.usgs.gov/of/2005/1031/.
Pleil, J. D., Vette, A. F., Johnson, B. A., & Rappaport, S. M. (2004). Air levels of carcinogenic polycyclic aromatic hydrocarbons after the World Trade Center disaster. Proceedings of the National Academy of Sciences, 101, 11685–11688.
Rosati, J. A., Bern, A. M., Willis, R. D., Blanchard, F. T., Conner, T. L., Kahn, H. D., et al. (2008). Multi-laboratory testing of a screening method for world trade center (WTC) collapse dust. Science of the Total Environment, 390, 514–519.
Scheffe, H. (1959). The Analysis of Variance. New York: Wiley.
U.S. EPA. (2005). Final report on the World Trade Center (WTC) dust screening method study, EPA/R-05/086. United States Environmental Protection Agency, Office of Research and Development, RTP, NC. http://www.epa.gov/wtc/panel/pdfs/Final_Report_August_17_2005.pdf.
Acknowledgments
The authors are grateful to Barry Nussbaum, Dennis Santella, and Paul White for helpful comments and suggestions.
Additional information
The views expressed in this paper are those of the authors and do not necessarily reflect the views or policies of the US Environmental Protection Agency.
Appendices
Appendix 1: Scheffe contrasts for the slag wool inter-lab data
The method of Scheffe contrasts provides a general and conservative basis for comparing combinations of means in the analysis of variance. The method can be applied to any linear combination of factor means in which the coefficients sum to zero (Scheffe 1959). This procedure is well suited to comparing the data from the group of five labs (A, B, C, D, and H) with the data from the group of three labs (E, F, and G). Normal probability plots of the residuals from the analysis of variance model for each batch, shown in Fig. 3, support the assumption of approximate normality of the error term.
Scheffe contrast to compare the mean of group 1 labs A, B, C, D, and H to the mean of group 2 labs E, F, and G:
Define the contrast as
$$ L = \sum_{i=1}^{8} c_i \mu_i $$
where
- \( \mu_i \) = true mean for the ith lab
- \( c_i \) = 1/5 for i = 1, 2, 3, 4, and 8 (labs A, B, C, D, and H)
- \( c_i \) = −1/3 for i = 5, 6, and 7 (labs E, F, and G)
The estimated contrast is then
$$ \hat{L} = \sum_{i=1}^{8} c_i \bar{Y}_i $$
where \( \bar{Y}_i \) = observed mean for the ith lab.
The confidence interval for \( L \), centered on the estimated contrast \( \hat{L} \), is
$$ \hat{L} \pm f\, s \sqrt{\sum_{i=1}^{k} \frac{c_i^2}{n_i}} $$
where
- \( f \) = \( \sqrt{(k - 1)\, f_{k - 1,n - k,1 - \alpha}} \), the Scheffe multiplier
- \( s \) = residual standard error from the analysis of variance
- \( k \) = the number of labs
- \( n \) = total number of determinations at all labs
- \( n_i \) = number of determinations at the ith lab
- \( f_{k - 1,n - k,1 - \alpha} \) = (1 − α) quantile of an \( F(k - 1,n - k) \) distribution
In this set-up for the slag wool inter-lab data, if the calculated confidence interval does not include zero, then the test provides evidence at the (1 − α) confidence level to reject the hypothesis that the group means are equal. For each of the slag wool batches, the indication is that the mean of the group 1 labs is greater than the mean of the group 2 labs at the 95% confidence level (α = 0.05). This is consistent with the graphical evidence, which shows that the group 2 labs (1) report generally lower values for all batch levels and (2) report values that do not discriminate well among the different batches.
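The calculation described in this appendix can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the function name, the lab means, the residual standard error, and the replicate counts are hypothetical, and the F quantile is passed in as a value looked up from a table or statistics package. The sketch uses the standard Scheffe multiplier, the square root of (k − 1) times the F quantile.

```python
import math


def scheffe_interval(lab_means, lab_counts, coeffs, resid_se, f_quantile):
    """Scheffe confidence interval for a contrast of lab means.

    f_quantile is the (1 - alpha) quantile of F(k - 1, n - k), supplied
    by the caller (from a table or a statistics package).
    """
    k = len(lab_means)
    if abs(sum(coeffs)) > 1e-9:
        raise ValueError("contrast coefficients must sum to zero")
    # Estimated contrast: sum of c_i times the observed lab mean.
    estimate = sum(c * y for c, y in zip(coeffs, lab_means))
    # Standard error of the estimated contrast.
    se = resid_se * math.sqrt(sum(c * c / n for c, n in zip(coeffs, lab_counts)))
    # Standard Scheffe multiplier: sqrt((k - 1) * F quantile).
    half_width = math.sqrt((k - 1) * f_quantile) * se
    return estimate - half_width, estimate + half_width


# Labs in the order A..H; group 1 = A, B, C, D, H and group 2 = E, F, G.
coeffs = [1 / 5, 1 / 5, 1 / 5, 1 / 5, -1 / 3, -1 / 3, -1 / 3, 1 / 5]

# Hypothetical numbers for illustration only; these are NOT the study data.
hypothetical_means = [100.0, 110.0, 90.0, 105.0, 40.0, 50.0, 45.0, 95.0]
hypothetical_counts = [2] * 8   # two determinations per lab
hypothetical_s = 10.0           # residual standard error
f_95 = 3.50                     # approximate table value for F(7, 8), 0.95 quantile

lower, upper = scheffe_interval(
    hypothetical_means, hypothetical_counts, coeffs, hypothetical_s, f_95
)
print(f"Scheffe interval for L: ({lower:.2f}, {upper:.2f})")
print("group means differ" if lower > 0 or upper < 0 else "no evidence of a difference")
```

An interval that lies entirely above zero, as in this made-up example, corresponds to the conclusion that the group 1 mean exceeds the group 2 mean.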
Appendix 2: Confidence intervals for the batch means
The results of the one-way ANOVAs by batch using the data from the group 1 labs were used to calculate confidence intervals for the batch means following the procedure described by Caulcutt and Boddy (1983), page 137. The results are shown in Table 3, above. The procedure is summarized below.
Confidence interval for the true mean of the ith batch:
$$ \bar{X}_i \pm t \sqrt{\frac{\hat{\sigma}_b^2}{k} + \frac{\hat{\sigma}_w^2}{n}} $$
where
- \( \bar{X}_i \) = overall mean for the ith batch
- \( \hat{\sigma}_b^2 \) = estimated between-lab variance,
$$ \hat{\sigma}_b^2 = \frac{\text{between-lab mean square} - \text{within-lab mean square}}{\text{number of determinations per lab}} $$
- \( \hat{\sigma}_w^2 \) = estimated within-lab variance (residual mean square)
- t = t statistic with α/2 = 0.025 and appropriate degrees of freedom
- k = number of labs
- n = total number of determinations
Degrees of freedom for t:
where
- ≈ = approximately equal
- k = number of labs
- WLMS = within-lab mean square
- BLMS = between-lab mean square
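The variance-component step and the resulting interval can be sketched as follows. This is a minimal illustration under stated assumptions: the interval is taken to have the usual form of the batch mean plus or minus t times the square root of (between-lab variance / k + within-lab variance / n), the design is balanced with m determinations per lab, and the t quantile is supplied by the caller because the degrees-of-freedom approximation is not reproduced here. The function names and all numbers are hypothetical.

```python
import math


def variance_components(blms, wlms, m):
    """Between- and within-lab variance estimates from one-way ANOVA mean squares.

    blms: between-lab mean square, wlms: within-lab mean square,
    m: number of determinations per lab (balanced design assumed).
    """
    sigma_w2 = wlms
    sigma_b2 = (blms - wlms) / m
    return sigma_b2, sigma_w2


def batch_mean_interval(batch_mean, blms, wlms, k, m, t_quantile):
    """Confidence interval for a batch mean from k labs with m determinations each.

    t_quantile is the t value for alpha/2 = 0.025 at the appropriate degrees
    of freedom, supplied by the caller.
    """
    sigma_b2, sigma_w2 = variance_components(blms, wlms, m)
    n = k * m  # total number of determinations
    half_width = t_quantile * math.sqrt(sigma_b2 / k + sigma_w2 / n)
    return batch_mean - half_width, batch_mean + half_width


# Hypothetical mean squares and batch mean; these are NOT the study values.
lower, upper = batch_mean_interval(
    batch_mean=1000.0, blms=500.0, wlms=100.0, k=5, m=2, t_quantile=2.776
)
print(f"Interval for the batch mean: ({lower:.1f}, {upper:.1f})")
```

In a balanced design the quantity under the square root reduces to BLMS / n, which is a convenient check on the arithmetic.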
Appendix 3: Reproducibility computation
The computational formula for reproducibility R, given by Caulcutt and Boddy (1983), page 143, is
$$ R = t \sqrt{2\left( \hat{\sigma}_w^2 + \hat{\sigma}_b^2 \right)} $$
where
- \( \hat{\sigma}_w^2 \) = estimated within-lab variance
- \( \hat{\sigma}_b^2 \) = estimated between-lab variance
- t = t statistic with α/2 = 0.025 and appropriate degrees of freedom
The values for \( \hat{\sigma }_w^2 \), \( \hat{\sigma }_b^2 \), and t are computed as described in Appendix 2, above. From the 10% batch ANOVA using the data from group 1, the values are
\( R = 40{,}285 \) fibers/gram.
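The reproducibility calculation can be sketched as a one-line function. This is a minimal illustration that assumes the usual form of the reproducibility limit, t times the square root of twice the sum of the within-lab and between-lab variances. The function name and the input values are hypothetical and are not the variance estimates from the 10% batch ANOVA.

```python
import math


def reproducibility(t_quantile, sigma_w2, sigma_b2):
    """Reproducibility R from within- and between-lab variance estimates.

    Assumes R = t * sqrt(2 * (sigma_w^2 + sigma_b^2)), where t_quantile is
    the t value for alpha/2 = 0.025 at the appropriate degrees of freedom.
    """
    return t_quantile * math.sqrt(2.0 * (sigma_w2 + sigma_b2))


# Hypothetical inputs for illustration only; NOT the study estimates.
r_value = reproducibility(t_quantile=2.0, sigma_w2=9.0, sigma_b2=16.0)
print(f"R = {r_value:.3f}")
```

R is read as the largest difference expected, at the stated confidence level, between single determinations made on the same material by two different labs.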
Cite this article
Kahn, H.D., Rosati, J.A. & Bray, A.P. Statistical evaluation of data from multi-laboratory testing of a measurement method intended to indicate the presence of dust resulting from the collapse of the World Trade Center. Environ Monit Assess 184, 6367–6375 (2012). https://doi.org/10.1007/s10661-011-2426-7