Abstract
The analysis of reference materials (RMs) can help assess the equivalence of chemical measurement processes. When two or more RMs are available for a given measurand, confidently establishing the equivalence of measurement processes requires the RMs to be capable of yielding equivalent results. Evaluating the degrees of equivalence among RMs that differ in analyte quantity and perhaps matrix composition requires an approach other than that used to assess results for samples of a single material. We have more than a decade of experience with an approach that compares the assigned values of RMs to a simple linear model of the relationship between those values and measurement results ideally made under repeatability conditions. In addition to assessing the metrological equivalence of specific RMs, the equivalence of the value-assignment capabilities of the organizations that issue the RMs can also be assessed. This report summarizes our experience with the design and analysis of studies using this approach and provides numeric and graphical tools for estimating degrees of equivalence. We divide the required tasks into four steps: (1) design, (2) measurement, (3) definition of a reference function, and (4) estimation of degrees of equivalence. We regard the experimental design and measurement tasks as most critical to the eventual utility of the comparison, since creative mathematics cannot fully compensate for poor planning or erratic measurements.
Abbreviations
- ANOVA: Analysis of variance
- CCQM: Consultative Committee for Amount of Substance: Metrology in Chemistry
- CIPM MRA: Comité International des Poids et Mesures Mutual Recognition Arrangement
- ESM: Electronic supplementary material
- GAWG: Gas Analysis Working Group
- GDR: Generalized distance regression
- GUM: Guide to the Expression of Uncertainty in Measurement
- LAI: Leave all in
- LOO: Leave one out
- MCMC: Markov chain Monte Carlo
- NMI: National metrology institute
- OAWG: Organic Analysis Working Group
- OLS: Ordinary least squares
- PBMC: Parametric bootstrap Monte Carlo
- PSM: Primary standard gas mixture
- RF: Reference function
- RM: Reference material
- MAX(.): Maximum value in specified set of values
- N(μ, σ^2): Gaussian distribution of specified mean and variance
- PTILE(p, q): The p percentile of the set of q values
- SGN(.): Signum, sign of a value expressed as −1 or +1
- t_s(p, ν): Student’s t expansion factor for a specified coverage probability and number of degrees of freedom
- ∪: Union of specified sets of values
- ~: When to the left of a mathematical expression, indicates “is distributed as”
- ˆ: When above a symbol, indicates an estimated quantity
- {}: A defined set of values
- α: Intercept
- β: Slope
- ε_i: Residual difference for the ith material
- Ε: Residual random error
- γ_{ij}: Between-unit differences
- δ_{ijk}: Within-unit differences
- μ: Mean
- ρ: Correlation
- σ: Standard deviation
- σ_{a,i}: Between-aliquot imprecision for the ith material
- σ_{c,i}: Between-campaign imprecision for the ith material, also the confounded between-campaign and between-unit imprecision when one unit is evaluated in each campaign
- σ_{r,i}: Repeatability imprecision for the ith material
- σ_{u,i}: Between-unit imprecision for the ith material
- ν: Degrees of freedom
- d_i: Degree of equivalence for the ith result for nominally identical samples of one material
- d_i^{abs}: Degree of equivalence for the ith material as an absolute value
- d_i^{rel}: Degree of equivalence for the ith material as a percentage of the value
- D^{abs}: Degree of equivalence for one organization as an absolute value
- D^{rel}: Degree of equivalence for one organization as a relative value
- i: Index over materials
- j: Index over campaigns and/or units
- k: Index over replicates or aliquots
- l: Index over replicates
- k_p: Coverage factor providing a p % level of confidence
- n: Number
- n_a: Number of aliquots of each unit
- n_c: Number of measurement campaigns
- n_m: Number of materials
- n_{MC}: Number of PBMC analyses
- n_r: Number of replicates of each aliquot or unit
- n_u: Number of units of each material
- p: Probability expressed as a percentage (i.e., on the range 0 to 100)
- q: A set of PBMC estimates of a given quantity
- Q: A “best estimate” of a given quantity
- R: Generic representation of “instrument response”
- R_i: Instrument response for the ith material
- R_{ij}: Instrument response for the jth unit of the ith material
- R_{ijk}: Instrument response for the kth replicate of the jth unit of the ith material
- R_{ijkl}: Instrument response for the lth replicate of the kth aliquot of the jth unit of the ith material
- u: Standard uncertainty
- u_∞: “Large sample” standard uncertainty
- U_p: One-half of a p % level of confidence symmetric coverage interval
- −U_p: Lower bound of a p % level of confidence asymmetric coverage interval
- +U_p: Upper bound of a p % level of confidence asymmetric coverage interval
- V: Generic representation of “assigned value”
- V_i: Assigned value for the ith material
- x_i: Participant-reported measurement result for a given study material
- x_{ref}: Reference value for a comparison of results on nominally identical samples of one material
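As an illustration of how several of the symbols listed above fit together, the following Python sketch fits a linear reference function V = α + βR, computes per-material degrees of equivalence, and derives an asymmetric coverage interval from PBMC percentiles. It is a minimal sketch under stated assumptions, not the procedure used in this report: ordinary least squares stands in for generalized distance regression, the sign convention (assigned value minus predicted value) is assumed, SGN(0) is arbitrarily mapped to +1, PTILE uses linear interpolation, and all function names and example data are hypothetical.

```python
import random


def sgn(x):
    """SGN(.): sign as -1 or +1 (zero mapped to +1; an assumption)."""
    return -1 if x < 0 else 1


def ptile(p, q):
    """PTILE(p, q): p percentile (0 to 100) of the values in q.

    Linear interpolation between order statistics is an assumed convention.
    """
    s = sorted(q)
    if not s:
        raise ValueError("q must contain at least one value")
    pos = (p / 100.0) * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def fit_reference_function(r, v):
    """OLS fit of V = alpha + beta * R (a simple stand-in for GDR)."""
    n = len(r)
    r_bar = sum(r) / n
    v_bar = sum(v) / n
    s_rr = sum((ri - r_bar) ** 2 for ri in r)
    s_rv = sum((ri - r_bar) * (vi - v_bar) for ri, vi in zip(r, v))
    beta = s_rv / s_rr
    alpha = v_bar - beta * r_bar
    return alpha, beta


def degrees_of_equivalence(r, v):
    """Return (d_abs, d_rel); d_rel is a percentage of the assigned value."""
    alpha, beta = fit_reference_function(r, v)
    # Assumed sign convention: assigned value minus reference-function value.
    d_abs = [vi - (alpha + beta * ri) for ri, vi in zip(r, v)]
    d_rel = [100.0 * d / vi for d, vi in zip(d_abs, v)]
    return d_abs, d_rel


def pbmc_intervals(r, u_r, v, u_v, n_mc=2000, p=95, seed=1):
    """Percentile coverage intervals for each d_abs from n_mc PBMC draws.

    Each draw resamples R_i ~ N(r_i, u_r_i^2) and V_i ~ N(v_i, u_v_i^2),
    treating all inputs as independent (an assumption).
    """
    rng = random.Random(seed)
    draws = [[] for _ in v]
    for _ in range(n_mc):
        r_s = [rng.gauss(m, s) for m, s in zip(r, u_r)]
        v_s = [rng.gauss(m, s) for m, s in zip(v, u_v)]
        d_abs, _ = degrees_of_equivalence(r_s, v_s)
        for i, d in enumerate(d_abs):
            draws[i].append(d)
    tail = (100.0 - p) / 2.0
    return [(ptile(tail, q), ptile(100.0 - tail, q)) for q in draws]


if __name__ == "__main__":
    # Hypothetical data for four materials (not taken from any study).
    responses = [1.02, 1.98, 3.05, 3.97]
    assigned = [10.1, 20.0, 30.2, 39.8]
    d_abs, d_rel = degrees_of_equivalence(responses, assigned)
    bounds = pbmc_intervals(responses, [0.02] * 4, assigned, [0.1] * 4)
    for i, (d, rel, (lo, hi)) in enumerate(zip(d_abs, d_rel, bounds), 1):
        print(f"material {i}: d_abs={d:+.3f}  d_rel={rel:+.2f}%  "
              f"95% interval=({lo:+.3f}, {hi:+.3f})")
```

Because the interval bounds are taken directly from percentiles of the PBMC draws, they need not be symmetric about the point estimate, which is the situation the −U_p and +U_p entries describe.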
Acknowledgments
We thank Johanna E. Camara (NIST, Gaithersburg) and Rosemarie Phillips (BAM, DE) for their insights and experience in the practicalities of CRM comparison measurements; Wolfram Bremser (BAM, DE) and Stephen L.R. Ellison (LGC Limited, UK) for helpful discussions on approaches to analyzing CRM comparison results; Jolene D. Splett (NIST, Boulder) for her expertise with SAS and the interpretation of its results; Chih-Ming Wang (NIST, Boulder) for his careful review (and correction) of statistical concepts and notation; Katherine E. Sharpless (NIST, Gaithersburg) for her efforts toward making this document more accessible; the anonymous reviewers who thoughtfully critiqued the original draft of this report; and this Journal’s editorial staff for their insightful questions and gentle corrections.
Additional information
Disclaimer Certain commercial software is identified in this report to specify the experimental procedure as completely as possible. In no case does such identification imply a recommendation or endorsement by the National Institute of Standards and Technology nor does it imply that the software is necessarily the best available for the purpose.
Cite this article
Duewer, D.L., Gasca-Aragon, H., Lippa, K.A. et al. Experimental design and data evaluation considerations for comparisons of reference materials. Accred Qual Assur 17, 567–588 (2012). https://doi.org/10.1007/s00769-012-0920-4
DOI: https://doi.org/10.1007/s00769-012-0920-4