Experimental design and data evaluation considerations for comparisons of reference materials

  • General Paper
Accreditation and Quality Assurance

Abstract

The analysis of reference materials (RMs) can help assess the equivalence of chemical measurement processes. When two or more RMs are available for a given measurand, confidently establishing the equivalence of measurement processes requires the RMs to be capable of yielding equivalent results. Evaluating the degrees of equivalence among RMs that differ in analyte quantity and perhaps matrix composition requires an approach other than that used to assess results for samples of a single material. We have more than a decade of experience with an approach that compares the assigned values of RMs to a simple linear model of the relationship between those values and measurement results, ideally made under repeatability conditions. In addition to assessing the metrological equivalence of specific RMs, the equivalence of the value-assignment capabilities of the organizations that issue the RMs can also be assessed. This report summarizes our experience with the design and analysis of studies using this approach and provides numeric and graphical tools for estimating degrees of equivalence. We divide the required tasks into four steps: (1) design, (2) measurement, (3) definition of a reference function, and (4) estimation of degrees of equivalence. We regard the experimental design and measurement tasks as most critical to the eventual utility of the comparison, since creative mathematics cannot fully compensate for poor planning or erratic measurements.
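
As a rough illustration of steps (3) and (4), the following minimal sketch fits a straight-line reference function to assigned values and instrument responses, then estimates a degree of equivalence for each material. Several things here are assumptions rather than the report's own procedure: ordinary least squares stands in for whatever regression a real study would use, the degree of equivalence is taken as the assigned value minus the value predicted from the fitted line, and the function names are hypothetical.

```python
def fit_reference_function(values, responses):
    """Fit R = alpha + beta * V by ordinary least squares.

    values:    assigned values V_i, one per material
    responses: instrument responses R_i, one per material
    Returns (alpha, beta). OLS is an illustrative simplification.
    """
    n = len(values)
    if n < 2 or n != len(responses):
        raise ValueError("need at least two paired (V, R) observations")
    v_mean = sum(values) / n
    r_mean = sum(responses) / n
    sxx = sum((v - v_mean) ** 2 for v in values)
    sxy = sum((v - v_mean) * (r - r_mean) for v, r in zip(values, responses))
    if sxx == 0:
        raise ValueError("assigned values must not all be identical")
    beta = sxy / sxx
    alpha = r_mean - beta * v_mean
    return alpha, beta


def degrees_of_equivalence(values, responses):
    """Return a list of (d_abs, d_rel) pairs, one per material.

    Assumed definitions (not taken from the report):
      d_abs = V_i - V_hat_i, with V_hat_i = (R_i - alpha) / beta
      d_rel = 100 * d_abs / V_i
    """
    alpha, beta = fit_reference_function(values, responses)
    if beta == 0:
        raise ValueError("zero slope: responses carry no information on V")
    results = []
    for v, r in zip(values, responses):
        v_hat = (r - alpha) / beta        # value predicted by the line
        d_abs = v - v_hat                 # absolute degree of equivalence
        d_rel = 100.0 * d_abs / v         # same, as a percentage of V_i
        results.append((d_abs, d_rel))
    return results
```

A material whose assigned value lies exactly on the fitted line gets a degree of equivalence of zero. The sketch reports no uncertainties, which a real comparison would need in order to judge whether a nonzero value is meaningful.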

Abbreviations

ANOVA: Analysis of variance
CCQM: Consultative Committee for Amount of Substance—Metrology in Chemistry
CIPM MRA: Comité International des Poids et Mesures Mutual Recognition Arrangement
ESM: Electronic supplementary material
GAWG: Gas Analysis Working Group
GDR: Generalized distance regression
GUM: Guide to the Expression of Uncertainty in Measurement
LAI: Leave all in
LOO: Leave one out
MCMC: Markov chain Monte Carlo
NMI: National metrology institute
OAWG: Organic Analysis Working Group
OLS: Ordinary least squares
PBMC: Parametric bootstrap Monte Carlo
PSM: Primary standard gas mixture
RF: Reference function
RM: Reference material
MAX(.): Maximum value in specified set of values
N(μ, σ²): Gaussian distribution of specified mean and variance
PTILE(p, q): The p percentile of the set of q values
SGN(.): Signum, sign of a value expressed as −1 or +1
t s(p, ν): Student’s t expansion factor for a specified coverage probability and number of degrees of freedom
∪: Union of specified sets of values
~: When to the left of a mathematical expression, indicates “is distributed as”
ˆ: When above a symbol, indicates an estimated quantity
{}: A defined set of values
α: Intercept
β: Slope
ε i: Residual difference for the ith material
Ε: Residual random error
γ ij: Between-unit differences
δ ijk: Within-unit differences
μ: Mean
ρ: Correlation
σ: Standard deviation
σ a,i: Between-aliquot imprecision for the ith material
σ c,i: Between-campaign imprecision for the ith material, also the confounded between-campaign and between-unit imprecision when one unit is evaluated in each campaign
σ r,i: Repeatability imprecision for the ith material
σ u,i: Between-unit imprecision for the ith material
ν: Degrees of freedom
d i: Degree of equivalence for the ith result for nominally identical samples of one material
d absi: Degree of equivalence for the ith material as an absolute value
d reli: Degree of equivalence for the ith material as a percentage of the value
D abs: Degree of equivalence for one organization as an absolute value
D rel: Degree of equivalence for one organization as a relative value
i: Index over materials
j: Index over campaigns and/or units
k: Index over replicates or aliquots
l: Index over replicates
k p: Coverage factor providing a p % level of confidence
n: Number
n a: Number of aliquots of each unit
n c: Number of measurement campaigns
n m: Number of materials
n MC: Number of PBMC analyses
n r: Number of replicates of each aliquot or unit
n u: Number of units of each material
p: Probability expressed as a percentage (i.e., on the range 0 to 100)
q: A set of PBMC estimates of a given quantity
Q: A “best estimate” of a given quantity
R: Generic representation of “instrument response”
R i: Instrument response for the ith material
R ij: Instrument response for the jth unit of the ith material
R ijk: Instrument response for the kth replicate of the jth unit of the ith material
R ijkl: Instrument response for the lth replicate of the kth aliquot of the jth unit of the ith material
u: Standard uncertainty
u: “Large sample” standard uncertainty
U p: One-half of a p % level of confidence symmetric coverage interval
− U p: Lower bound of a p % level of confidence asymmetric coverage interval
+ U p: Upper bound of a p % level of confidence asymmetric coverage interval
V: Generic representation of “assigned value”
V i: Assigned value for the ith material
x i: Participant-reported measurement result for a given study material
x ref: Reference value for a comparison of results on nominally identical samples of one material
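
Several of the symbols above (N(μ, σ²), n MC, q, Q, PTILE(p, q), and the lower and upper bounds of an asymmetric coverage interval) fit together in a parametric bootstrap Monte Carlo calculation. The sketch below shows one plausible way they combine. It rests on assumptions: a single Gaussian input, linear interpolation for the percentile, and bounds taken at the (100 − p)/2 and (100 + p)/2 percentiles. The function names `ptile` and `pbmc_interval` are hypothetical, and none of this is presented as the report's own algorithm.

```python
import random


def ptile(p, q):
    """Return the p percentile (p on the range 0 to 100) of the values in q.

    Uses linear interpolation between order statistics, which is one
    common convention among several.
    """
    ordered = sorted(q)
    if not ordered:
        raise ValueError("q must contain at least one value")
    position = (p / 100.0) * (len(ordered) - 1)
    low = int(position)
    high = min(low + 1, len(ordered) - 1)
    fraction = position - low
    return ordered[low] + fraction * (ordered[high] - ordered[low])


def pbmc_interval(quantity, mu, sigma, p=95.0, n_mc=20000, seed=1):
    """Parametric bootstrap Monte Carlo interval for quantity(x), x ~ N(mu, sigma^2).

    quantity: function mapping one simulated input to the quantity of interest
    Returns (Q, lower, upper): the best estimate evaluated at mu, and the
    bounds of a p % coverage interval taken from the simulated set q.
    """
    rng = random.Random(seed)             # fixed seed for reproducibility
    q = [quantity(rng.gauss(mu, sigma)) for _ in range(n_mc)]
    best = quantity(mu)                   # Q, the "best estimate"
    lower = ptile((100.0 - p) / 2.0, q)   # lower bound of the interval
    upper = ptile((100.0 + p) / 2.0, q)   # upper bound of the interval
    return best, lower, upper
```

For a nonlinear `quantity`, the two bounds generally sit at unequal distances from Q, which is why separate lower and upper bounds are defined alongside the symmetric half-width.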

Acknowledgments

We thank Johanna E. Camara (NIST, Gaithersburg) and Rosemarie Phillips (BAM, DE) for their insights and experience in the practicalities of CRM comparison measurements; Wolfram Bremser (BAM, DE) and Stephen L.R. Ellison (LGC Limited, UK) for helpful discussions on approaches to analyzing CRM comparison results; Jolene D. Splett (NIST, Boulder) for her expertise with SAS and the interpretation of its results; Chih-Ming Wang (NIST, Boulder) for his careful review (and correction) of statistical concepts and notation; Katherine E. Sharpless (NIST, Gaithersburg) for her efforts toward making this document more accessible; the anonymous reviewers who thoughtfully critiqued the original draft of this report; and this Journal’s editorial staff for their insightful questions and gentle corrections.

Author information

Corresponding author

Correspondence to David L. Duewer.

Additional information

Disclaimer Certain commercial software is identified in this report to specify the experimental procedure as completely as possible. In no case does such identification imply a recommendation or endorsement by the National Institute of Standards and Technology nor does it imply that the software is necessarily the best available for the purpose.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOC 507 kb)

About this article

Cite this article

Duewer, D.L., Gasca-Aragon, H., Lippa, K.A. et al. Experimental design and data evaluation considerations for comparisons of reference materials. Accred Qual Assur 17, 567–588 (2012). https://doi.org/10.1007/s00769-012-0920-4

  • DOI: https://doi.org/10.1007/s00769-012-0920-4
