Determining reference ranges and sample sizes in parallel-group studies

Gwowen Shieh

doi:10.1371/journal.pone.0278447

Abstract

Background and objectives

Reference ranges are widely used to locate the major range of the target probability distribution. When future measurements fall outside the reference range, they are classified as atypical and require further investigation. The fundamental principles and statistical properties of reference ranges are closely related to those of tolerance interval procedures. Existing investigations of reference ranges and tolerance intervals are mainly devoted to the basic cases of one- and paired-sample designs. Although reference ranges hold considerable promise for parallel-group designs, the corresponding methodological and computational issues for determining reference limits and sample sizes have not been adequately addressed.

Methods

This paper describes a complete collection of one- and two-sided reference ranges for assessing measurement differences in parallel-group studies that assume variance homogeneity.

Results

The problem of sample size determination for precise reference ranges is also examined under the expected half-width and assurance probability considerations. Unlike the current methods, the suggested sample size criteria explicitly accommodate desired interval width in precise interval estimation.

Conclusions

Theoretical examinations and empirical assessments are presented to validate the usefulness of the proposed reference range and sample size procedures. To facilitate the use of the recommended techniques in practical applications, computer programs are developed for efficient calculation and exact analysis. A real data example regarding tablet absorption rate and extent is presented to illustrate the suggested assessments between two drug formulations.

Citation: Shieh G (2022) Determining reference ranges and sample sizes in parallel-group studies. PLoS ONE 17(11): e0278447. https://doi.org/10.1371/journal.pone.0278447

Editor: Mahdi Abbasi, Bu-Ali Sina University: Bu Ali Sina University, ISLAMIC REPUBLIC OF IRAN

Received: April 27, 2022; Accepted: November 17, 2022; Published: November 30, 2022

Copyright: © 2022 Gwowen Shieh. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper and its Supporting information files.

Funding: This work was supported by a grant from the Ministry of Science and Technology (MOST 109-2410-H-009-021-MY2). The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Reference ranges are commonly used as a decision-making tool in medical practice. Conventional confidence intervals provide plausible bounds for a single mean parameter, whereas reference ranges serve a distinctly different purpose: they quantify a plausible range for the major proportion of a distribution. The reference limits of a constructed reference range comprise a substantial part of the population measurements with the designated confidence level. Specifically, the reference limits of a 95% proportion correspond to the 2.5th and 97.5th percentiles of the target population. The main use of reference ranges is to classify future observations: measurements that fall outside the reference range are considered atypical and necessitate additional evaluation. Practical guidelines and general discussions can be found in Harris and Boyd [1], Horn and Pesce [2], and Horowitz et al. [3]. Moreover, important applications and principles were presented in Ceriotti, Hinzmann, and Panteghini [4], Geffré et al. [5], Henny and Petersen [6], Jensen and Kjelgaard-Hansen [7], Klee et al. [8], Siest et al. [9], and Wellek et al. [10].

It is noteworthy that reference ranges to estimate a designated proportion of a population are statistically equivalent to the tolerance intervals to contain at least a specified proportion of a distribution considered by Wald and Wolfowitz [11]. The inferential properties of tolerance intervals are readily applicable to clarify the utility of reference ranges. Technical discussions of tolerance interval estimation are available in the excellent texts by Krishnamoorthy and Mathew [12] and Meeker, Hahn, and Escobar [13]. Methodological updates and related comparisons of tolerance intervals are available in the recent studies of Francq, Berger, and Boachie [14], Liu, Bretz, and Cortina-Borja [15], and Roshan et al. [16]. Although the applications of reference ranges are concentrated mainly in laboratory medicine and industrial engineering, the concept and analysis are pertinent to comparative studies across virtually all disciplines. Moreover, the functional versatility of reference ranges also extends to evaluating exchangeability or switchability in method comparisons. Related applications of reference range or tolerance interval procedures for agreement assessments were demonstrated in Francq et al. [14], Gerke [17], Jan and Shieh [18], and Shieh [19].

Sample size determination for precise tolerance intervals has long been recognized, beginning with Wilks [20] for distribution-free tolerance limits. This practical problem has attracted considerable attention in the statistical literature. A common approach is to ensure that there is only a small probability that the derived tolerance interval will exceed the desired proportion by a small margin. Because of the computational complexity, several approximations were considered in the early work by, among others, Faulkenberry and Weeks [21], Faulkenberry and Daly [22], and Guenther [23]. In contrast, Odeh, Chou, and Owen [24] presented exact calculations for tolerance intervals to control the major proportion of a distribution. Moreover, Chou and Mee [25] described a sample size procedure for tolerance intervals to control the equal tail probability of a distribution. These procedures require specifying two key factors, a small probability and a marginal proportion, for sample size determination. Note that the selection of appropriate specifications and their impact on the resulting tolerance intervals are not always clear and interpretable. Accordingly, Jennen-Steinmetz and Wellek [26] and Young et al. [27] proposed alternative sample size methods for reference range and tolerance interval studies.

The current literature of reference range procedures and sample size determinations mainly focuses on the simple cases of one- and paired-sample designs. The principles have been generalized to distinct scenarios in various research fields. Although the applications of reference ranges are inevitably important for parallel group designs, the corresponding methodological and computational issues for determining reference limits and sample sizes have not been adequately addressed. Consequently, no single resource appropriately describes the technical and computational aspects of reference ranges and sample size calculations for parallel group designs under variance homogeneity. Appropriate use of reference ranges is likely hampered by an absence of knowledge of their properties. Both researchers and practitioners may appreciate a concise document of the primary development. On the other hand, the analytic complexity requires computer algorithms to compute the exact reference limits and optimal sample sizes under various design configurations. The use of reference range and sample size techniques can be significantly impeded by a lack of specialized computer software.

To complement the findings in the literature, the current study has three major goals. First, both one- and two-sided reference ranges are described for assessing measurement differences in parallel-group studies under homogeneous variance conditions. They encompass two types of reference ranges to control the major proportion and to control equal tail proportions of the measurement differences. In the process, approximations are also presented and examined to justify the advantages of the recommended exact approaches. Second, the prevalent criterion for determining optimal sample size requires that the probability be small that the tolerance interval covers too large a proportion of the sampled population. It is sensible to apply a rule that directly controls the potential width of a tolerance interval. Sample size procedures for precise reference ranges are proposed under the expected half-width and assurance probability considerations. Third, the described methods for parallel group designs are not available in current software packages. To alleviate the theoretical complexity and computational demand, computer algorithms are developed for the suggested reference range and sample size calculations. Numerical illustrations are also provided to demonstrate the advancement and usage of the exact approaches and accompanying software programs.

Methods

Reference ranges

Consider independent random samples X_ij from two normal populations: X_ij ~ N(μ_i, σ²) (1) where μ_i and σ² are unknown parameters, j = 1, …, N_i, and i = 1 and 2. For detecting the group effect μ_D = μ₁ − μ₂ in terms of the hypothesis H₀: μ_D = 0 versus H₁: μ_D ≠ 0, the usual two-sample t statistic has the form T = (X̄₁ − X̄₂)/(S²/M)^1/2, where M = 1/(1/N₁ + 1/N₂), S² = {(N₁ − 1)S₁² + (N₂ − 1)S₂²}/ν is the pooled sample variance, X̄_i = Σ_j X_ij/N_i, S_i² = Σ_j (X_ij − X̄_i)²/(N_i − 1), and ν = N₁ + N₂ − 2.

Note that the difference D = X_1j − X_2j′ between two measurements of the normal populations has the distribution D ~ N(μ_D, σ_D²) (2) where σ_D² = 2σ². Accordingly, the (100·p)th percentile of this normal distribution is denoted by θ_p, where θ_p = μ_D + z_pσ_D (3) and z_p is the (100·p)th percentile of the standard normal distribution N(0, 1) and 0 < p < 1.
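As a small numerical illustration of the percentile θ_p = μ_D + z_pσ_D, the following Python sketch evaluates it with the standard library. The function name and the example inputs (μ_D = 0, σ² = 1, p = 0.975) are illustrative assumptions and are not taken from the supplemental programs.

```python
import math
from statistics import NormalDist

def difference_percentile(mu_d, sigma2, p):
    """Return theta_p = mu_D + z_p * sigma_D, with sigma_D^2 = 2 * sigma^2."""
    z_p = NormalDist().inv_cdf(p)       # (100p)th percentile of N(0, 1)
    sigma_d = math.sqrt(2.0 * sigma2)   # standard deviation of D = X_1j - X_2j'
    return mu_d + z_p * sigma_d

# Hypothetical inputs: mu_D = 0, sigma^2 = 1, p = 0.975
print(round(difference_percentile(0.0, 1.0, 0.975), 4))  # 2.7718
```

The factor of 2 in σ_D² reflects that D is the difference of two independent measurements with a common variance σ².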

One-sided reference ranges for a major proportion.

To provide a reference range of the major proportion p* for the distribution of measurement difference D, a feasible approach is to consider the one-sided reference ranges for the percentile θ_p* = μ_D + z_p*σ_D where 0 < p* < 1. Standard derivations show that (X̄₁ − X̄₂ − θ_p*)/(S²/M)^1/2 ~ t(ν, −(2M)^1/2z_p*) (4) where t(ν, −(2M)^1/2z_p*) is a noncentral t distribution with degrees of freedom ν and noncentrality parameter −(2M)^1/2z_p*. First, a 100(1 − α)% one-sided lower confidence interval of θ_p*, with p* > 0.5, is expressed as C_LP = (−∞, T_LPU) with T_LPU = X̄₁ − X̄₂ − t_α(ν, −(2M)^1/2z_p*)(S²/M)^1/2, where t_α(ν, −(2M)^1/2z_p*) is the (100·α)th percentile of the distribution t(ν, −(2M)^1/2z_p*). The upper confidence limit T_LPU is conveniently expressed as T_LPU = X̄₁ − X̄₂ + τ_1−α(S²/M)^1/2 (5) where τ_1−α = t_1−α(ν, (2M)^1/2z_p*) = −t_α(ν, −(2M)^1/2z_p*).
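The noncentrality parameter of the pivotal quantity can be traced with a short derivation. The sketch below assumes the standard normal-theory facts that X̄₁ − X̄₂ ~ N(μ_D, σ²/M) and K = νS²/σ² ~ χ²(ν) are independent; the symbols Z and K are introduced here only for this derivation.

```latex
\begin{aligned}
\frac{\bar{X}_1-\bar{X}_2-\theta_{p^*}}{(S^2/M)^{1/2}}
 &= \frac{\dfrac{\bar{X}_1-\bar{X}_2-\mu_D}{(\sigma^2/M)^{1/2}}
        -\dfrac{z_{p^*}\,\sigma_D}{(\sigma^2/M)^{1/2}}}{(S^2/\sigma^2)^{1/2}}
  = \frac{Z-(2M)^{1/2}z_{p^*}}{(K/\nu)^{1/2}}
  \sim t\bigl(\nu,\,-(2M)^{1/2}z_{p^*}\bigr),\\[4pt]
P\{\theta_{p^*}\le T_{LPU}\}
 &= P\left\{\frac{\bar{X}_1-\bar{X}_2-\theta_{p^*}}{(S^2/M)^{1/2}}
      \ge -\tau_{1-\alpha}\right\}
  = 1-\alpha ,
\qquad -\tau_{1-\alpha}=t_{\alpha}\bigl(\nu,\,-(2M)^{1/2}z_{p^*}\bigr).
\end{aligned}
```

The first line uses σ_D = (2σ²)^1/2, so that z_p*σ_D/(σ²/M)^1/2 = (2M)^1/2z_p*. The second line uses the symmetry of the noncentral t family: if T ~ t(ν, δ), then −T ~ t(ν, −δ).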

Second, a 100(1 − α)% one-sided upper confidence interval of θ_1−p* = μ_D − z_p*σ_D, with 1 − p* < 0.5, is denoted by C_UP = (T_UPL, ∞) and the lower confidence limit T_UPL has the form T_UPL = X̄₁ − X̄₂ − τ_1−α(S²/M)^1/2 (6)

Note that the upper confidence limit T_LPU of θ_p* assures that P{θ_p* ≤ T_LPU} = 1 − α. Hence, an upper 100(1 − α)% confidence limit on the (100·p*)th percentile for p* > 0.5 is equivalent to an upper tolerance limit to exceed at least a proportion p* of the population with probability 1 − α. On the other hand, the lower confidence limit T_UPL of θ_1−p* guarantees that P{T_UPL ≤ θ_1−p*} = 1 − α. Thus, a lower 100(1 − α)% confidence limit on the 100(1 − p*)th percentile for 1 − p* < 0.5 is equivalent to a lower tolerance limit to be exceeded by at least a proportion p* of the population with probability 1 − α. As noted in Hahn [28,29], the one-sided reference ranges of percentiles are equivalent to the one-sided tolerance intervals for the designated proportion p* of the population. Similar results were presented in Section 2.4.1 of Krishnamoorthy and Mathew [12] when the variance ratio is known.

Two-sided reference ranges for a major proportion.

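A general form of two-sided reference ranges to control the major proportion p* for the target population of measurement difference D is expressed as C_MP = (T_MPL, T_MPU) (7) where T_MPL = X̄₁ − X̄₂ − g_1−α(S²/M)^1/2, T_MPU = X̄₁ − X̄₂ + g_1−α(S²/M)^1/2, and g_1−α is a designated value so that P{P(T_MPL < D < T_MPU | X̄₁, X̄₂, S²) ≥ p*} = 1 − α (8)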

Following the detailed derivations in Appendix A1, the critical value g_1−α can be uniquely determined from E_Z[1 − Φ_K(K_g)] = 1 − α (9) where the expectation E_Z is taken with respect to the standard normal distribution Z, K_g is given in Equation A2 in S1 File, and Φ_K(·) is the cumulative distribution function of the chi-square random variable K ~ χ²(ν). Although the critical value g_1−α does not have an explicit analytic form, the simplification in Eq 9 provides a particularly attractive formulation for computing the critical value and reference range. Related discussions for tolerance intervals of designated major proportions can be found in Krishnamoorthy, Lian and Mondal [30] when the variance ratio is known and unknown. However, they did not cover tolerance interval estimation for the central proportion with equal tail probabilities.

Two-sided reference ranges for the central proportion.

The second type of reference range specializes in the central proportion p* and controls the equal tail proportions in terms of the paired percentiles {θ_1−p, θ_p} = {μ_D − z_pσ_D, μ_D + z_pσ_D} with p > 0.5 and p* = (2p − 1). Hence, it is desirable to find the 100(1 − α)% confidence interval C_ET = (T_ETL, T_ETU) (10) to cover the central proportion p* of the distribution of measurement difference D, where T_ETL = X̄₁ − X̄₂ − h_1−α(S²/M)^1/2, T_ETU = X̄₁ − X̄₂ + h_1−α(S²/M)^1/2, and h_1−α is a selected value so that P{T_ETL ≤ θ_1−p and θ_p ≤ T_ETU} = 1 − α (11)

It follows from the technical arguments in Appendix A2 that the critical value h_1−α can be uniquely computed from E_Z[1 − Φ_K(K_h)] = 1 − α (12) where the expectation E_Z is taken with respect to the standard normal distribution Z and K_h is given in Equation A4 in S1 File. In general, the critical value h_1−α does not have a closed-form expression. However, the expression in Eq 12 gives a computationally convenient formulation for obtaining the critical value and reference range.
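One way to evaluate an expectation of this kind numerically is sketched below in Python, using only the standard library. Because Equation A4 is not reproduced in the main text, the sketch assumes the equal-tails condition in the form K ≥ ν{(2M)^1/2z_p + |Z|}²/h², which follows from requiring both limits to cover the paired percentiles. The function names, the series evaluation of the chi-square distribution function, the Simpson rule, and the bisection search are illustrative choices, not the author's supplemental algorithms.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0.0:
        return 0.0
    a, t = df / 2.0, x / 2.0
    term = total = 1.0
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= t / (a + n)
        total += term
    return total * math.exp(a * math.log(t) - t - math.lgamma(a + 1.0))

def critical_value_equal_tails(n1, n2, p_star, alpha, steps=120, z_max=6.0):
    """Solve E_Z[P(K >= nu * (sqrt(2M) z_p + |Z|)^2 / h^2)] = 1 - alpha for h."""
    m = 1.0 / (1.0 / n1 + 1.0 / n2)
    nu = n1 + n2 - 2
    shift = math.sqrt(2.0 * m) * STD_NORMAL.inv_cdf((1.0 + p_star) / 2.0)
    dz = z_max / steps

    def coverage(h):
        # Simpson rule over |Z| in [0, z_max]; 2 * pdf is the half-normal density.
        total = 0.0
        for i in range(steps + 1):
            z = i * dz
            weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
            bound = nu * (shift + z) ** 2 / h ** 2
            total += weight * 2.0 * STD_NORMAL.pdf(z) * (1.0 - chi2_cdf(bound, nu))
        return total * dz / 3.0

    lo, hi = shift, 2.0 * shift          # coverage(lo) is below any usual 1 - alpha
    while coverage(hi) < 1.0 - alpha:    # expand the bracket if needed
        lo, hi = hi, 2.0 * hi
    while hi - lo > 1e-6:                # bisection: coverage increases with h
        mid = 0.5 * (lo + hi)
        if coverage(mid) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Settings of the worked example: N1 = N2 = 23, p* = 0.9, 1 - alpha = 0.95
print(round(critical_value_equal_tails(23, 23, 0.90, 0.05), 3))
```

Under these assumptions the result should be close to the value h_0.95 = 10.9049 reported in the worked example. The coverage is increasing in h, which is what makes a simple bisection adequate.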

Notably, the derived expressions in Eqs 9 and 12 demonstrate the close resemblance between the confidence level appraisals for the control of the major proportion and the central proportion, respectively. It is believed that no previous attempt has been made to give such unified and transparent formulations. Accordingly, the required calculations of the critical values g_1−α and h_1−α can be conducted with the embedded normal and chi-square distribution functions in popular software packages. The corresponding computer algorithms are developed for implementing the described computations. To quantify the desired range of a distribution, the control of equal tails of the distribution is more stringent than the control of the sum of two tails of the distribution. When all the factors are identical, the resulting critical value h_1−α for the coverage of the central proportion is larger than the counterpart g_1−α for the coverage of the major proportion: h_1−α > g_1−α. Therefore, the reference range C_ET is generally wider than the reference range C_MP under identical model settings. The discrepancy between the different types of reference ranges in terms of interval widths and sample sizes will be illustrated in the subsequent analysis.

Approximations and numerical comparisons.

The reference ranges of major proportion and central proportion require a specialized algorithm to compute the critical value for the designated settings of sample sizes (N₁, N₂), confidence level 1 − α, and distribution proportion p*. It is tempting to adopt simplified procedures with less computation. Similar to the simplification in Wald and Wolfowitz [11] for the tolerance intervals of major proportion in the one-sample case, two approximations are presented here.

With the prescribed result Z ~ N(0, 1), it is clear that E[Z²] = 1. Thus, the noncentrality λ² of the noncentral chi-square distribution χ²(1, λ²) has the approximation λ² = Z²/(2M) ≐ E[Z²]/(2M) = 1/(2M). The confidence level in Equation A2 in S1 File is simplified as P{K > A_g} = 1 − α where A_g = 2Mν·χ²_p*(1, 1/(2M))/g² and χ²_p*(1, 1/(2M)) is the (100·p*)th percentile of χ²(1, 1/(2M)). It follows from K ~ χ²(ν) that g_A = {2Mν·χ²_p*(1, 1/(2M))/χ²_α(ν)}^1/2 (13) where χ²_α(ν) is the (100·α)th percentile of χ²(ν). Accordingly, an approximate two-sided tolerance interval to control the major proportion p* for the target population of measurement difference D is denoted by C_AMP = (T_AMPL, T_AMPU) (14) where T_AMPL = X̄₁ − X̄₂ − g_A(S²/M)^1/2 and T_AMPU = X̄₁ − X̄₂ + g_A(S²/M)^1/2.

Moreover, the same simplification can also be applied to the critical values of central proportion. Specifically, with the approximation |Z| = (Z²)^1/2 ≐ E[(Z²)^1/2] = (2/π)^1/2, the evaluation in Equation A4 in S1 File is rewritten as P{K > A_h} = 1 − α where A_h = ν{(2M)^1/2z_p + (2/π)^1/2}²/h². With K ~ χ²(ν), the approximate critical value is h_A = {(2M)^1/2z_p + (2/π)^1/2}{ν/χ²_α(ν)}^1/2 (15)

Thus, an approximate two-sided tolerance interval to control the central proportion p* for the target population of measurement difference D is C_AET = (T_AETL, T_AETU) (16) where T_AETL = X̄₁ − X̄₂ − h_A(S²/M)^1/2 and T_AETU = X̄₁ − X̄₂ + h_A(S²/M)^1/2.

To demonstrate the properties of the exact and approximate reference ranges {C_MP, C_AMP, C_ET, C_AET}, simulation studies of 10,000 iterations were conducted to examine their coverage performance under various situations. Note that the coverage performance of the exact and approximate reference ranges is a function of the sample sizes, distribution proportion, and confidence level. Table 1 presents the simulated coverage probabilities and errors for the 90% two-sided reference ranges of the distribution proportion 0.80. A total of 12 balanced and unbalanced designs (N₁, N₂) are considered, where N₁ = 5, 10, and 25, and the other sample size is N₂ = rN₁ with r = 1, 2, 5, and 10. On the other hand, the accuracy of the examined coverage does not depend on the population parameters μ_D and σ². Without loss of generality, the underlying parameters are set as μ_D = 0 and σ² = 1.
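A coverage check of this kind can be sketched in a few lines of Python. The sketch below is not the author's simulation program: it uses fewer replicates, a fixed seed, and the critical values g_0.95 = 9.8477 and h_0.95 = 10.9049 reported in the worked example for N₁ = N₂ = 23 and p* = 0.9, so the nominal coverage is 0.95 rather than the 0.90 of Table 1. The function name and its arguments are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist, fmean

def simulate_coverage(n1, n2, p_star, g, h, reps=4000, seed=2022):
    """Monte Carlo coverage of C_MP and C_ET with mu_D = 0 and sigma^2 = 1."""
    rng = random.Random(seed)
    nd = NormalDist()
    m = 1.0 / (1.0 / n1 + 1.0 / n2)
    nu = n1 + n2 - 2
    sigma_d = math.sqrt(2.0)                   # sigma_D when sigma^2 = 1
    z_p = nd.inv_cdf((1.0 + p_star) / 2.0)     # equal-tails percentile point
    hits_mp = hits_et = 0
    for _ in range(reps):
        x1 = [rng.gauss(0.0, 1.0) for _ in range(n1)]
        x2 = [rng.gauss(0.0, 1.0) for _ in range(n2)]
        m1, m2 = fmean(x1), fmean(x2)
        ss = sum((x - m1) ** 2 for x in x1) + sum((x - m2) ** 2 for x in x2)
        scale = math.sqrt(ss / nu / m)         # (S^2 / M)^(1/2)
        d_bar = m1 - m2
        # C_MP: the interval content must be at least p*
        low, high = d_bar - g * scale, d_bar + g * scale
        content = nd.cdf(high / sigma_d) - nd.cdf(low / sigma_d)
        hits_mp += content >= p_star
        # C_ET: both paired percentiles must lie inside the interval
        low, high = d_bar - h * scale, d_bar + h * scale
        hits_et += (low <= -z_p * sigma_d) and (high >= z_p * sigma_d)
    return hits_mp / reps, hits_et / reps

cov_mp, cov_et = simulate_coverage(23, 23, 0.90, g=9.8477, h=10.9049)
print(cov_mp, cov_et)
```

Both simulated proportions should be close to the nominal level 0.95, up to Monte Carlo error of roughly 0.003 with 4,000 replicates.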

Table 1. The simulated coverage probabilities and errors for the 90% two-sided reference intervals of the distribution proportion 0.80.

https://doi.org/10.1371/journal.pone.0278447.t001

The results in Table 1 show that the approximate interval C_AMP is sensitive to unbalanced designs, especially when the sample sizes are small, such as those with N₁ = 5 and 10. The reported errors are –0.0345, –0.0193 and –0.0108 for the cases of (N₁, N₂) = (5, 50), (10, 100), and (25, 250), respectively. The other approximate interval C_AET reveals the same disadvantage, and the problem worsens as the sample size N₁ increases from 5 to 25. Specifically, the resulting errors are –0.0257, –0.0303, and –0.0370 for the unbalanced cases (N₁, N₂) = (5, 50), (10, 100), and (25, 250), respectively.

In contrast, the exact reference ranges C_MP and C_ET maintain excellent coverage performance for all the configurations. Although these model configurations do not cover the full range of scenarios in parallel-group trials, the findings reveal essential behavior of the approximate procedures that may not be apparent under other settings. The two simple modifications C_AMP and C_AET provide computational shortcuts and reasonable approximations, but their accuracy is vulnerable to small and unbalanced sample sizes. Moreover, the calculations of approximate intervals may still require the aid of computer software. The additional computational effort of the exact approaches seems inconsequential if the corresponding computer algorithms are available. In short, the exact reference ranges can be recommended for general use in comparative studies.

Results

Sample size determinations

Within the context of research design, a subject of great importance is to determine the optimal sample sizes so that the resulting confidence interval will meet the designated precision requirement. The existing sample size procedures for reference ranges or tolerance intervals mainly focus on the one-sample or paired-sample frameworks, such as Meeker, Hahn, and Escobar [13] for two-sided tolerance intervals, among others. Also, the common criterion requires that the tail proportions not be too small at a given confidence level. As an alternative, the control of the expected width and the assurance probability of the width falling within a designated bound, as in Beal [31] and Kupper and Hafner [32], are considered and demonstrated here for precision evaluations of reference ranges.

Precision of the one-sided reference ranges for a major proportion.

The one-sided reference ranges C_LP = (−∞, T_LPU) and C_UP = (T_UPL, ∞) for a major proportion have an open end. Because one limit is unbounded, their precision depends on the finite limits T_LPU and T_UPL. An intuitive and feasible way to measure their precision is through the half-width between X̄₁ − X̄₂ and T_LPU for C_LP, and the half-width between T_UPL and X̄₁ − X̄₂ for C_UP. In both cases, the half-width is denoted by H_τ = τ_1−α(S²/M)^1/2.

Notably, it is desired to calculate the least sample size such that the expected half-width of the one-sided reference range is within the given threshold: E[H_τ] ≤ η (17) where η (> 0) is a constant. In addition, one may compute the minimum sample size needed to guarantee, with a given assurance probability, that the suggested half-width of a one-sided reference range will not exceed the planned value: P{H_τ ≤ η} ≥ 1 − γ (18) where 1 − γ ∈ (0, 1) is the specified assurance level and η (> 0) is a constant.

It can be shown that the expected half-width η_τ = E[H_τ] is η_τ = τ_1−α(σ²/M)^1/2/u (19) where u = (ν/2)^1/2Γ(ν/2)/Γ{(ν + 1)/2} is an adjusting factor so that E[uS] = σ and ν = N₁ + N₂ − 2. Moreover, the probability of half-width π_τ = P(H_τ ≤ η) is π_τ = Φ_K{νMη²/((τ_1−α)²σ²)} (20) where Φ_K(·) is the cumulative distribution function of K ~ χ²(ν).
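The two precision measures can be evaluated with a short Python sketch that relies only on the standard library. It assumes that the half-width has the form H = c(S²/M)^1/2 for a given critical value c and that νS²/σ² ~ χ²(ν), so that E[S] = σ/u and P(H ≤ η) is a chi-square probability. The function names are illustrative; the inputs τ_0.95 = 8.3847 and σ = 0.122638 for N₁ = N₂ = 23 are borrowed from the worked example.

```python
import math

def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0.0:
        return 0.0
    a, t = df / 2.0, x / 2.0
    term = total = 1.0
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= t / (a + n)
        total += term
    return total * math.exp(a * math.log(t) - t - math.lgamma(a + 1.0))

def u_factor(nu):
    """Adjusting factor u = (nu/2)^(1/2) * Gamma(nu/2) / Gamma((nu + 1)/2)."""
    return math.sqrt(nu / 2.0) * math.exp(
        math.lgamma(nu / 2.0) - math.lgamma((nu + 1.0) / 2.0))

def expected_half_width(crit, sigma, n1, n2):
    """E[H] = crit * (sigma^2 / M)^(1/2) / u."""
    m = 1.0 / (1.0 / n1 + 1.0 / n2)
    return crit * sigma / (math.sqrt(m) * u_factor(n1 + n2 - 2))

def assurance_probability(crit, sigma, eta, n1, n2):
    """P(H <= eta) = P(K <= nu * M * eta^2 / (crit^2 * sigma^2)), K ~ chi2(nu)."""
    m = 1.0 / (1.0 / n1 + 1.0 / n2)
    nu = n1 + n2 - 2
    return chi2_cdf(nu * m * eta ** 2 / (crit ** 2 * sigma ** 2), nu)

# One-sided range with the example's values: tau = 8.3847, sigma = 0.122638
print(expected_half_width(8.3847, 0.122638, 23, 23))
print(assurance_probability(8.3847, 0.122638, 0.3, 23, 23))
```

With these inputs the expected half-width is slightly above 0.3 and the assurance probability is a little below one half, which is in line with the example's finding that more than 23 observations per group are needed for η = 0.3.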

Precision of the two-sided reference ranges for a major proportion.

The half-width of the two-sided reference range C_MP for a major proportion is H_g = (T_MPU − T_MPL)/2 = g_1−α(S²/M)^1/2. Thus, the expected half-width η_g = E[H_g] is readily obtained as η_g = g_1−α(σ²/M)^1/2/u (21)

Also, the probability of interval half-width π_g = P(H_g ≤ η) has the form π_g = Φ_K{νMη²/((g_1−α)²σ²)} (22) where Φ_K(·) is the cumulative distribution function of K ~ χ²(ν).

Precision of the two-sided reference ranges for the central proportion.

For the two-sided reference range C_ET of a central proportion, the half-width is H_h = (T_ETU − T_ETL)/2 = h_1−α(S²/M)^1/2. Accordingly, the associated expected half-width η_h = E[H_h] is immediately derived as η_h = h_1−α(σ²/M)^1/2/u (23)

In this case, the probability of interval half-width π_h = P(H_h ≤ η) has the form π_h = Φ_K{νMη²/((h_1−α)²σ²)} (24) where Φ_K(·) is the cumulative distribution function of K ~ χ²(ν).

Sample size requirements.

The precision quantities {η_τ, η_g, η_h} and {π_τ, π_g, π_h} are monotone functions of the sample sizes (N₁, N₂) for fixed values of the confidence level 1 − α, desired proportion p*, and standard deviation σ. However, the interval precision does not depend on the mean difference μ_D. With the selected measure, the sample sizes needed to attain the specified precision can be found with a simple incremental search. Note that the critical values {τ_1−α, g_1−α, h_1−α} are functions of the sample sizes (N₁, N₂). They therefore need to be recalculated at each step of the iterative search, because their values change with the sample sizes. Accordingly, supplemental computer programs are presented to facilitate the required computations.
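The incremental search described above can be sketched as follows for the equal-tails range C_ET under the expected half-width criterion with a balanced design. This is a simplified standard-library illustration, not the supplemental SAS/IML or R program. It assumes the equal-tails condition K ≥ ν{(2M)^1/2z_p + |Z|}²/h² for the critical value, and all function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0.0:
        return 0.0
    a, t = df / 2.0, x / 2.0
    term = total = 1.0
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= t / (a + n)
        total += term
    return total * math.exp(a * math.log(t) - t - math.lgamma(a + 1.0))

def critical_value_equal_tails(n1, n2, p_star, alpha, steps=120, z_max=6.0):
    """Critical value h from the assumed equal-tails coverage condition."""
    m = 1.0 / (1.0 / n1 + 1.0 / n2)
    nu = n1 + n2 - 2
    shift = math.sqrt(2.0 * m) * STD_NORMAL.inv_cdf((1.0 + p_star) / 2.0)
    dz = z_max / steps

    def coverage(h):
        total = 0.0
        for i in range(steps + 1):           # Simpson rule over |Z|
            z = i * dz
            weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
            bound = nu * (shift + z) ** 2 / h ** 2
            total += weight * 2.0 * STD_NORMAL.pdf(z) * (1.0 - chi2_cdf(bound, nu))
        return total * dz / 3.0

    lo, hi = shift, 2.0 * shift
    while coverage(hi) < 1.0 - alpha:
        lo, hi = hi, 2.0 * hi
    while hi - lo > 1e-6:
        mid = 0.5 * (lo + hi)
        if coverage(mid) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def expected_half_width(crit, sigma, n1, n2):
    """E[H] = crit * (sigma^2 / M)^(1/2) / u with the adjusting factor u."""
    m = 1.0 / (1.0 / n1 + 1.0 / n2)
    nu = n1 + n2 - 2
    u = math.sqrt(nu / 2.0) * math.exp(
        math.lgamma(nu / 2.0) - math.lgamma((nu + 1.0) / 2.0))
    return crit * sigma / (math.sqrt(m) * u)

def smallest_balanced_n(eta, sigma, p_star, alpha, n_max=500):
    """Increase n = N1 = N2 until the expected half-width is at most eta."""
    for n in range(2, n_max + 1):
        h = critical_value_equal_tails(n, n, p_star, alpha)  # recomputed each step
        if expected_half_width(h, sigma, n, n) <= eta:
            return n
    raise ValueError("no sample size up to n_max meets the precision target")

# Settings mirroring the numerical study: sigma = 1, p* = 0.90, 1 - alpha = 0.90, eta = 3.5
print(smallest_balanced_n(eta=3.5, sigma=1.0, p_star=0.90, alpha=0.10))
```

The search starts at the smallest feasible size and stops at the first n that meets the target, which relies on the monotonicity of the expected half-width noted above. An unbalanced design with N₂ = rN₁ could be handled by replacing the pair (n, n) with (n, r·n).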

The features and differences of the suggested sample size procedures for precise reference range estimation are demonstrated with numerical assessments under the expected half-width and assurance probability criteria for two confidence levels 1 − α = 0.90 and 0.95. The three selected thresholds of expected half-width are η = 2.5, 3.0, and 3.5. For assurance evaluation, the designated assurance levels are 1 − γ = 0.8 and 0.9 combined with η = 2.5, 3.0, and 3.5. Moreover, the parameter configurations are fixed as μ_D = 0, σ² = 1, and p* = 0.90 throughout the empirical appraisal. These configurations are chosen to reflect the potential discrepancy among the necessary sample sizes of different reference ranges and precision configurations. The first step of the numerical illustration is to compute the required sample sizes for achieving the specified precision under balanced and unbalanced designs with r = N₂/N₁ = 1, 2, 5, and 10. The computed sample sizes are presented in Tables 2 and 3 for the expected half-width and in Tables 4–7 for the assurance probability.

Table 2. The computed sample size, simulated half-width and errors for the 90% reference intervals of the distribution proportion 0.90.

https://doi.org/10.1371/journal.pone.0278447.t002

Table 3. The computed sample size, simulated half-width and errors for the 95% reference intervals of the distribution proportion 0.90.

https://doi.org/10.1371/journal.pone.0278447.t003

Table 4. The computed sample size, simulated assurance probability and errors for the 90% reference intervals of the distribution proportion 0.90 with assurance probability 0.80.

https://doi.org/10.1371/journal.pone.0278447.t004

Table 5. The computed sample size, simulated assurance probability and errors for the 90% reference intervals of the distribution proportion 0.90 with assurance probability 0.90.

https://doi.org/10.1371/journal.pone.0278447.t005

Table 6. The computed sample size, simulated assurance probability and errors for the 95% reference intervals of the distribution proportion 0.90 with assurance probability 0.80.

https://doi.org/10.1371/journal.pone.0278447.t006

Table 7. The computed sample size, simulated assurance probability and errors for the 95% reference intervals of the distribution proportion 0.90 with assurance probability 0.90.

https://doi.org/10.1371/journal.pone.0278447.t007

In the second stage, simulated values of expected half-width and assurance probability associated with the reported sample sizes and selected parameter configurations were computed through a Monte Carlo study of 10,000 independent data sets. For each replicate, two groups of N₁ and N₂ values were generated from the normal distribution N(0, 1). The sample variance and critical values were obtained to compute the half-width estimates {H_τ, H_g, H_h} for the one- and two-sided confidence intervals {C_LP, C_UP, C_MP, C_ET}. The simulated expected half-width is the mean of the 10,000 replicates of the half-width estimates. Also, the simulated assurance probability is the proportion of the 10,000 replicates whose half-width estimates are less than or equal to the specified bound η. Because of the discrete nature of sample sizes, the computed half-width for the reported sample sizes is marginally smaller than the designated threshold η. Likewise, the attained assurance probability is slightly higher than the nominal assurance level 1 − γ. Accordingly, the adequacy of the sample size procedures for precise interval estimation is determined by the difference between the simulated half-width and the computed half-width, and by the difference between the simulated assurance probability and the computed assurance probability. These values are also summarized in Tables 2–7 for the expected half-width and assurance probability evaluations.

The prescribed numerical assessments suggest that the required sample sizes in Tables 2–7 vary drastically across the types of reference ranges and precision considerations. In general, under the expected half-width consideration, the optimal sample size in Tables 2 and 3 increases with a smaller width bound η when all other factors are fixed. For the assurance probability criterion in Tables 4–7, a larger sample size is needed to attain a higher assurance level 1 − γ when the designated width η and other configurations remain identical. With the same settings, the optimal sample size of the one-sided reference ranges C_LP and C_UP is smaller than those of the two-sided reference ranges. Also, the two-sided reference range C_ET to control equal tails demands a larger sample size than the counterpart two-sided C_MP for a major proportion. Under the examined precision settings, the results reveal that the assurance probability principle generally requires a larger sample size than the expected half-width criterion, as shown in Tables 2, 4 and 5. The extensive evaluations reveal that it is more efficient to adopt a balanced design than an unbalanced design because the former requires a smaller sample size to attain the same precision for the one- and two-sided reference ranges. Regarding the appraisal of sample size determinations, both the simulated half-widths and simulated assurance probabilities are nearly the same as the nominal precision bounds for all cases examined here. The performance of the suggested sample size procedures is sufficiently accurate for most purposes. Accordingly, these sample size formulas are useful for obtaining precise reference ranges under the prescribed expected half-width and assurance probability criteria.

An example

A randomized parallel group study was conducted in Liu et al. [33] to examine the pharmacokinetic properties and bioequivalence of two sulfadoxine/pyrimethamine fixed-dose combination tablets. The study found that the test and reference formulations of sulfadoxine/pyrimethamine FDC 500/25-mg tablet have similar pharmacokinetic profiles both in terms of rate and extent of absorption. One of the primary endpoints used in the study is the area under the plasma concentration-time curve at 72 hours of pyrimethamine. For the test and reference formulations, the sample means and sample standard deviations are , , S₁ = 0.1132, and S₂ = 0.1314 of the N₁ = N₂ = 23 log-transformed measurements. Accordingly, the conventional 95% two-sided confidence interval of mean difference is (–0.1225, 0.0233).

For illustration, the numerical examination is extended to compute the reference ranges of the proportion p* = 0.9 of measurement differences between the test and reference formulations. With the suggested algorithms and formulations, the three critical values for a 95% confidence level are τ_0.95 = 8.3847, g_0.95 = 9.8477, and h_0.95 = 10.9049. The one-sided 95% reference ranges of the 0.9 proportion computed from Eqs 5 and 6 are C_LP = (−∞, 0.2536) and C_UP = (−0.3528, ∞), respectively. The two-sided 95% reference range for a 0.9 proportion without an equal-tail restriction is C_MP = (−0.4057, 0.3065), whereas the two-sided 95% reference range to control equal-tail proportions is C_ET = (−0.4440, 0.3448). These reference ranges for a distribution proportion of measurement differences are evidently much wider than the usual confidence interval for the single parameter of mean difference.
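The reported limits can be reproduced from the summary statistics with simple arithmetic, as the following Python sketch shows. It takes the three critical values as given rather than recomputing them, and it assumes that the sample mean difference equals the midpoint of the reported 95% confidence interval (−0.1225, 0.0233), since that interval is symmetric about X̄₁ − X̄₂. The variable names are illustrative.

```python
import math

n1 = n2 = 23
s1, s2 = 0.1132, 0.1314                 # reported sample standard deviations
ci_low, ci_high = -0.1225, 0.0233       # reported 95% CI of the mean difference

m = 1.0 / (1.0 / n1 + 1.0 / n2)         # M = 1 / (1/N1 + 1/N2)
nu = n1 + n2 - 2
pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / nu
scale = math.sqrt(pooled_var / m)       # (S^2 / M)^(1/2)
d_bar = (ci_low + ci_high) / 2.0        # assumption: midpoint of the reported CI

tau, g, h = 8.3847, 9.8477, 10.9049     # critical values reported in the text

c_lp_upper = d_bar + tau * scale                  # C_LP = (-inf, T_LPU)
c_up_lower = d_bar - tau * scale                  # C_UP = (T_UPL, inf)
c_mp = (d_bar - g * scale, d_bar + g * scale)     # major proportion
c_et = (d_bar - h * scale, d_bar + h * scale)     # equal tails

print(round(math.sqrt(pooled_var), 6))            # pooled SD, 0.122638
print(round(c_lp_upper, 4), round(c_up_lower, 4))
print(tuple(round(v, 4) for v in c_mp))
print(tuple(round(v, 4) for v in c_et))
```

All four ranges share the same center and scale and differ only through the critical value, which makes the ordering of their widths (one-sided, major proportion, equal tails) immediate.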

Moreover, these model settings are employed to explicate precision assessments and sample size calculations for a future comparative study. With σ = 0.122638, p* = 0.9, and 1 − α = 0.95, the necessary sample sizes for the interval procedures C_LP, C_MP, and C_ET to have an expected half-width η = 0.3 are (N₁, N₂) = (24, 24), (301, 301), and (929, 929), respectively. The corresponding sample sizes for attaining the assurance probability 1 − γ = 0.9 are (40, 40), (640, 640), and (1508, 1508) for the three reference ranges C_LP, C_MP, and C_ET. Notably, the influence of the expected half-width and assurance probability principles on sample size requirements not only differs but also depends on the type of reference range procedure. The presented sample size formulas and accompanying computer algorithms can be readily used to assess the nonlinear relationship between the key features. For ease of application, the exemplifying configurations are incorporated in the user specification sections of the supplemental computer programs. Users can easily modify the exemplifying settings in the programs to accommodate their own model specifications.

Conclusions

Reference ranges are constructed to comprise a desired large proportion of the population measurements and provide the potential spread of the target probability distribution. With the connection between reference range and tolerance interval procedures, the technical features of reference ranges can be readily obtained from the statistical literature on the estimation of tolerance intervals. This study addresses two important aspects to complement the existing appraisals and demonstrations. First, a complete collection of reference ranges is presented within the context of parallel-group designs under the variance homogeneity assumption. They include one- and two-sided reference ranges to control the major proportion of a distribution and two-sided reference ranges to control the equal tail probability of a distribution. Second, sample size determinations for precise reference ranges are considered under the expected half-width and assurance probability principles. Analytical formulas and computer algorithms are described to extend the practical value of reference ranges for parallel-group studies. Their usefulness and accuracy are illustrated and justified through empirical illustrations and simulation assessments. The suggested procedures are also demonstrated with a real data example regarding the comparison of tablet absorption rate and extent between two drug formulations.

Supporting information

S1 File. Appendix A: The critical values of two-sided reference ranges.

https://doi.org/10.1371/journal.pone.0278447.s001

(PDF)

S2 File. SAS/IML programs for performing the suggested procedures.

https://doi.org/10.1371/journal.pone.0278447.s002

(PDF)

S3 File. R programs for performing the suggested procedures.

https://doi.org/10.1371/journal.pone.0278447.s003

(PDF)

References

1. Harris E. K., & Boyd J. C. (1995). Statistical bases of reference values in laboratory medicine. CRC Press.
2. Horn P. S., & Pesce A. J. (2005). Reference intervals: A user’s guide. Washington, DC: American Association for Clinical Chemistry.
3. Horowitz G., Altaie S., Boyd J., Ceriotti F., Garg U., Horn P., et al. (2008). Clinical and Laboratory Standards Institute: Defining, Establishing, and Verifying Reference Intervals in the Clinical Laboratory; Approved Guideline. Wayne, PA: CLSI.
4. Ceriotti F., Hinzmann R., & Panteghini M. (2009). Reference intervals: the way forward. Annals of Clinical Biochemistry, 46, 8–17. pmid:19103955
5. Geffré A., Friedrichs K., Harr K., Concordet D., Trumel C., & Braun J. P. (2009). Reference values: a review. Veterinary Clinical Pathology, 38, 288–298. pmid:19737162
6. Henny J., & Petersen P. H. (2004). Reference values: from philosophy to a tool for laboratory medicine. Clinical Chemistry and Laboratory Medicine, 42, 686–691. pmid:15327000
7. Jensen A. L., & Kjelgaard-Hansen M. (2006). Method comparison in the clinical laboratory. Veterinary Clinical Pathology, 35, 276–286. pmid:16967409
8. Klee G. G., Ichihara K., Ozarda Y., Baumann N. A., Straseski J., Bryant S. C., et al. (2018). Reference intervals: Comparison of calculation methods and evaluation of procedures for merging reference measurements from two US Medical Centers. American Journal of Clinical Pathology, 150, 545–554. pmid:30169553
9. Siest G., Henny J., Gräsbeck R., Wilding P., Petitclerc C., Queralto J. M., & Petersen P. H. (2013). The theory of reference values: An unfinished symphony. Clinical Chemistry and Laboratory Medicine, 51, 47–64. pmid:23183761
10. Wellek S., Lackner K. J., Jennen-Steinmetz C., Reinhard I., Hoffmann I., & Blettner M. (2014). Determination of reference limits: statistical concepts and tools for sample size calculation. Clinical Chemistry and Laboratory Medicine, 52, 1685–1694. pmid:25029084
11. Wald A., & Wolfowitz J. (1946). Tolerance limits for a normal distribution. The Annals of Mathematical Statistics, 17, 208–215.
12. Krishnamoorthy K., & Mathew T. (2009). Statistical tolerance regions: Theory, applications, and computation (Vol. 744). New York, NY: Wiley.
13. Meeker W. Q., Hahn G. J., & Escobar L. A. (2017). Statistical intervals: A guide for practitioners and researchers. New York, NY: Wiley.
14. Francq B. G., Berger M., & Boachie C. (2020). To tolerate or to agree: A tutorial on tolerance intervals in method comparison studies with BivRegBLS R Package. Statistics in Medicine, 39, 4334–4349. pmid:32964501
15. Liu W., Bretz F., & Cortina-Borja M. (2021). Reference range: Which statistical intervals to use? Statistical Methods in Medical Research, 30, 523–534. pmid:33054684
16. Roshan D., Ferguson J., Pedlar C. R., Simpkin A., Wyns W., Sullivan F., et al. (2021). A comparison of methods to generate adaptive reference ranges in longitudinal monitoring. PLoS One, 16, e0247338. pmid:33606821
17. Gerke O. (2020). Reporting standards for a Bland-Altman agreement analysis: A review of methodological reviews. Diagnostics, 10, 334. pmid:32456091
18. Jan S. L., & Shieh G. (2018). The Bland-Altman range of agreement: Exact interval procedure and sample size determination. Computers in Biology and Medicine, 100, 247–252. pmid:30056297
19. Shieh G. (2018). The appropriateness of Bland-Altman’s approximate confidence intervals for limits of agreement. BMC Medical Research Methodology, 18, 45. pmid:29788915
20. Wilks S. S. (1941). Determination of sample sizes for setting tolerance limits. The Annals of Mathematical Statistics, 12, 91–96.
21. Faulkenberry G. D., & Weeks D. L. (1968). Sample size determination for tolerance limits. Technometrics, 10, 343–348.
22. Faulkenberry G. D., & Daly J. C. (1970). Sample size for tolerance limits on a normal distribution. Technometrics, 12, 813–821.
23. Guenther W. C. (1972). Tolerance intervals for univariate distributions. Naval Research Logistics Quarterly, 19, 309–333.
24. Odeh R. E., Chou Y.-M., & Owen D. B. (1987). The precision of coverages and sample size requirements for normal tolerance intervals. Communications in Statistics-Simulation and Computation, 16, 969–985.
25. Chou Y. M., & Mee R. W. (1984). Determination of sample sizes for setting β-content tolerance limits controlling both tails of the normal distribution. Statistics & Probability Letters, 2, 311–314.
26. Jennen-Steinmetz C., & Wellek S. (2005). A new approach to sample size calculation for reference interval studies. Statistics in Medicine, 24, 3199–3212. pmid:16189809
27. Young D. S., Gordon C. M., Zhu S., & Olin B. D. (2016). Sample size determination strategies for normal tolerance intervals using historical data. Quality Engineering, 28, 337–351.
28. Hahn G. J. (1970). Statistical intervals for a normal population, Part I. Tables, examples and applications. Journal of Quality Technology, 2, 115–125.
29. Hahn G. J. (1970). Statistical intervals for a normal population, Part II. Formulas, assumptions, some derivations. Journal of Quality Technology, 2, 195–206.
30. Krishnamoorthy K., Lian X., & Mondal S. (2011). Tolerance intervals for the distribution of the difference between two independent normal random variables. Communications in Statistics-Theory and Methods, 40, 117–129.
31. Beal S. L. (1989). Sample size determination for confidence intervals on the population mean and on the difference between two population means. Biometrics, 45, 969–977. pmid:2790131
32. Kupper L. L., & Hafner K. B. (1989). How appropriate are popular sample size formulas? The American Statistician, 43, 101–105.
33. Liu Y. M., Zhang K. E., Liu Y., Zhang H. C., Song Y. X., Pu H. H.,… & Zhu J. M. (2012). Pharmacokinetic properties and bioequivalence of two sulfadoxine/pyrimethamine fixed-dose combination tablets: a parallel-design study in healthy Chinese male volunteers. Clinical Therapeutics, 34, 2212–2220. pmid:23084093

