Introduction

Common diseases are considered to be caused by numerous genetic and environmental factors, and many studies have examined the association between disease and genetic markers. Association studies are expected to be more powerful than linkage studies for detecting alleles that confer modest risk of common diseases (Risch and Merikangas 1996), and common, modest-risk alleles are thought to be more important in public health (Carlson et al. 2004). Population-based association studies using case–control designs have become widespread in an attempt to identify genes related to common diseases. Previously, associations between disease and genetic factors were examined for several markers within a few candidate genes. Recent advances in high-volume genotyping technology have made it possible to scan whole genomes using hundreds of thousands of single nucleotide polymorphisms (SNPs). While these large-scale studies have many advantages for detecting causal genes, critical issues of efficiency remain, and such studies require stepwise screening processes to increase efficiency in both single and multiple studies (van den Oord and Sullivan 2003; Hirschhorn and Daly 2005; Thomas et al. 2005). In this study we focus on second-stage screening studies with several hundred to several thousand sparse markers, following either a first screen of the whole genome or the selection of tag SNPs to eliminate redundancy in the information provided by the SNPs.

Generally, disease-associated markers are identified by hypothesis testing, and efficiency is assessed in terms of power (true positives), sample size, false positives, and cost. Because variants that contribute to common disease have modest effects, large sample sizes are needed. Additionally, significance criteria are made rigorous by multiple-testing correction, requiring even larger sample sizes. Consequently, these studies are expensive (Botstein and Risch 2003).

In genetic epidemiology, many multi-stage designs have been suggested and applied to reduce time and cost on average (Böddeker and Ziegler 2001). Most of these designs are two-stage. One approach is to increase the sample size in subsequent stages for the more promising markers. In this two-stage design, all markers are evaluated in a subset of individuals in the first stage, and the most promising markers selected from the first stage are evaluated in the second stage using additional individuals (Satagopan et al. 2002, 2004; Satagopan and Elston 2003). These studies show that the two-stage approach, as a sampling strategy for marker selection, is more cost-effective than the one-stage approach. However, in addition to the issue of cost, other problems arise from multiple testing, because a large number of markers are tested simultaneously. The commonly controlled family-wise error rate (FWER) is the probability of yielding one or more false positives when all hypotheses are null; the most familiar way of controlling it is Bonferroni's method. Although controlling the FWER is a familiar remedy for multiple-comparison problems, it addresses the situation in which no marker is associated with the disease. Often, studies aim to find a number of interesting features among numerous markers rather than to detect any single marker, and investigators would assume that more than one marker is associated with the disease. In this case, the FWER imposes such strict criteria that many features may be missed. Furthermore, these criteria depend on the number of markers used in the study, making the interpretation of results difficult (Colhoun et al. 2003; Wacholder et al. 2004). Benjamini and Hochberg (1995) introduced a new multiple-testing error measure called the false discovery rate (FDR), which is the expected proportion of falsely rejected null hypotheses among all rejected hypotheses. Although a prior probability, i.e., the proportion of truly associated markers, must be set to use the FDR criterion, the outcome of the tests is easy to interpret. Another advantage is that the efficiency of a subsequent study can be evaluated from the FDR definition. In brief, it is possible to avoid the additional costs incurred by examining markers that are not associated with the disease. In recent years the FDR has come to be regarded as more appropriate than the FWER for exploratory studies (Colhoun et al. 2003; Sabatti et al. 2003; Wacholder et al. 2004).

Benjamini and Yekutieli (2005) have reported that optimal multi-stage designs can be constructed with the goal of decreasing the total cost while still controlling the FDR at a nominal level. Zehetmayer et al. (2005) optimized the first-stage significance level and the fraction of the total sample size used in the first stage by controlling the FDR and maximizing the expected power under fixed total costs, showing that these two-stage designs have higher power than one-stage designs. Such results are valuable for conducting association studies. However, the sample sizes and effect sizes considered in those settings may not be very realistic for an association study.

In practice, the total number of subjects is often limited; a cohort study in which biological samples have been collected at baseline is one example. In two-stage designs with a constraint on the total sample size, the cost of the study is affected by the sample sizes and selection criteria used at each stage. Thus, we propose two-stage designs for case–control association studies that control the FDR while optimizing the sample size and marker selection criterion at each stage, and we evaluate their cost ratios relative to a one-stage design, under the assumption that the markers are independent. With this method we can utilize the investigators' knowledge as a prior probability (assumed to be determined subjectively by the investigators). Consequently, the design for each study allows the flexibility to choose the optimal sample size and selection criteria that fit the actual situation. At the same time, the subjective prior probability affects the efficiency of the study. Therefore, the robustness of the proposed procedure was evaluated against mis-specification of the prior probability.

Materials and methods

Consider evaluating m markers, using a case–control design with a fixed number of subjects (n) available for each group. It is assumed that the m markers are not in linkage disequilibrium (LD) with each other and can be considered independent. Suppose there are D markers associated with disease; these markers are assumed to be in complete LD with D disease loci or to be the disease-susceptibility polymorphisms themselves. Also, these D markers have the same effect size μ=μ d on disease, which is the target effect size that we want to detect. π 1 is a prior probability that denotes the proportion of truly associated markers among all markers; π 1=D/m. A typical test statistic, X, for the association of a marker derived from n subjects will asymptotically follow a normal distribution N(nμ, nσ 2), in which nμ and nσ 2 are the mean and variance, respectively, of the distribution; μ would be zero when there is no association between the disease and the marker. Without loss of generality, it can be assumed that σ 2=1, and the test can be considered one-sided. In a case–control design the allele (or genotype or haplotype) frequencies in cases and controls are compared at every marker locus. Generally, the odds ratio (OR) is used to measure the association between a disease and a marker. If we focus on the effect of a certain genotype (or genotypes), the relationship between the OR (θ), the effect size (μ d), and the genotype frequency of interest in the controls (p 0) is:

$$ \log (\theta) = \mu _{d} \times {\sqrt {\frac{1}{{p_{0} }} + \frac{1}{{1 - p_{0} }} + \frac{{(1 - p_{0}) + \theta p_{0} }}{{\theta p_{0} }} + \frac{{(1 - p_{0}) + \theta p_{0} }}{{1 - p_{0} }}} }. $$
(1)
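For illustration, the following minimal Python sketch (assuming NumPy and SciPy are available; the function names are ours) solves Eq. 1 numerically for θ given μ d and p 0, which is one way the entries of Table 1 could be reproduced.

```python
import numpy as np
from scipy.optimize import brentq

def log_or_gap(theta, mu_d, p0):
    """Difference between log(theta) and the right-hand side of Eq. 1."""
    p1 = theta * p0 / ((1 - p0) + theta * p0)   # genotype frequency in cases
    # 1/p1 and 1/(1 - p1) equal the last two terms inside the square root of Eq. 1
    variance = 1 / p0 + 1 / (1 - p0) + 1 / p1 + 1 / (1 - p1)
    return np.log(theta) - mu_d * np.sqrt(variance)

def odds_ratio_from_effect_size(mu_d, p0):
    """Solve Eq. 1 for the OR (theta), given the effect size and p0."""
    return brentq(log_or_gap, 1.0 + 1e-9, 50.0, args=(mu_d, p0))

# Example: mu_d = 0.14 and p0 = 0.3 should give an OR of roughly 1.5 (cf. Table 1).
print(round(odds_ratio_from_effect_size(0.14, 0.3), 2))
```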

One-stage procedure

In this situation, m markers are evaluated using all 2n subjects, and the FDR is controlled at a nominal level. Let FDR 1 be the desired FDR level, α the significance level for each marker, and 1−β the power for effect size μ d. FDR 1 is defined as follows:

$$ {\rm FDR}_{1} = \frac{{\pi _{0} \alpha }}{{\pi _{0} \alpha + \pi _{1} (1 - \beta)}}, $$
(2)

in which π 0=1−π 1. Because 1−β is itself determined by α, n, and μ d, α can be obtained given n, μ d, π 1, and FDR 1. The false non-discovery rate (FNR), the expected proportion of falsely accepted null hypotheses among all accepted null hypotheses, can be regarded as another error measure in this study. FNR 1, the FNR of this design, is calculated as follows:

$$ {\rm FNR}_{1} = \frac{{\pi _{1} \beta }}{{\pi _{0} (1 - \alpha) + \pi _{1} \beta }}. $$
(3)

In addition, the cost of genotyping, T 1, can be written with the arbitrary constant k as:

$$ T_{1} = k(n \times m). $$
(4)
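As a hedged illustration of these one-stage calculations, the sketch below (assuming NumPy and SciPy; the function names and example values are ours) solves Eq. 2 for α, using the fact that under the normal model above the one-sided power for effect size μ d is a standard normal tail probability, and then evaluates Eq. 3.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def one_sided_power(n, mu_d, alpha):
    """One-sided power for per-subject effect mu_d under the model X ~ N(n*mu, n)."""
    return norm.cdf(np.sqrt(n) * mu_d - norm.ppf(1 - alpha))

def one_stage_alpha(n, mu_d, pi1, fdr1):
    """Solve Eq. 2 for the per-marker significance level alpha."""
    pi0 = 1 - pi1
    def gap(alpha):
        power = one_sided_power(n, mu_d, alpha)
        return pi0 * alpha / (pi0 * alpha + pi1 * power) - fdr1
    return brentq(gap, 1e-12, 0.5)

def one_stage_fnr(n, mu_d, pi1, alpha):
    """False non-discovery rate of the one-stage design (Eq. 3)."""
    pi0 = 1 - pi1
    beta = 1 - one_sided_power(n, mu_d, alpha)
    return pi1 * beta / (pi0 * (1 - alpha) + pi1 * beta)

# Illustrative values only: n = 1,000 per group, mu_d = 0.14, pi1 = 0.01,
# FDR_1 = 0.05.  The genotyping cost is simply k * n * m (Eq. 4).
alpha = one_stage_alpha(1000, 0.14, 0.01, 0.05)
print(alpha, one_sided_power(1000, 0.14, alpha), one_stage_fnr(1000, 0.14, 0.01, alpha))
```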

Proposed two-stage procedure

In the first stage, all m markers are evaluated using n 1 subjects for each group, and m′ (<m) markers are selected by testing at a significance level of α 1. The proportion of selected markers is denoted by s′ (=m′/m). In the second stage, n 2 subjects for each group are added, giving a total of n (=n 1+n 2) subjects per group, and the m′ markers are tested at a significance level of α 2 using all 2n subjects. The markers that are significant in the second stage are identified as disease-associated markers. The proposed two-stage designs have three design parameters: (1) n 1, the sample size for each group in the first stage; (2) α 1, the significance level (i.e., the criterion for selecting markers) in the first stage; and (3) α 2, the significance level in the second stage. These parameters (n 1, α 1, α 2) are chosen so that the FDR is controlled at a predetermined nominal level.

Let (X 1, X 2) denote the test statistics of a marker in the first and second stages, with X 1∼N(n 1 μ, n 1) and X 2∼N(nμ, n). Because cases and controls are unrelated individuals, the test statistics have the Markov property, i.e., X 2−X 1 is independent of X 1. Therefore, the pair (X 1, X 2) has a bivariate normal distribution with mean (n 1 μ, nμ) and covariance matrix \(\Sigma = \left[ \begin{array}{cc} n_{1} & n_{1} \\ n_{1} & n \end{array} \right].\)
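A small Monte Carlo sketch (our own illustration, assuming NumPy; the per-subject N(μ, 1) increments mirror the stylized model above) can be used to check this joint distribution empirically:

```python
import numpy as np

# Each subject pair contributes an independent N(mu, 1) increment to the test
# statistic; X1 sums the first n1 increments and X2 sums all n = n1 + n2.
rng = np.random.default_rng(0)
n1, n2, mu = 400, 600, 0.14
increments = rng.normal(mu, 1.0, size=(10_000, n1 + n2))
x1 = increments[:, :n1].sum(axis=1)
x2 = increments.sum(axis=1)

print(x1.mean(), x2.mean())              # approximately n1*mu and n*mu
print(np.cov(x1, x2))                    # approximately [[n1, n1], [n1, n]]
print(np.corrcoef(x2 - x1, x1)[0, 1])    # approximately 0 (Markov property)
```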

In the first stage, the FDR is given by the following:

$$ {\rm FDR}_{2}{\prime } = \frac{{\pi _{0} \alpha _{1} }}{{\pi _{0} \alpha _{1} + \pi _{1} (1 - \beta _{1})}},$$
(5)

in which 1−β 1 denotes the power for the effect size μ d. The expected proportion of the significant markers in the first stage, E[s′], is given by the following:

$$ E[s{\prime } ] = \pi _{0} \alpha _{1} + \pi _{1} (1 - \beta _{1}). $$
(6)

Now consider the second stage. Let T i and Γ i denote the statistics and the rejection regions at the ith stage, respectively. The FDR for the two-stage approach is represented below:

$$ \begin{aligned} \text{FDR for the two-stage approach} &= \frac{\pi _{0} \Pr _{H = 0} [T_{1} \in \Gamma _{1} ]\Pr _{H = 0} [T_{2} \in \Gamma _{2} \mid T_{1} \in \Gamma _{1} ]}{\pi _{0} \Pr _{H = 0} [T_{1} \in \Gamma _{1} ]\Pr _{H = 0} [T_{2} \in \Gamma _{2} \mid T_{1} \in \Gamma _{1} ] + \pi _{1} \Pr _{H = 1} [T_{1} \in \Gamma _{1} ]\Pr _{H = 1} [T_{2} \in \Gamma _{2} \mid T_{1} \in \Gamma _{1} ]} \\ &= \frac{\pi _{0} \Pr _{H = 0} [T_{1} \in \Gamma _{1}, T_{2} \in \Gamma _{2} ]}{\pi _{0} \Pr _{H = 0} [T_{1} \in \Gamma _{1}, T_{2} \in \Gamma _{2} ] + \pi _{1} \Pr _{H = 1} [T_{1} \in \Gamma _{1}, T_{2} \in \Gamma _{2} ]}, \end{aligned} $$
(7)

in which Pr H=0 and Pr H=1 denote probabilities under the null and alternative hypotheses, respectively. Hence, the FDR can be obtained from two probabilities: the overall α error, i.e., the probability that a truly null marker is selected in both the first and second stages, and the overall power, i.e., the probability that a truly associated marker is selected in both stages. Let P μ=μ denote the probability that a marker with effect size μ is rejected in the first stage and also in the second stage; the overall α error and the overall power for effect size μ d are then P μ=0 and \(P_{\mu=\mu_{d}},\) respectively. Note that this overall power, \(P_{\mu=\mu_{d}},\) can be regarded as the expected proportion of rejected markers among all truly associated markers, because the truly associated markers are assumed to have the same effect size, μ d; we therefore refer to it as the expected power. Let Φ(.) and ϕ(.) denote the cumulative distribution function and the probability density function of a standard normal distribution, and let Z a denote the 100×ath percentile of a standard normal distribution. P μ=μ is given by the following:

$$ \begin{aligned} P_{\mu = \mu } &= \Pr [X_{2} > Z_{1 - \alpha _{2} }\sqrt{n},\; X_{1} > Z_{1 - \alpha _{1} }\sqrt{n_{1}}] \\ &= \int_{Z_{1 - \alpha _{1} }}^{\infty} \left[ 1 - \Phi\!\left( \frac{Z_{1 - \alpha _{2} }\sqrt{n} - z\sqrt{n_{1}} - n_{2}\mu}{\sqrt{n_{2}}} \right) \right] \phi\!\left(z - \mu\sqrt{n_{1}}\right) {\rm d}z. \end{aligned} $$
(8)
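The integral in Eq. 8 is straightforward to evaluate numerically. The sketch below (assuming NumPy and SciPy; the function name is ours, and the example plugs in the design parameters reported in Table 2 for n=1,000 and μ d=0.14, so the printed numbers are only indicative) computes P μ=0 and P μ=μ d.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def overall_rejection_prob(mu, n1, n2, alpha1, alpha2):
    """Probability that a marker with per-subject effect mu passes the first
    stage and is also rejected in the second stage (Eq. 8)."""
    n = n1 + n2
    z1, z2 = norm.ppf(1 - alpha1), norm.ppf(1 - alpha2)

    def integrand(z):
        # P[X2 > z2*sqrt(n) | X1 = z*sqrt(n1)] times the density of z = X1/sqrt(n1)
        cond = 1 - norm.cdf((z2 * np.sqrt(n) - z * np.sqrt(n1) - n2 * mu) / np.sqrt(n2))
        return cond * norm.pdf(z - mu * np.sqrt(n1))

    value, _ = quad(integrand, z1, np.inf)
    return value

# Indicative example using the Table 2 design parameters quoted in the text:
# overall alpha error (mu = 0) and overall power (mu = mu_d = 0.14).
print(overall_rejection_prob(0.0, 421, 579, 0.153, 0.00048))
print(overall_rejection_prob(0.14, 421, 579, 0.153, 0.00048))
```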

The FDR after the second stage can be written, using P μ=0 and \(P_{\mu=\mu_{d}}\), as follows:

$$ {\rm FDR}_{2} = \frac{{\pi _{0} P_{{\mu = 0}} }}{{\pi _{0} P_{{\mu = 0}} + \pi _{1} P_{{\mu = \mu _{d} }} }}. $$
(9)

FDR 2 represents the expected proportion of non-associated markers among the markers identified by this two-stage procedure. The FNR can also be written as follows:

$$ {\rm FNR}_{2} = \frac{\pi _{1} \beta _{1} + \pi _{1} \left(1 - P_{\mu = \mu _{d} }\right)}{\pi _{0} (1 - \alpha _{1}) + \pi _{1} \beta _{1} + \pi _{0} \left(1 - P_{\mu = 0}\right) + \pi _{1} \left(1 - P_{\mu = \mu _{d} }\right)}. $$
(10)

The expected cost of genotyping, T 2, is given by the following:

$$ \begin{aligned} E[T_{2} ] &= k\left(n_{1} \times m + n_{2} \times m \times E[s{\prime } ]\right) \\ &= k\left[\left(Z_{1 - \alpha _{1} } + Z_{1 - \beta _{1} }\right)^{2} \times m + \left\{ n\mu ^{2}_{d} - \left(Z_{1 - \alpha _{1} } + Z_{1 - \beta _{1} }\right)^{2} \right\} \times m \times \left\{ \pi _{0} \alpha _{1} + \pi _{1} (1 - \beta _{1})\right\} \right] \big/ \mu ^{2}_{d}. \end{aligned} $$
(11)

The expected cost ratio relative to the one-stage design, F, is written as follows:

$$ \begin{aligned} F &= \frac{{E[T_{2} ]}}{{T_{1} }} \\ &= \frac{{(Z_{{1 - \alpha _{1} }} + Z_{{1 - \beta _{1} }})^{2} + \{ n\mu _{d} ^{2} - (Z_{{1 - \alpha _{1} }} + Z_{{1 - \beta _{1} }})^{2} \} \{ \pi _{0} \alpha _{1} + \pi _{1} (1 - \beta _{1})\} }}{{n\mu _{d} ^{2} }}. \\ \end{aligned} $$
(12)
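For the expected cost, a short sketch of Eq. 12 (assuming NumPy and SciPy; the helper name and the illustrative inputs, which reuse the Table 2 example discussed in the Results, are ours) is:

```python
import numpy as np
from scipy.stats import norm

def expected_cost_ratio(n, mu_d, pi1, alpha1, beta1):
    """Expected two-stage genotyping cost relative to the one-stage cost (Eq. 12).

    Uses n1 = (Z_{1-alpha1} + Z_{1-beta1})^2 / mu_d^2, i.e., the first-stage
    sample size implied by alpha1 and the first-stage power 1 - beta1.
    """
    pi0 = 1.0 - pi1
    a = (norm.ppf(1 - alpha1) + norm.ppf(1 - beta1)) ** 2   # equals n1 * mu_d^2
    select = pi0 * alpha1 + pi1 * (1 - beta1)               # E[s'] from Eq. 6
    return (a + (n * mu_d ** 2 - a) * select) / (n * mu_d ** 2)

# Illustration with the Table 2 example quoted in the text (mu_d = 0.14,
# pi1 = 0.01, n = 1,000): beta1 is the first-stage type II error implied by
# n1 = 421 and alpha1 = 0.153; the result should be close to the reported
# cost ratio of about 0.51.
beta1 = 1 - norm.cdf(np.sqrt(421) * 0.14 - norm.ppf(1 - 0.153))
print(expected_cost_ratio(1000, 0.14, 0.01, 0.153, beta1))
```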

Estimating optimal parameters

The purpose of this study was to minimize the expected cost by optimizing the three unknown design parameters n 1, α 1, and α 2. Given n, π 1, μ d, and FDR 2, the optimal parameters are defined as those for which (1) the FDR is controlled at FDR 2, (2) the difference between the power of the two-stage design and that of the one-stage design is ≤1%, and (3) the expected cost is the minimum among all sets of parameters that satisfy the former two conditions.

Let the values of n, π 1, μ d, and FDR 2 be given. For a fixed value of n 1, α 1 and β 1 are calculated under a given FDR 2′. Using these values, the expected cost ratio can be obtained from Eq. 12, and P μ=μ is expressed as a function of α 2. The value of α 2 can then be estimated from Eq. 9 for the predetermined FDR 2, and the expected power, \(P_{\mu=\mu_{d}},\) is obtained at the same time. Every point on a grid of values of n 1 and FDR 2′ is searched. The design parameters (n 1, α 1, α 2), expected power, and expected cost are estimated at every grid point; the optimal parameters are then those that satisfy the above conditions.
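The following is a rough, self-contained Python sketch of this grid search (assuming NumPy and SciPy; the function names, grid resolutions, and bracketing intervals are our own choices, the search is deliberately coarse and slow, and the one-stage reference power is taken at the α implied by the same nominal FDR). With n=1,000, μ d=0.14, π 1=0.01, and FDR 2=0.05 it should land near the corresponding Table 2 entry, up to grid resolution.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad
from scipy.optimize import brentq

def stage1_power(n1, mu_d, alpha1):
    """One-sided power of the first-stage test."""
    return norm.cdf(np.sqrt(n1) * mu_d - norm.ppf(1 - alpha1))

def alpha1_from_fdr(n1, mu_d, pi1, fdr_prime):
    """Solve Eq. 5 for alpha1, with 1 - beta1 tied to alpha1 through n1."""
    pi0 = 1 - pi1
    gap = lambda a: pi0 * a / (pi0 * a + pi1 * stage1_power(n1, mu_d, a)) - fdr_prime
    return brentq(gap, 1e-10, 0.999)

def joint_rejection(mu, n1, n2, alpha1, alpha2):
    """Eq. 8: probability of rejection in both stages for effect size mu."""
    n = n1 + n2
    z1, z2 = norm.ppf(1 - alpha1), norm.ppf(1 - alpha2)
    g = lambda z: (1 - norm.cdf((z2 * np.sqrt(n) - z * np.sqrt(n1) - n2 * mu)
                                / np.sqrt(n2))) * norm.pdf(z - mu * np.sqrt(n1))
    return quad(g, z1, np.inf)[0]

def optimize_design(n, mu_d, pi1, fdr2, max_power_loss=0.01):
    """Grid search over (n1, FDR_2') for the minimum expected cost ratio."""
    pi0 = 1 - pi1
    # One-stage reference power at the alpha implied by the same nominal FDR.
    alpha_one = brentq(
        lambda a: pi0 * a / (pi0 * a + pi1 * stage1_power(n, mu_d, a)) - fdr2,
        1e-10, 0.999)
    power_one = stage1_power(n, mu_d, alpha_one)

    best = None
    for n1 in range(100, n, 50):                       # coarse grid over n1
        n2 = n - n1
        for fdr_prime in np.arange(0.50, 0.99, 0.04):  # coarse grid over FDR_2'
            try:
                a1 = alpha1_from_fdr(n1, mu_d, pi1, fdr_prime)
                b1 = 1 - stage1_power(n1, mu_d, a1)
                # Solve Eq. 9 for alpha2 so that the final FDR equals fdr2.
                def fdr_gap(a2):
                    p_null = joint_rejection(0.0, n1, n2, a1, a2)
                    p_alt = joint_rejection(mu_d, n1, n2, a1, a2)
                    return pi0 * p_null / (pi0 * p_null + pi1 * p_alt) - fdr2
                a2 = brentq(fdr_gap, 1e-8, a1)
            except (ValueError, ZeroDivisionError):
                continue
            power = joint_rejection(mu_d, n1, n2, a1, a2)
            if power_one - power > max_power_loss:     # power condition (<= 1% loss)
                continue
            select = pi0 * a1 + pi1 * (1 - b1)         # E[s'] (Eq. 6)
            cost = (n1 + n2 * select) / n              # cost ratio, equivalent to Eq. 12
            if best is None or cost < best[0]:
                best = (cost, n1, a1, a2, power)
    return best

print(optimize_design(1000, 0.14, 0.01, 0.05))
```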

Evaluating the performance for mis-specification of a prior probability

To perform a study using the optimal parameters, investigators have to specify a prior probability, π 1. The optimal parameters, the expected power, and the expected cost depend on π 1, which must be determined subjectively by the investigators. Hence, let π 1* denote the subjective prior probability, while π 1 is the true one. Controlling the FDR at the nominal level and achieving the expected power and expected cost require π 1* to equal π 1. Therefore, we evaluated the performance under mis-specification of the prior probability by estimating the FDR, the expected power, and the expected cost ratio. These estimates were obtained from Eqs. 8, 9, and 12, with the optimal parameters corresponding to π 1* evaluated under the true π 1.
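As a rough illustration of this evaluation (assuming NumPy and SciPy; the joint_rejection helper repeats the Eq. 8 sketch above, and the specific numbers reuse the Table 2 entry discussed in the Results, so the output is only indicative), the realized FDR under a mis-specified prior can be computed as follows:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def joint_rejection(mu, n1, n2, alpha1, alpha2):
    """Eq. 8: probability of passing both stages for per-subject effect mu."""
    n = n1 + n2
    z1, z2 = norm.ppf(1 - alpha1), norm.ppf(1 - alpha2)
    g = lambda z: (1 - norm.cdf((z2 * np.sqrt(n) - z * np.sqrt(n1) - n2 * mu)
                                / np.sqrt(n2))) * norm.pdf(z - mu * np.sqrt(n1))
    return quad(g, z1, np.inf)[0]

def realized_fdr(true_pi1, mu_d, n1, n2, alpha1, alpha2):
    """FDR actually attained (Eq. 9) when the design parameters were chosen
    for a subjective prior pi1* but the true prior is true_pi1."""
    p_null = joint_rejection(0.0, n1, n2, alpha1, alpha2)
    p_alt = joint_rejection(mu_d, n1, n2, alpha1, alpha2)
    return (1 - true_pi1) * p_null / ((1 - true_pi1) * p_null + true_pi1 * p_alt)

# Indicative example: design parameters from the Table 2 entry chosen for
# pi1* = 0.01 (n1 = 421, alpha1 = 0.153, alpha2 = 0.00048, mu_d = 0.14).
# When the true pi1 equals 0.01, the FDR should come back near the nominal
# 0.05; when the true pi1 is 0.005 (pi1* overestimates it), the FDR roughly
# doubles, consistent with the mis-specification results reported below.
for true_pi1 in (0.01, 0.005):
    print(true_pi1, realized_fdr(true_pi1, 0.14, 421, 579, 0.153, 0.00048))
```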

Results

Optimal parameters

The optimal parameters were estimated for combinations of n=200, 500, and 1,000 for each group and π 1=0.001, 0.002, 0.005, 0.01, 0.02, 0.025, 0.05, and 0.1. In addition, the target effect sizes, μ d, were evaluated from 0.1 to 0.3 at intervals of 0.02. Table 1 shows the ORs that correspond to μ d (=0.1–0.3) and p 0 (=0.1–0.4). The FDR levels evaluated were FDR 2=0.05 and FDR 2=0.1. The optimal parameters, expected power, expected cost ratio, and FNR 2 were calculated. Note that the proposed optimal parameters and these performance measures depend on π 1 rather than on the number of markers, because the FDR criterion is used to address the multiple-comparison problem.

Table 1 OR (θ) corresponding to effect size (μ d) and genotype frequency of interest in the controls (p 0)

Examples of the optimal parameters when n=1,000 are shown in Table 2. Suppose that the genotype frequency of interest in the controls is 0.3, the target OR is 1.5, and 1% of all markers used in the study are assumed to be associated with the disease. This OR corresponds to μ d=0.14 (Table 1). According to Table 2, in the first stage each marker is tested at a significance level of 0.153 using 421 of the 1,000 subjects, and in the second stage the markers selected from the first stage are tested at a significance level of 0.00048 using all 1,000 subjects. Table 3 gives the expected proportion of selected markers at each stage, the expected power, and the expected cost ratio obtained with the optimal parameters, as well as the power of the one-stage design. In the above situation, the expected power was approximately 86%, the expected cost was 51% of that of the one-stage design, and the expected proportion of markers finally identified as associated with the disease was 0.9%. In Table 3 the number of markers finally selected in the study increased with π 1. In contrast, the number of markers selected in the first stage did not have a monotonic relationship with π 1, because of the restriction on the expected cost. These results show that the proposed two-stage design has power similar to that of the one-stage design and that the expected costs are reduced by 40–60%.

Table 2 Examples of the two-stage designs with the optimal parameters in which n=1,000 and FDR 2=0.05 (μ d target effect size, π 1 a prior probability, n 1 sample size in the first stage, α 1 significance level in the first stage, α 2 significance level in the second stage)
Table 3 The expected proportion of selected markers, expected power, expected cost ratio, and the power of the one-stage design in which n=1,000 and FDR 2=0.05 (μ d target effect size, π 1 a prior probability, s′ proportion of selected markers in the first stage, s proportion of selected markers in the second stage)

When n and FDR 2 are given, the values of the optimal parameters, the expected power, and the expected cost ratio vary with μ d and π 1. Figure 1 illustrates the relationships between the prior probability and the optimal parameters, expected power, expected cost ratio, and FNR 2. n 1/n was about 0.4 and α 1 ranged from 0.1 to 0.2 for any prior probability (Fig. 1a); these two parameters influence one another to minimize the expected cost. In contrast, because α 2 is determined by the predetermined FDR 2 rather than by the expected cost, α 2 increased monotonically with the prior probability (Fig. 1b). In Fig. 1a the expected power naturally increased with the prior probability, and the expected cost ratio also increased monotonically with the prior probability but converged to approximately 0.55, because the expected cost ratio depends on n 1 and α 1 as well as on the prior probability. FNR 2 also increased with the prior probability (Fig. 1b), but even so, FNR 2 remained <5%, which is considered acceptably low. These tendencies in Fig. 1 were the same regardless of the effect size and sample size that we set.

Fig. 1

Characteristics of the optimal two-stage design in which n=1,000, FDR 2=0.05, and μ d=0.14. a The relationships between the prior probability and the significance level in the first stage (alpha 1), the proportion of the first-stage sample size to the total sample size (n 1/n), the expected power (power), and the expected cost ratio (cost). b The relationships between the prior probability and the significance level in the second stage (alpha 2) and the false non-discovery rate (FNR 2)

Mis-specification of a prior probability

As mentioned earlier, the results are assured only when the subjective prior probability, π 1*, equals the true prior probability, π 1. Therefore, we evaluated the performance under mis-specification of the prior probability. Figure 2 illustrates the results when the optimal parameters for π 1*=0.001, 0.005, 0.01, 0.05, and 0.1 were used and the true π 1 was 0.01 (Fig. 2a) or 0.05 (Fig. 2b). These results are for n=1,000, μ d=0.14, and FDR 2=0.05. The vertical and horizontal reference lines in each panel indicate π 1*=π 1 and FDR=0.05, respectively. The optimal parameters for lower π 1* lead to a loss of power, because the expected power depends only on the optimal parameters. Although the expected cost ratio and the FDR are related to both π 1 and π 1*, the expected cost depended largely on the optimal parameters for π 1*, regardless of π 1. The main difference between the two panels is the FDR, which was affected by the degree of mis-specification. When π 1* was not greater than π 1, the FDR was conservatively biased, i.e., controlled at the nominal level. When π 1* overestimated π 1, the FDR was inflated and increased nearly linearly with π 1*.

Fig. 2

Influence of a mis-specified prior probability on the FDR, expected power, and expected cost ratio, in which n=1,000, FDR 2=0.05, and μ d=0.14. a, b π 1=0.01 and π 1=0.05, respectively

Discussion

We proposed optimal sample sizes and marker selection criteria for each stage of a two-stage design using the FDR and then evaluated their performance. Unlike methods in which the FWER is controlled, the FDR approach has the advantage of being able to reflect prior information. Therefore, the proposed method allowed us to optimize the designs by utilizing π 1, the proportion of truly associated markers among all markers used in the study. This flexible strategy allows each study to be conducted effectively.

The cost was defined as proportional to the number of genotypings, that is, sample size × number of markers. Regardless of the choice of the given parameters μ d and π 1, at least a 40% reduction in cost was demonstrated. These results show that the costs, in both time and personnel associated with genotyping, can be reduced. In this study we assumed the total sample size to be fixed. Future improvements in genotyping technology will reduce the cost of genotyping, and the substantial study costs will then depend only on the sample size. To reduce sample size, some investigators (Sobell et al. 1993; König et al. 2001, 2003) have proposed designs that are similar to the group sequential methods applied to clinical trials. Although these procedures are considered too complicated for practical application, it would be a worthwhile challenge to improve efficiency by considering the required sample size.

We examined the problem of multiple testing in association studies. Previously proposed two-stage designs, for example, that of Satagopan and Elston (2003), control the overall α error of the two-stage procedure. Wen et al. (2006) presented the overall α error and the FDR when certain fixed design parameters were used, in which only the second-stage significance level was corrected by Bonferroni's method. Recently, controlling the FDR has been considered more relevant for eliminating false positives, especially in exploratory studies, and many measures, estimation methods, and properties have been examined (Benjamini and Hochberg 2000; Efron et al. 2001; Storey and Tibshirani 2001, 2003; Efron and Tibshirani 2002; Genovese and Wasserman 2002; Storey 2002, 2003; Tsai et al. 2003; Black 2004; Fernando et al. 2004; Pounds and Cheng 2004; Storey et al. 2004). In addition to strategies that control either the FWER or the FDR, a two-stage strategy using both FWER and FDR criteria has been proposed. Rosenberg et al. (2005) consider association studies with several SNPs in each of a number of candidate genes; in the first stage, the association between each gene and disease is summarized by a single gene-wise P value that controls the FWER within the gene, and in the second stage these gene-wise P values are adjusted to control the FDR in order to detect the genes associated with the disease. Accordingly, the range of application of the FDR is expected to broaden.

To control the FDR in our proposed designs, however, π 1 must be decided before the study is performed. Because π 1 affects the performance of the design, it is important to examine its impact on efficiency. The use of an underestimated π 1*, while leading to a loss of expected power, ensured that the FDR was controlled at the nominal level. On the other hand, an overestimated π 1* led to an inflated FDR. However, the expected power increased with π 1*, and the FDR was kept at less than double the nominal level if π 1*≤2π 1. This tendency would be acceptable for exploratory association studies using a large number of markers, because the purpose of such studies is generally to select as many true positives as possible while accepting a few false positives. It will also be useful to examine the degree of the loss of power or the inflation of the FDR or cost over some range of prior probabilities (e.g., Fig. 2). In particular, when π 1 was expected to be very low, the FDR increased sharply with π 1*; for this reason, we consider the proposed procedure more appropriate when π 1 is at least 0.01. Here, π 1 was determined subjectively by the investigators; other methods have been proposed that estimate π 1 from available data, and Hsueh et al. (2003) compared these latter methods. The optimal parameters and the performance of the designs will be affected by the method used to estimate π 1. Therefore, extending the optimization to include a stage for estimating π 1 might be an issue for the future.

The FNR, which measures the truly associated markers left behind, should also be considered. Although the FNR was always low in this study because of the low prior probability, it is possible to define the error measure using the FNR together with the FDR and to optimize the design parameters so that both are controlled.

Finally, we assumed that the markers are not in LD with each other. This is considered reasonable for exploratory studies using spaced markers, in which use of the FDR is appropriate. With several recent advances, however, whole-genome association studies using fixed arrays of high-density SNPs are now feasible. Because there would be LD between these SNPs, the proposed procedure cannot guarantee that the FDR is controlled; in this case, making an informed SNP selection could be important for efficiency. Some work has been done on tag SNP selection, which could also improve efficiency by utilizing LD information (Zhang et al. 2002, 2004; Stram et al. 2003; Stram 2004; Thomas et al. 2004). Incorporating these approaches into the optimization of designs is one of the challenges for the future.