Introduction

Restless legs syndrome (RLS)1 is a sensory-motor disorder characterized by an urge to move and unpleasant sensations in the lower limbs at rest. In Caucasians, RLS is present in up to 10% of the population.2, 3 Pregnancy, uremia, celiac disease, and iron deficiency are considered to be risk factors.4 RLS also has a considerable heritability and is associated with multiple genetic risk factors. Genome-wide association studies5, 6, 7, 8, 9, 10 identified variants in six loci encompassing the genes MEIS1, BTBD9, MAP2K5/SKOR1, PTPRD, and TOX3/non-coding RNA.

To identify further genetic risk factors for RLS, we addressed the apparent relation of RLS susceptibility to iron metabolism and iron availability,11 which has been explained by cerebral iron being crucial in the etiology of RLS.12 We therefore tested SNPs at the loci of a candidate set of genes known to be involved in iron metabolism for association with RLS. Conversely, we asked whether RLS genes are involved in iron metabolism, because the RLS-associated SNP rs3923809 at the BTBD9 locus has previously been reported to be associated with the serum level of ferritin in a study on RLS subjects and their relatives.6

We also performed a general power consideration (ie, sample size calculation) concerning genetic effects on an end-phenotype (eg, a disease) that depend entirely on transmission via an intermediate phenotype (eg, a metabolic parameter).

Materials and methods

Informed consent, written in the respective language, was obtained from each participant. The work has been approved by the institutional review boards of the contributing centers. The primary review boards were located in Munich, Bayerische Ärztekammer und Technische Universität München.

RLS samples and controls

RLS cases were of German or Austrian origin (954 in the discovery step, 735 in replication step 1, and 736 in replication step 2). Diagnosis was based on the diagnostic criteria of the International RLS Study Group13 as assessed in a personal interview conducted by an RLS expert. Patients with probable secondary RLS due to uremia, dialysis, or iron-deficiency anemia were excluded. The presence of secondary RLS was determined by clinical interview, physical and neurological examination, blood chemistry, and nerve conduction studies whenever deemed clinically necessary.

Controls were recruited from the KORA S3/F3 and F4 surveys of the Cooperative Health Research in the South-East German region of Augsburg. KORA procedures and samples have been described before.14 For the discovery phase, we included 1814 subjects. For the replication steps 1 and 2, we included 736 and 735 subjects, respectively.

Iron-related serum parameters in the KORA cohorts

In the KORA surveys F3 (n=1638) and KORA F4 (n=1809), iron-related serum parameters (ferritin, transferrin, and soluble transferrin receptor) were determined. Serum parameters were measured by standard laboratory methods, that is, by electrochemiluminescence immunoassay (Roche Diagnostics, Mannheim, Germany) for ferritin, Tina-quant immunoturbidometry (Roche) for soluble transferrin receptor, colorimetry (Roche) for iron, and immunonephelometry (Siemens Healthcare Diagnostics, Eschborn, Germany) for transferrin. For more details, see Oexle et al.15

Genotyping and statistical analysis

Genome-wide genotyping was performed on Affymetrix Human SNP Arrays, using 5.0 arrays for RLS cases and 500K or 6.0 arrays for KORA subjects, with all RLS controls being genotyped on 6.0 arrays. Calling, quality control, and imputing procedures have been described previously.10, 15 To identify and correct for population stratification, multidimensional scaling (MDS; identifying 18 outliers) and Genomic Control analyses were performed. For association testing, logistic regression as implemented in PLINK 1.07 (http://pngu.mgh.harvard.edu/~purcell/plink/)16 was applied after standard filtering, that is, a 98% call rate for both SNPs and individuals, a Hardy-Weinberg equilibrium (HWE) test P-value >0.00001, and a minor allele frequency (MAF) >5% (except for the hemochromatosis-causing SNP rs1800562 of the HFE gene, which was included in the RLS discovery sample in spite of a MAF<5%).
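
The three filtering thresholds can be illustrated with a minimal Python sketch. It is not the PLINK implementation: the function names and the genotype-count input format are assumptions made for illustration, and the HWE check shown is the simple 1-df chi-square test, which may differ from the test applied by the software actually used.

```python
import math


def minor_allele_frequency(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Minor allele frequency from the genotype counts AA, AB, BB."""
    n = n_aa + n_ab + n_bb
    freq_a = (2 * n_aa + n_ab) / (2 * n)
    return min(freq_a, 1.0 - freq_a)


def hwe_chi2_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P-value of the 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_snp_qc(n_aa: int, n_ab: int, n_bb: int, n_missing: int,
                  min_call_rate: float = 0.98,
                  min_hwe_p: float = 1e-5,
                  min_maf: float = 0.05) -> bool:
    """Apply call-rate, HWE, and MAF thresholds to one SNP (illustrative)."""
    n_called = n_aa + n_ab + n_bb
    if n_called == 0:
        return False
    call_rate = n_called / (n_called + n_missing)
    return (call_rate >= min_call_rate
            and hwe_chi2_p(n_aa, n_ab, n_bb) > min_hwe_p
            and minor_allele_frequency(n_aa, n_ab, n_bb) > min_maf)
```

A SNP such as rs1800562, with a MAF below 5%, would fail this check and would have to be exempted explicitly, as was done in the discovery sample.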

After filtering, there remained 301 495 SNPs in 922 cases and 1526 controls for the discovery step of the RLS case–control analysis. In this step, we applied age, sex, and the first four axes of variation resulting from MDS analysis as covariates. We performed a candidate-based analysis focusing on SNPs within 4 Mb intervals (2 Mb in each direction) surrounding each of 111 genes (see Supplementary Table 1) known to be involved in iron metabolism15, 17, 18 or in neurodegeneration with brain iron accumulation (NBIA), NBIA1–NBIA3 (PANK2, PLA2G6, and FTL). From that set of SNPs, we selected those with a nominal P-value <$5 \times 10^{-4}$ for replication. SNPs of known RLS genes were ignored. Genotyping for replication (24 SNPs including technical replicates) was performed on the MassARRAY system using MALDI-TOF mass spectrometry with the iPLEX Gold chemistry (Sequenom Inc., San Diego, CA, USA). Automated genotype calling was done with SpectroTYPER 3.4. Genotype clustering was visually checked by an experienced evaluator. Except for one (rs8029116), all SNPs in replication step 1 were genotyped successfully. Replication step 2, intended to provide finemapping of genes in the vicinity of rs2576036 that had appeared to replicate in replication step 1, was performed in the same way and comprised 33 SNPs, of which 32 were genotyped successfully. SNPs for finemapping were selected using the tag SNP selection algorithm ‘Tagger’ as implemented in Haploview 4.1 (Broad Institute, Cambridge, MA, USA). In both replication steps, age and sex were used as covariates.
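
The candidate-window selection can be sketched as follows. This is a simplified illustration, not the pipeline used in the study: it assumes one reference position per gene (so that the window is exactly 4 Mb wide), the record types and function names are hypothetical, the P-values in the example are invented, and the exclusion of SNPs at known RLS loci is omitted. Only the gene and SNP positions in the example are taken from the Results section.

```python
from typing import Iterable, List, NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    pos_bp: int       # single reference position per gene (assumption)


class Snp(NamedTuple):
    rsid: str
    chrom: str
    pos_bp: int
    p_value: float    # nominal discovery P-value


FLANK_BP = 2_000_000  # 2 Mb in each direction, i.e. a 4 Mb interval
P_CUTOFF = 5e-4       # nominal cut-off for replication


def in_window(snp: Snp, gene: Gene, flank_bp: int = FLANK_BP) -> bool:
    """True if the SNP lies within the interval surrounding the gene."""
    return snp.chrom == gene.chrom and abs(snp.pos_bp - gene.pos_bp) <= flank_bp


def select_for_replication(snps: Iterable[Snp], genes: Iterable[Gene],
                           p_cutoff: float = P_CUTOFF) -> List[Snp]:
    """Keep SNPs below the cut-off that fall into any candidate window."""
    gene_list = list(genes)
    return [s for s in snps
            if s.p_value < p_cutoff and any(in_window(s, g) for g in gene_list)]


genes = [Gene("SMAD2", "18", 43_650_000), Gene("SMAD7", "18", 44_710_000)]
snps = [
    Snp("rs2576036", "18", 42_850_000, 1e-4),            # P-value is hypothetical
    Snp("rs_hypothetical_far", "18", 50_000_000, 1e-5),  # outside both windows
    Snp("rs_hypothetical_weak", "18", 43_000_000, 0.2),  # above the cut-off
]
selected = select_for_replication(snps, genes)
print([s.rsid for s in selected])  # prints ['rs2576036']
```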

For association analysis of top RLS-associated SNPs5, 7, 8, 9, 10 with iron-related serum parameters, we selected rs12469063 and rs2300478 in MEIS1, rs9357271 in BTBD9, rs1975197 in PTPRD, rs12593813 in MAP2K5/SKOR1, rs6747972 in an intergenic region on chromosome 2p14, and rs3104767 in TOX3/BC034767. Moreover, we included rs3923809 in BTBD9, which has been reported to be associated with ferritin,6 and rs2576036 in KATNAL2, which we initially suspected to be associated with RLS in this study (see Results section). Imputed SNPs were used if selected SNPs were not genotyped directly. Imputation was performed with Impute19 for KORA F3 and F4 separately using HapMap 2 as reference. The genetic association of selected SNPs was tested with a linear regression on log10-transformed iron traits with age and sex as covariates. The results of the two cohorts F3 and F4 were combined by meta-analysis using a fixed-effect model analogous to Oexle et al.15 Calculations were done using PLINK16 v1.07 and METAL (http://www.sph.umich.edu/csg/abecasis/Metal/index.html).
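
One common way to combine two cohort-specific regression estimates under a fixed-effect model is inverse-variance weighting. The sketch below illustrates that scheme only; whether this or a sample-size-weighted scheme was applied is not stated here, and the function name and input format are assumptions.

```python
import math
from typing import List, Tuple


def fixed_effect_meta(estimates: List[Tuple[float, float]]) -> Tuple[float, float, float]:
    """Inverse-variance-weighted fixed-effect combination.

    estimates: one (beta, standard_error) pair per cohort.
    Returns the pooled beta, its standard error, and a two-sided P-value
    based on the normal approximation.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, p_value
```

With two cohorts of similar precision, the pooled standard error shrinks by roughly a factor of the square root of 2 relative to either cohort alone.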

Results

For the present investigation, we used the same discovery sample as in Winkelmann et al,10 that is, 954 RLS cases and 1814 controls (before filtering, see above), but relaxed the cut-off level for replication from $1 \times 10^{-4}$ to $5 \times 10^{-4}$ and focused on a set of genes that have been related to iron metabolism or to NBIA. Besides the three NBIA genes (see OMIM database), this set comprised 35 genes listed by the HealthIron17 consortium and another 72 genes that have been discussed in recent reviews on iron metabolism.15, 18 The complete list is given in Supplementary Table 1. Of the SNPs within the 4 Mb-intervals surrounding these genes by 2 Mb in both directions, 18 had P-values below the cut-off level (ignoring the SNPs in high LD with a SNP in one of the already known RLS genes). The set of top hits did not contain rs1800562 (P>0.1). This SNP was specifically included in the discovery step in spite of its MAF being <5% because, of all known genetic polymorphisms, it explains the largest fraction of the variance of the body iron storage indicator ferritin.15, 20 (In the KORA cohorts, it explained 0.5% (KORA F3) to 0.8% (KORA F4) of the variance of the age- and sex-adjusted log10(ferritin) values.)

Including technical replicates, 24 SNPs were then tested for replication in a sample of 736 RLS cases and 736 controls (Supplementary Table 2). Genotyping of one SNP failed (rs8029116 on chr15: 72 610 147 bp) but a nearby replicate (rs11072496) was genotyped successfully. For rs2576036, an intronic SNP of KATNAL2 on chromosome 18q21.1 (chr18: 42.85 Mb), a significant association with the RLS phenotype was detected in the first replication step (P=0.00085, logistic regression using age and sex as covariates).

To further evaluate the possible effect of rs2576036 on expression, we checked an in-house whole blood transcriptome database for association with transcript levels of neighboring genes. Neither the expression of KATNAL2 nor that of any other transcript was significantly associated with rs2576036. The selection of rs2576036 for this study resulted from its being located within the intervals surrounding the iron-related candidate genes SMAD2 at chr18: 43.65 Mb and SMAD7 at chr18: 44.71 Mb. However, neither the expression of SMAD2 nor the expression of SMAD7 was associated with rs2576036.

To confirm and to finemap the seeming association of the rs2576036 locus with RLS, we ran a second replication analysis on 736 German RLS cases and 735 German controls, specifically addressing KATNAL2, its neighbor PIAS2, and CORL2 (SKOR2, FUSSEL18). CORL2 was included because it is located in the same region but was not represented in the expression database. For this replication step 2, 33 SNPs were selected, of which 32 were genotyped successfully. None of these SNPs resulted in a significant association signal. SNP rs2576036, which was significant in replication step 1, now showed a P-value of 0.78. The joint analysis for rs2576036 of steps 1 and 2 yielded a P-value of 0.044, which, after Bonferroni correction for 18 loci in replication step 1, was also not significant.
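
The Bonferroni arithmetic behind this conclusion can be checked with a few lines of Python. The family-wise significance level of 0.05 is an assumption (it is not stated explicitly in the text), and the function name is illustrative.

```python
def bonferroni_adjust(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted P-value, capped at 1."""
    return min(1.0, p_value * n_tests)


N_LOCI = 18   # loci carried into replication step 1
ALPHA = 0.05  # assumed family-wise significance level

results = {}
for label, p in [("replication step 1", 0.00085),
                 ("joint analysis of steps 1 and 2", 0.044)]:
    p_adj = bonferroni_adjust(p, N_LOCI)
    results[label] = p_adj
    print(f"{label}: adjusted P = {p_adj:.4f}, significant = {p_adj < ALPHA}")
```

Under these assumptions the step 1 result remains below 0.05 after correction (about 0.015), whereas the joint result does not (about 0.79).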

Having tested whether iron-related genes influence the genesis of RLS, we then asked whether RLS-associated genes influence iron-related parameters. We selected seven top hits from the previously reported RLS loci as well as rs3923809 in BTBD9, which has been reported to be associated with ferritin,6 and rs2576036 in KATNAL2, which we initially suspected to be associated with RLS (see Materials and Methods section). In a set of altogether 3447 KORA individuals, none of these SNPs were associated with serum iron or any of the iron-related parameters in serum (ferritin, transferrin, transferrin saturation, and soluble transferrin receptor; see Supplementary Table 3 for association results). In view of a recent report of Catoire et al,21 we further tested whether the risk haplotype of the RLS gene MEIS1 (G alleles of rs12469063 and rs2300478) is associated with any serum iron parameter. This test also gave a negative result with no P-value being smaller than 0.45 (KORA F3)/0.27 (KORA F4).

Discussion

Three possible causes may contribute to the failure of our candidate approach to detect RLS-associated genes among a set of iron-related genes. First, the association between serum iron parameters and RLS may be weaker than usually assumed. Second, our study may be biased. Third, our study may lack power because of a dilution of the genetic effect by the transmission via an intermediate trait. In the following, we discuss all three. The third is presented in terms of a general power analysis.

(1) Iron deficiency has been considered to have a causal role in RLS ever since the first modern description of RLS in the middle of the last century.12 Iron substitution is a common therapeutic approach to RLS. Several association studies on RLS and serum iron parameters have been performed. In a retrospective study on 18 cases and 18 matched controls, O’Keeffe et al11 described a significant association with serum ferritin, an indicator of the level of body iron storage. Other retrospective studies with sample sizes between 27 and 302 RLS patients identified associations of serum ferritin with RLS severity and/or the need for therapeutic augmentation.22, 23, 24, 25 Recently, low serum ferritin was described as a significant predictor of RLS in 301 hospital patients older than 50 years, of whom 55 had RLS.26 On the other hand, cross-sectional studies on 365, 701, and 714 individuals from German,2 Tyrolean,3 and Korean27 population cohorts with 36, 74, and 59 RLS cases, respectively, did not show an association with serum ferritin. The same was true for most other serum iron parameters except for the soluble transferrin receptor in the study that included 74 RLS patients. Although the results of these well-designed studies do not exclude the possibility that iron, especially cerebral iron,12 is involved in the pathophysiology of RLS (at least in a subgroup of patients), the association between peripheral iron parameters and RLS may be weaker than assumed, thus impeding the power of our approach.

(2) The set of iron-related candidate genes that we selected from the literature is biased by the current state of knowledge. It cannot be excluded that future insights into iron physiology will identify genes that have a stronger effect on the pathogenesis of RLS. Moreover, in the discovery step of the association analysis we only considered polymorphisms with MAFs >5% (except for the HFE missense mutation C282Y). This filtering is reasonable in association studies because, for small values of the MAF, the necessary sample size of a study is inversely proportional to the MAF, so that power drops as the MAF decreases (see Appendix). However, it is possible that rare variants of iron-related genes with MAF<5% but strong effect contribute to the genesis of RLS. Detection of such variants will be difficult and, besides next-generation sequencing, will necessitate specific study designs.28 A further possible bias of our study resides in the fact that our sampling scheme for the RLS GWAS (on which we based the discovery step of this study) excluded cases that had anemia because of iron deficiency. Although this exclusion criterion only affected cases with severe iron deficiency (which had already caused anemia), our study would possibly have been more powerful without this criterion.

(3) Dilution of the genetic effect is a third possible reason why none of the iron-related genes was found to be associated with RLS. Consider the constellation delineated in Figure 1a, where the influence of a gene on an end-phenotype (eg, disease) depends entirely on mediation by an intermediate trait (eg, serum parameter), both of which are also subject to various other genetic and non-genetic influences. As one can easily show for small effect sizes, this constellation implies that the necessary sample size $n_{xz}$ to detect an association between gene ($x$) and end-phenotype ($z$) is proportional to the product $n_{xy}n_{yz}$ of the sample sizes necessary to detect the associations between gene and intermediate trait ($y$) and between intermediate trait and end-phenotype. Assume that the intermediate trait $y$ in an individual $i$ is influenced by a genetic effect according to $y_i = a_{xy}x_i + \varepsilon_{xy,i}$, where $x_i \in \{0,1,2\}$ indicates the number of effect alleles, $a_{xy}$ is the effect parameter (assumed to be small, $a_{xy} \ll 1$), and $\varepsilon_{xy,i}$ is a noise parameter with standard normal distribution $N(0,1)$ that represents a variety of other influences. For simplicity, $y$ is chosen so as to have zero mean. As the effect size $a_{xy}$ is small, the variance $\sigma_y^2$ is close to 1 and the necessary sample size $n_{xy}$ to detect the genetic influence in a linear regression analysis is proportional to $1/a_{xy}^2$ (see equation (A2) in the Appendix). Second, assume that the influence of $y$ on the occurrence probability $P(z=1|y)$ of the end-phenotype follows a logistic model, $\mathrm{logit}(P(z|y)) = b_0 + b_{yz}y$, where $b_0$ and $b_{yz} \ll b_0$ are again constants. For a test to successfully detect the association between $y$ and $z$, a sample size of $n_{yz} \propto 1/b_{yz}^2$ is required (see equation (A3) in the Appendix).
Now replace the intermediate trait $y$ by its constituents, that is, $\mathrm{logit}(P(z|x,\varepsilon)) = b_0 + b_{yz}(a_{xy}x + \varepsilon)$, which according to equation (A4) in the Appendix results in $n_{xz} \propto 1/(a_{xy}^2 b_{yz}^2)$, yielding the required proportionality $n_{xz} \propto n_{xy}n_{yz}$ and indicating the dilution phenomenon suggested above.
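
As a worked illustration of this scaling, consider two hypothetical effect sizes chosen only for arithmetic convenience (they are not estimates from this study). Proportionality constants are omitted, so only the relative magnitudes are meaningful.

```latex
% Hypothetical values: a_{xy} = 0.1 and b_{yz} = 0.2
\begin{align*}
n_{xy} &\propto \frac{1}{a_{xy}^{2}} = \frac{1}{0.1^{2}} = 100, \\
n_{yz} &\propto \frac{1}{b_{yz}^{2}} = \frac{1}{0.2^{2}} = 25, \\
n_{xz} &\propto \frac{1}{a_{xy}^{2}\, b_{yz}^{2}} = \frac{1}{(0.1 \cdot 0.2)^{2}} = 2500 = 100 \times 25 .
\end{align*}
```

The sample-size scale for the gene to end-phenotype association is thus the product of the two intermediate scales rather than their sum.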

Figure 1

(a) If transmitted by an intermediate trait, the effect of a gene may be diluted by other genetic or environmental influences, which thus impair the power of an association study. (b) If the trait is not truly intermediate and a substantial part of the correlation with the disease results from the pleiotropic effects of the gene, the power of an association study is not impaired in the same way.

It has to be considered, however, that the assessment of a candidate gene does not demand the same level of Bonferroni correction as a genome-wide association analysis. Still, this does not entirely compensate for the dilution phenomenon. With $c_{xz}^2 = a_{xy}^2 b_{yz}^2$, equations (A2), (A3), and (A4) of the Appendix yield

$$n_{xz} \approx n_{xy}\, n_{yz}\, \frac{\left(Z_\beta + Z_{\alpha/2}^{(xz)}\right)^2}{\left(Z_\beta + Z_{\alpha/2}^{(xy)}\right)^2 \left(Z_\beta + Z_{\alpha/2}^{(yz)}\right)^2}$$

where the quantile $Z_\beta$ represents the required power $1-\beta$ (usually, $Z_\beta = Z_{0.2} = -0.84$) and the $Z_{\alpha/2}$'s are the quantiles of the required significance levels $\alpha$ in two-sided tests, with $Z_{\alpha/2}$ necessitating correction for multiple testing, that is, $Z_{\alpha/2} = Z_{0.025} = -1.96$ in a single test and $Z_{\alpha/2} = Z_{5 \times 10^{-8}} = -5.33$ in a genome-wide test.29 Assuming that the analysis of a candidate gene is a single test and that the association between intermediate trait and disease also was detected in a single test, while the GWAS on the intermediate trait to detect the candidate gene required correction for multiple testing, we get $n_{xz} \approx n_{xy}n_{yz}/(Z_{0.2} + Z_{5 \times 10^{-8}})^2 = n_{xy}n_{yz}/38$. Thus, only if the association between the intermediate trait ($y$) and the disease ($z$) was strong enough to be detectable with sample size $n_{yz} = 38$ will the association analysis on a candidate gene derived from the GWAS on the intermediate trait have sufficient power with a sample size ($n_{xz}$) not larger than the sample size ($n_{xy}$) that was required in the GWAS. For the HFE-mutation C282Y (rs1800562), the variant that explains the largest single genetic fraction of the ferritin variance,15, 20 the KORA cohorts indicated an allele frequency of 4.9% (KORA F3)/4.6% (KORA F4) and an effect size parameter of 0.09 (KORA F3)/0.11 (KORA F4). With the variance of log10(ferritin) being $(0.41)^2$ in KORA F3 and $(0.42)^2$ in KORA F4, these numbers correspond to a necessary GWAS sample size of about $(-0.84-5.33)^2/(0.10^2 \times 2 \times 0.047 \times 0.953/0.42^2) \approx 7500$ (see derivation of equation (A2) in the Appendix with $\sigma_y^2 = 0.42^2 \neq 1$). Thus, taking into account that sample sizes in the range of 36 to 74 failed to confirm the association between RLS and serum ferritin,2, 3, 27 the discovery step in our study with 954 cases and 1814 controls (ie, considerably smaller than 7500) was not powerful enough to detect an influence of HFE on RLS if that influence fully depends on mediation by serum ferritin.
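
The two numerical results in this paragraph (the factor of about 38 and the GWAS sample size of about 7500) can be reproduced with a short Python sketch. The sample-size expression is the one implied by the arithmetic quoted in the text for a linear regression on an additive genotype; the function and parameter names are illustrative, and the inputs are the rounded values given above.

```python
from statistics import NormalDist


def z_quantile(prob: float) -> float:
    """Quantile of the standard normal distribution."""
    return NormalDist().inv_cdf(prob)


def z_factor(beta: float, alpha_half: float) -> float:
    """(Z_beta + Z_alpha/2)^2 for type II error beta and tail probability alpha/2."""
    return (z_quantile(beta) + z_quantile(alpha_half)) ** 2


def linear_trait_sample_size(effect: float, allele_freq: float, trait_sd: float,
                             beta: float = 0.2, alpha_half: float = 5e-8) -> float:
    """Approximate sample size to detect an additive effect on a quantitative trait."""
    genotype_variance = 2.0 * allele_freq * (1.0 - allele_freq)
    return z_factor(beta, alpha_half) * trait_sd ** 2 / (effect ** 2 * genotype_variance)


genome_wide_factor = z_factor(0.2, 5e-8)
n_gwas = linear_trait_sample_size(effect=0.10, allele_freq=0.047, trait_sd=0.42)
print(f"(Z_0.2 + Z_5e-8)^2 = {genome_wide_factor:.1f}")
print(f"necessary GWAS sample size = {n_gwas:.0f}")
```

The first value comes out close to 38 and the second close to 7500, in line with the figures quoted in the text.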

We also could not confirm the association between the RLS-associated SNP rs3923809 at the BTBD9 locus and serum ferritin, although our population sample from Southern Germany (KORA F3 and F4 with n=1638 and n=1809, respectively) was considerably larger than the Icelandic sample (n=965 individuals) used by the group that claimed this association.6 In fact, none of the other top SNPs at the known RLS loci was associated with any serum iron parameter. Again, this failure may be due to a ‘dilution’ phenomenon analogous to the one explained above. Recently, Catoire et al21 reported that in RLS patients the risk haplotype of the RLS gene MEIS1 is associated with increased thalamic ferritin expression, whereas the expression in another cerebral tissue (pons) or in lymphoblastoid cell lines did not depend on that haplotype. Data on liver expression and data on the general population were not provided but may be desirable in view of the fact that we could not detect an association between this haplotype and serum ferritin in the general population. Of course, our results do not exclude the possibility that MEIS1 may have a differential influence on ferritin expression in certain cerebral regions.

In summary, the analysis presented here adds a caveat to the expectation that genetic elucidation of intermediate traits will always simplify the genetic dissection of end-phenotypes. Under certain conditions, candidate approaches can be successful, of course. Figure 1b shows a constellation where the seeming intermediate trait is not truly intermediate but is modified by pleiotropic actions of genes that also influence the end-phenotype. If the correlation between the trait and the end-phenotype is largely due to a small number of such genes, a candidate approach can be quite powerful.