Introduction

In recent years there has been a gathering appreciation of the role of gene-environment (GE) interactions in the development of complex behavioural traits (Caspi and Moffitt 2006). Though our knowledge is still growing, the identification of such GE interactions holds real theoretical and practical promise. On a theoretical level, uncovering interactive mechanisms might explain the absence of hypothesized main effects of certain environments or might caution against ‘one size fits all’ intervention approaches. On a practical level, a deeper understanding of GE interplay could facilitate the development of effective environmental intervention strategies tailored to an individual’s genotype. Quantitative genetic research that incorporates measures or manipulations of the environment can identify environments that modify heritability, which is a type of genotype-environment interaction. However, molecular genetics provides a much sharper lens with which to identify gene-environment interaction because it can investigate environmental factors that moderate associations between specific genes and complex traits. This has led to a dramatic increase in research on GE interaction in the last few years; in 2009 alone there were more than 250 papers and more than 6000 citations (ISI Web of Knowledge, February 2010), although most of these papers were not in the behavioral sciences.

Nearly all of these GE interaction studies focused on candidate genes, epitomized by one of the most highly cited papers in the behavioral sciences in recent years which reported that an association between the serotonin transporter gene (5-HTT) and depression only emerged in high-stress environments (Caspi et al. 2003). This finding was replicated in a dozen studies and received support from experimental neuroscience, non-human primates and genetically engineered mice (Uher and McGuffin 2010; Uher and McGuffin 2008). Although two meta-analyses failed to find evidence for significant GE interaction (Munafo et al. 2009; Risch et al. 2009), studies with better measures yielded stronger evidence for GE interaction even though such studies have smaller sample sizes and thus contribute less to the results of meta-analyses (Caspi et al. 2010).

These debates about GE interaction have led to an increased appreciation of the daunting demands for power in GE interaction research (Burton et al. 2008), especially if the effect sizes of ‘main-effect’ associations are as small as they appear to be from genome-wide association studies (McCarthy et al. 2008). One way to increase power is to search for GE interactions with larger main-effect associations by aggregating DNA polymorphisms (e.g., single-nucleotide polymorphisms, SNPs) in what we call SNP sets (Harlaar et al. 2005), which also reduces the multiple-testing problem. In the present study, we use SNPs associated with mathematical ability in 10-year-old children in the first genome-wide association study of mathematical ability (Docherty et al. 2010). Ten SNPs were nominated from scans of pooled DNA and validated in an individually genotyped sample of 2356 children. A SNP set comprised of these 10 SNPs accounted for 2.9% of the variance of mathematical ability.

The practical reason for investigating GE interaction in relation to individual differences in mathematical ability is that our recent genome-wide association study was based on mathematical ability (Docherty et al. 2010). Here we use the sample genotyped on the 10-SNP set from this previous study to investigate interactions with environmental measures in the home and school. Although several quantitative and molecular genetic studies have reported gene-environment interaction for other cognitive abilities, no such studies have yet focused upon mathematical ability (Asbury et al. 2005; Caspi et al. 2007; Fischbein 1980; Harden et al. 2007; Harlaar et al. 2005; Price and Jaffee 2008). Although mathematics ability has been found to be heritable in our work (Kovas et al. 2005; Kovas et al. 2007b) and the work of others (Alarcón et al. 2000; Husén 1959; Light et al. 1998), it is possible that individual differences in genetic strengths and weaknesses towards learning and using mathematics are accentuated by experiences in home and in school. Conversely, the response of individuals to home and school environments is likely to vary as a function of genetic factors. Attempts to combat low mathematics performance may benefit from an understanding of such GE interplay, allowing genetically-determined needs to be met with more effective environmental assistance.

Methods

Participants

The sample was drawn from members of the Twins Early Development Study (TEDS) (Oliver and Plomin 2007; Trouton et al. 2002) who took part in a recent molecular genetic study of mathematical ability and disability (Docherty et al. 2010). Only one twin from a pair was used, and we excluded children with serious medical or perinatal problems, non-white ethnic origin (to mitigate population stratification), English spoken as a second language at home (to facilitate a fair comparison of test performance scores), and those without DNA samples available. Children were included in GE interaction analyses if both composite mathematics performance scores at age 10 (see ‘Measures’) and complete genotype data for all 10 SNPs previously associated with mathematical ability were available, providing a sampling frame of 1888 individuals.

SNP genotyping

As part of a recent genome-wide association study of mathematical ability in TEDS (Docherty et al. 2010), 46 SNPs were selected from a high versus low mathematical ability scan of pooled DNA in 10-year-old children for progression to the individual genotyping stage of the study. In this study we focused upon the 10 SNPs whose associations were replicated in the individually genotyped sample: rs11225308, rs363449, rs17278234, rs11154532, rs12199332, rs12613365, rs6588923, rs2300052, rs6947045 and rs1215603. These SNPs were genotyped using the Sequenom MassARRAY iPlex Gold® system (Sequenom, San Diego, USA) in 2397 TEDS subjects possessing mathematics data at age 10. 31 individuals with persistently low call-rates were removed entirely from further analyses, leaving a sample of 2366. 1888 of these individuals had no missing genotype data across all 10 SNPs, and formed the sampling frame for the present study. All 10 SNPs were in Hardy–Weinberg equilibrium at the p > 0.01 level.
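The Hardy–Weinberg check described above can be sketched as follows. The text does not say which test was used, so this assumes a standard 1-df chi-square goodness-of-fit test; the function name and the genotype counts are hypothetical.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Return (chi-square, p-value) for a 1-df Hardy-Weinberg test."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical genotype counts for one SNP (not data from the study).
chi2, p_value = hwe_chi_square(480, 1010, 510)
passes_hwe = p_value > 0.01  # threshold quoted in the text
```

A monomorphic SNP would give an expected count of zero and a division error, so a real pipeline would need to filter such markers first.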

SNP set

SNP-set scores have been used in several studies to aggregate the small effects of SNPs (Butcher et al. 2008; Harlaar et al. 2005; Haworth et al. 2007a; Meaburn et al. 2007; Pharoah et al. 2002; Purcell et al. 2009; Wray et al. 2008; Yang et al. 2010). This is especially useful in samples that are underpowered to detect the effects of the SNPs when analyzed separately. For the current analyses, a SNP-set score was created to combine the 10 SNP associations that replicated (p < 0.05) in the original study of mathematical ability (rs11225308, rs363449, rs17278234, rs11154532, rs12199332, rs12613365, rs6588923, rs2300052, rs6947045 and rs1215603). Genotypes at each SNP were additively coded 0, 1 and 2, with 0 representing the homozygote genotype associated with lower mathematical ability, 1 representing the heterozygote genotype, and 2 representing the homozygote genotype associated with higher mathematical ability. As none of the 10 SNPs are in linkage disequilibrium with one another, the additive genotypic scores are independent (uncorrelated). These genotypic scores were then summed to create a 10-SNP-set score with a theoretical range of 0–20. In the TEDS sample (N = 3919), the SNP-set score was normally distributed (m = 11.4, SD = 2.1). Only the 1888 individuals with mathematics data and complete genotyping data were included in analyses. The distribution, mean, and SD of mathematics scores were the same as for the whole sample.
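The additive scoring rule can be illustrated with a minimal sketch. The function name and the dictionary layout are assumptions made for illustration; the SNP identifiers and the 0/1/2 coding come from the text.

```python
SNP_IDS = [
    "rs11225308", "rs363449", "rs17278234", "rs11154532", "rs12199332",
    "rs12613365", "rs6588923", "rs2300052", "rs6947045", "rs1215603",
]

def snp_set_score(genotypes):
    """Sum additive codes (0, 1 or 2 'higher-ability' alleles) over the 10 SNPs.

    `genotypes` maps SNP id -> code. Individuals with any missing SNP are
    rejected, mirroring the complete-genotype requirement of the sampling frame.
    """
    if set(genotypes) != set(SNP_IDS):
        raise ValueError("complete genotype data for all 10 SNPs is required")
    for snp, code in genotypes.items():
        if code not in (0, 1, 2):
            raise ValueError(f"invalid additive code for {snp}: {code}")
    return sum(genotypes.values())

# Hypothetical individual who is heterozygous at every SNP.
example_score = snp_set_score({snp: 1 for snp in SNP_IDS})
```

Because each SNP contributes equally, the score is an unweighted count of alleles associated with higher mathematical ability.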

Measures

In this section, we describe the web-based tests and National Curriculum teacher ratings that were included in a composite measure of mathematics ability, which was used in our genome-wide association analysis (Docherty et al. 2010). We also describe measures of the home and school environments used in our GE interaction analyses.

Web-based testing

The merits of web-based approaches have been well documented and findings appear consistent with traditional methods of data collection (Haworth et al. 2007b). The battery used in this study included items from three components of mathematics: ‘Understanding Number’, ‘Computation and Knowledge’ and ‘Non-Numerical Processes’ (Kovas et al. 2007c). These components correspond to the UK National Curriculum and thus increased the relevance of the study to education. Items were based on the National Foundation for Educational Research 5–14 Mathematics Series, which is linked closely to curriculum requirements in the UK and the UK Numeracy Strategy (nferNelson 2001). The results across the three categories were combined to generate a composite score of ability across the diverse domain of mathematics because multivariate genetic analyses show that the components are highly correlated genetically (Kovas et al. 2007c), suggesting that the genetic effects influencing ability across diverse areas of mathematics are general (Plomin and Kovas 2005).

National Curriculum-based teacher ratings

Mathematical ability was also measured by teachers’ assessments towards the end of the school year on UK National Curriculum (NC) Key Stage 2 criteria for mathematical attainment (QCA 2001). The National Curriculum is a framework used by all government-funded schools across the UK to ensure that teaching and learning are balanced and consistent. NC-based ratings therefore provide a reliable and uniform measure of mathematical ability across our sample. Teacher assessments have been shown to be valid measures of academic achievement, particularly for mathematics, reading and language (Hoge and Coladarci 1989). The teachers assessed three aspects of mathematical ability: using and applying mathematics; numbers and algebra; and shapes, space and measures. We created a mathematics composite mean score by summing standardized scores for the three ratings because our multivariate genetic analyses show that the ratings are highly correlated genetically (Kovas et al. 2007a), indicating that the genetic effects are general (Plomin and Kovas 2005).

Composite score of mathematics at age 10

A composite mathematics score was generated based on UK National Curriculum teacher ratings and web-test results at 10 years of age following the same method used in our recent molecular genetic study of the trait (Docherty et al. 2010). Our multivariate genetic research indicates that these two types of measure are correlated phenotypically and genetically (0.53 and 0.62 respectively) (Kovas et al. 2007b). Each measure was standardized to a mean of zero and standard deviation of one. For the 1149 children with data on both measures, the mean of the two measures was then standardized to form the composite score. For an additional 402 children, only teacher ratings were available, and for 337 children only web-based measures were available. To increase power, these children were also included in the study, with their one available score standardized to a mean of zero and standard deviation of one. The distribution of the measure was normal and very similar for the whole TEDS sample and for those included in the current study. As one might expect, the pattern of missing data in our sample was not random. Compared to children with full data, the 402 children lacking web-test data had significantly lower teacher ratings of mathematics, and the 337 children lacking teacher ratings scored significantly lower on the mathematics web-test; however, these differences were small (<0.24 standard deviation difference in mean scores). One likely hypothesis is a participation bias in that children with poor mathematics skills were less likely to take part in mathematics web-tests, and teachers of children with poor mathematics skills were less likely to submit mathematics ratings to TEDS.
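One way to read the composite procedure is sketched below. It assumes that each measure is z-scored over all children who have it, that the two-measure mean is restandardized within the group with both measures, and that single-measure children keep their one z-score; the text does not spell out the reference groups, and the function names and toy values are hypothetical.

```python
import statistics

def standardize(values):
    """z-score the non-missing entries; missing entries (None) stay None."""
    present = [v for v in values if v is not None]
    mean = statistics.mean(present)
    sd = statistics.stdev(present)
    return [None if v is None else (v - mean) / sd for v in values]

def composite_scores(teacher, web):
    z_teacher = standardize(teacher)
    z_web = standardize(web)
    # Mean of the two z-scores where both exist, otherwise None for now.
    paired = [
        (t + w) / 2 if t is not None and w is not None else None
        for t, w in zip(z_teacher, z_web)
    ]
    z_paired = standardize(paired)
    result = []
    for t, w, p in zip(z_teacher, z_web, z_paired):
        if p is not None:
            result.append(p)      # both measures available
        elif t is not None:
            result.append(t)      # teacher rating only
        else:
            result.append(w)      # web test only
    return result

# Toy data: three children with both measures, one with each single measure.
teacher = [3.0, 4.0, 5.0, 2.0, None]
web = [10.0, 14.0, 13.0, None, 8.0]
scores = composite_scores(teacher, web)
```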

Measures of the environment

Because there are so few environmental measures related specifically to mathematics, the 10 measures in the present study included only one measure specific to the mathematical environment. The other environmental measures were included because they were thought to be candidates for GE interaction of mathematics regardless of their correlation with the SNP set or with the mathematics score. These included measures about the home such as household disorganization (‘chaos’), parental negativity, and harsh parental discipline, as well as measures about school such as chaos, teacher negativity, and peer attitudes. Each scale derived from these measures was scored in the direction of the scale’s label so that high scores referred to high chaos, negativity or discipline.

We were especially interested in environmental conditions contemporaneous with the age-10 mathematics assessment. Where age-10 environmental data were unavailable, we attempted to gauge environmental conditions by including measures taken at ages 9 and 12. The phenotypic correlations between the same environmental measures assessed at ages 9 and 12 (shown in Table 1) range from 0.46 to 0.63, suggesting moderate stability of these environments over time. Data for our environmental measures of interest were widely available across the TEDS sample, with parent-ratings available for 3385 families at age 9 and 5854 at age 12, child-ratings available for 3421 families at age 9 and teacher-ratings available for 3254 families at age 10. Table 1 contains the number of individuals within the current sampling frame of 1888 TEDS subjects with data available for each environmental measure, and clearly shows that data were not available on all environmental measures for all individuals in our sample. As we did not believe data were missing at random, we did not wish to impute missing environmental data. For each measure, we found that those individuals with missing environmental data possessed slightly––but significantly––lower mathematics composite scores than those with no missing data. As we wished to include individuals with low mathematics scores in our investigation, we used all individuals possessing the relevant environmental data in our GE interaction analyses––regardless of whether they lacked data on other environmental measures.

Table 1 Correlations between the 10-SNP set previously associated with mathematical ability at age 10, the mathematics composite score at age 10, and measures of home and school environment at ages 9, 10 and 12, within the sampling frame of N = 1888

Mathematical environment

Data on mathematical environments were collected via the internet from children when they were 10 years old using the Grade 4 Student Background Questionnaire (NEAP 2005; nces.ed.gov), shortened and adapted for English children. The following questions assessing mathematics-relevant activities in the school were selected for the present study: (1) In school, how often do you solve maths problems with a partner in small groups?; (2) In school, how often do you work with objects like rulers, counting blocks, or stopwatches?; (3) In school, how often do you talk with other students during class about how you solved a maths problem? For each item, the child was asked to click on the best response from a set of options presented on screen (Never or hardly ever- Once or twice a month- Once or twice a week- Almost every day; for the last item: None- 15 min- 30 min- 45 min or more). These three items were combined into a composite called Negative Mathematics Environment, with higher scores indicating low levels of school mathematics-relevant activities.

Home Chaos

Data on household disorganization were collected using a short version of the Confusion, Hubbub and Order Scale (CHAOS; Matheny et al. 1995). When the children were 9 and 12 years of age, parents provided the ratings of CHAOS via postal booklets by rating the following 5 statements (Not true-Somewhat true-Certainly true): ‘The twins have a regular bedtime routine’, ‘You can’t hear yourself think in our home’, ‘We are usually able to stay on top of things’, ‘There is usually a television turned on somewhere in our home’, ‘The atmosphere in our house is calm’. An additional item, ‘It’s a real zoo in our home’, was included in the composite at age 9. Items (reversed where necessary) were summed to produce a total household disorganization rating, with higher scores indicating greater disorganization. This CHAOS measure has previously been shown in TEDS to be reliable and a significant predictor or moderator of cognitive and behavioral outcomes (e.g., Oliver et al. 2008). These scores were called Home Chaos at 9 and Home Chaos at 12. A Home Chaos composite score was also created by taking the mean of the standardized age 9 and 12 scores.
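The 'reversed where necessary, then summed' scoring used for this and the following questionnaire scales can be sketched as below. The 0/1/2 response coding and the choice of which items are reverse-keyed are assumptions inferred from the item wording, not details given in the text.

```python
# Assumed numeric coding of the three response options.
RESPONSE_CODES = {"Not true": 0, "Somewhat true": 1, "Certainly true": 2}
MAX_CODE = 2

# (shortened item label, reverse-keyed?) in questionnaire order.
# Items describing an orderly home are reversed so that high = more chaos.
ITEMS = [
    ("regular bedtime routine", True),
    ("can't hear yourself think", False),
    ("able to stay on top of things", True),
    ("television usually turned on", False),
    ("atmosphere is calm", True),
]

def home_chaos_score(responses):
    """Sum the five item codes after reversing the orderly-home items."""
    if len(responses) != len(ITEMS):
        raise ValueError("one response per item is required")
    total = 0
    for (_, reverse), answer in zip(ITEMS, responses):
        code = RESPONSE_CODES[answer]
        total += (MAX_CODE - code) if reverse else code
    return total

# Hypothetical respondent describing a calm, organized household.
calm_home = home_chaos_score(
    ["Certainly true", "Not true", "Certainly true", "Not true", "Certainly true"]
)
```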

Parental feelings

Data on parental feelings towards the child were collected from parents using postal questionnaires when the children were 9 and 12 years of age. A shortened version of the parental feelings questionnaire (PFQ; Deater-Deckard 2000) included the following seven statements: ‘I feel impatient with my child’, ‘I feel happy about my relationship with my child’, ‘I am amused by my child’, ‘I sometimes wish my child would leave me alone for a few minutes’, ‘My child makes me angry’, ‘I feel close to my child’, ‘I feel frustrated by my child’. The responses (Rarely or never-Sometimes-Often) were summed (reversed where necessary) to create composite scores, with higher scores indicating greater negativity towards the child. These scores were called Parent Negativity at 9 and Parent Negativity at 12. In post-hoc analyses, a third scale was created in exactly the same way using data obtained when the children were 7 years old to represent Parent Negativity at age 7. Parent Negativity composite scores were created across ages 7, 9 and 12, as well as only across ages 9 and 12, by calculating the mean of the standardized scores at each relevant age.

Parental discipline

Data on parental discipline (methods used by parents when their child misbehaves) were collected from parents using postal questionnaires when the children were 9 and 12 years of age. The following 4 statements, derived from Deater-Deckard et al. (1998), were rated (Rarely or never-Sometimes-Often), reversed where necessary, and then summed to create a composite score, with higher scores indicating harsher discipline: ‘When my child misbehaves I use the following methods: (1) give a smack; (2) tell him/her off or shout at him/her; (3) explain or reason with him/her; (4) am firm or calm with him/her’. These scores were called Harsh Parental Discipline at 9 and Harsh Parental Discipline at 12. In post-hoc analyses, a third score was created in exactly the same way using 7-year data to represent Harsh Parental Discipline at age 7. Harsh Parental Discipline composite scores were created across ages 7, 9 and 12, as well as only across ages 9 and 12, by calculating the mean of the standardized scores at each relevant age.

Classroom Chaos

Data on classroom disorganization were collected using a short version of the CHAOS (Matheny et al. 1995) adapted for describing the classroom. When the children were 10 years of age, teachers provided ratings of classroom CHAOS via postal questionnaires with yes–no responses to the following seven statements: ‘My classroom is a place where: (1) There is very little noise; (2) We almost always seem to be rushed; (3) Adults can talk to each other and to children without being interrupted; (4) We get a lot of distracting events in our classroom; (5) Our classroom is a relaxing place for children to be; (6) The atmosphere in our classroom is calm; (7) There are too many children in our classroom given the amount of space we have’. Items (reversed where necessary) were summed to produce a total Classroom Chaos score, with higher scores indicating greater classroom disorganization.

Classroom peer context

In addition to classroom disorganization, when the children were 10, their teachers also reported on classroom peer context using a modified version of the School Life Questionnaire (Ainley and Bourke 1992), adapted for TEDS (see Oliver et al. 2008 for details). The teachers provided responses (Certainly true-Somewhat true-Not true) to the following four statements: ‘This pupil’s classroom is a place where: (1) His/her friends care about the work they do; (2) His/her friends try their best to get high marks; (3) His/her friends enjoy learning; (4) His/her friends often get into trouble’. Items (reversed where necessary) were summed to produce a total Peer Negativity score, with higher scores indicating more negative peer context.

Children’s perceptions of their teacher

When the children were 9 years old we asked them to provide ratings of their classroom environment using a modified version of the School Life Questionnaire (Ainley and Bourke 1992), adapted for TEDS (see Oliver et al. 2008 for details), collected by post. The questionnaire assessed eight domains: General Satisfaction, Negative Affect, Acceptance, Peer Context, Social Integration, Opportunity, Pupil-Teacher Relationship, and Adventure. For the present analyses we selected the Teacher scale, which has previously been shown to be reliable in TEDS (Oliver et al. 2008). The scale consisted of the following four items (reversed where necessary), with higher scores indicating less support from the teacher: ‘My classroom is the place where (1) My teacher listens to what I say; (2) Teacher takes an interest in helping me with my work; (3) Teacher helps me do my best; (4) Teacher listens to what I say’. The format of response was: Certainly, Somewhat, Not True. The average score of these four items was used as an index of Teacher Negativity.

Socio-economic status (SES)

As SES may have been driving GE interactions with the 10-SNP set, significant GE interactions were re-analysed with SES regressed from the environmental measure. At first contact with TEDS families, information was obtained regarding mothers’ occupational status, fathers’ occupational status, mothers’ highest educational qualification, fathers’ highest educational qualification and mothers’ age at the birth of their first child. These five variables were standardized, and the mean was taken to provide a composite score of SES.
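Regressing SES out of an environmental measure amounts to keeping the residuals from a simple linear regression of that measure on the SES composite. A minimal sketch, with a hypothetical function name and toy values:

```python
import statistics

def residualize(measure, covariate):
    """Residuals of `measure` after a simple linear regression on `covariate`."""
    mean_x = statistics.mean(covariate)
    mean_y = statistics.mean(measure)
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, measure))
    slope = sxy / sxx
    return [y - (mean_y + slope * (x - mean_x)) for x, y in zip(covariate, measure)]

# Toy values only: an SES composite and a Home Chaos score for four families.
ses = [-1.0, 0.0, 1.0, 2.0]
home_chaos = [3.0, 2.5, 1.0, 0.5]
chaos_without_ses = residualize(home_chaos, ses)
```

By construction the residuals are uncorrelated with SES, so an interaction that survives this step cannot be attributed to the linear effect of SES on the environmental measure.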

Analyses

The mathematics score was corrected for sex and age at time of testing and restandardized using regression procedures. Before GE interaction analyses were conducted, Pearson product-moment correlations were calculated among the SNP set, the mathematics score, and the 10 environmental measures.

GE interactions were assessed using multiple regression analyses in R (R Development Core Team 2008) to examine whether the relationship between the 10-SNP set and mathematics at age 10 was moderated by the effects of each of the environmental measures. The SNP set, the environmental measures, and the mathematics measures were analyzed as continuous variables. Specifically, the mathematics measure was predicted by the SNP set, the environmental measure, and a GE interaction term created as the product of the SNP set and the environmental measure. In a multiple regression in R, the regression coefficients test for the independent contribution of each predictor on the outcome––i.e. the effect of dropping each predictor from the model is tested individually. Because the GE interaction regression coefficient was independent of the SNP set and the environmental measure, this test of GE interaction was independent of GE correlation. Although measures were considered continuously in these analyses, in order to represent our findings graphically, in Fig. 1 we compared results for the lowest and highest thirds of the distribution of each environmental measure.
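The moderated regression described above has the form maths = b0 + b1·SNP + b2·E + b3·(SNP × E), with b3 carrying the GE interaction. The analyses were run in R; the following is only an illustrative Python sketch that estimates the four coefficients by ordinary least squares on noise-free toy data. Function names and data are hypothetical, and standard errors and t-tests are omitted.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_ge_model(snp, env, maths):
    """OLS estimates [intercept, SNP set, environment, SNP set x environment]."""
    rows = [[1.0, g, e, g * e] for g, e in zip(snp, env)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, maths)) for i in range(k)]
    return solve(xtx, xty)

# Noise-free toy data generated from known coefficients.
snp = [float(g) for g in range(8, 15) for _ in range(3)]
env = [e for _ in range(8, 15) for e in (-1.0, 0.0, 1.0)]
maths = [0.1 + 0.08 * g - 0.1 * e + 0.02 * g * e for g, e in zip(snp, env)]
coefficients = fit_ge_model(snp, env, maths)
```

With real data the interaction coefficient would be tested against its standard error, as in the t and p values reported in the Results.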

Fig. 1 The relationship between the mathematics 10-SNP-set score and mathematical ability at age 10 for high- and low-scoring groups on various environmental measures

Power

Power was calculated using the Post-hoc Statistical Power Calculator for Multiple Regression (http://www.danielsoper.com/statcalc/calc09.aspx). The sample sizes available for interaction analyses varied depending on the measures involved. At the p < 0.05 level, the smallest sample of 1474 subjects, used in assessing the GE interaction between the 10-SNP set and Negative Mathematics Environment, had 80% power to detect a GE interaction that accounts for 0.52% of the variance of the mathematics composite score. That is, for an association between the mathematics composite score and the three predictors in the multiple regression analysis (the 10-SNP set, the environmental measure and the interaction term), our analysis had 80% power to detect an effect that accounts for 0.73% of the variance of the mathematics score; for any one of the three individual predictors, an effect size of 0.52% could be detected with 80% power. The largest sample of 1639 individuals, used in assessing the interaction between the 10-SNP set and Parent Negativity at age 9, had 80% power to detect a total association between the mathematics composite score and the three predictors of 0.66% effect size, and with one individual predictor of 0.47% effect size.
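These figures can be checked approximately by hand. The sketch below is not the online calculator used in the study; it assumes Cohen's large-sample relation λ ≈ f²·N, with the noncentrality λ required for 80% power at α = 0.05 taken as about 7.85 for one tested predictor and about 10.90 for three, and converts f² to a proportion of variance with R² = f²/(1 + f²).

```python
# Approximate noncentrality needed for 80% power at alpha = 0.05
# (large-sample chi-square approximation), keyed by number of predictors tested.
LAMBDA_80 = {1: 7.849, 3: 10.903}

def min_detectable_r2(n, n_predictors):
    """Smallest proportion of variance detectable with roughly 80% power."""
    lam = LAMBDA_80[n_predictors]
    f2 = lam / n          # Cohen's f^2 from lambda ~= f^2 * N
    return f2 / (1.0 + f2)

smallest_sample_total = min_detectable_r2(1474, 3)   # about 0.0073
smallest_sample_single = min_detectable_r2(1474, 1)  # about 0.0053
largest_sample_total = min_detectable_r2(1639, 3)    # about 0.0066
largest_sample_single = min_detectable_r2(1639, 1)   # about 0.0048
```

The approximation agrees with the reported values to within about 0.01 percentage points; the small discrepancies for the single-predictor case presumably reflect the exact noncentral F calculation used by the calculator.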

Results

Correlations between SNP set, mathematics score and environmental measures

Table 1 displays the number of participants with data available for each environmental measure within the sampling frame of 1888 TEDS subjects. Due to missing-data differences, these numbers varied across the 10 environmental measures. The overlap in samples was not complete. For example, though 1637 and 1529 subjects within the sampling frame possessed Home Chaos data at ages 9 and 12 respectively, these samples had only 1406 subjects in common; 231 and 123 subjects were unique to the age 9 and age 12 Home Chaos analyses respectively. However, these unique groups differed very little in their mathematics scores from one another, and when only those 915 individuals with no missing environmental data were assessed, the correlations were much the same as those provided in Table 1. Furthermore, when analysed alone the association between the 10-SNP-set and mathematical ability remained similar regardless of sample composition. Means, standard deviations, and correlations among all of the measures are also shown in Table 1. Correlations were generated using the data available in the sampling frame of 1888 TEDS individuals; however, the results were similar across the entire TEDS sample. The mean of the SNP-set score was based on the theoretical range from 0 to 20 for the 10 SNPs, each scored additively as 0, 1 or 2 (see Methods). Significant correlations are highlighted in bold print. The correlation between the SNP-set score and the composite Mathematics score (0.167) was similar to the correlation (0.171) reported earlier for a slightly different sample (Docherty et al. 2010). The non-significant, near-zero correlations between the SNP-set score and the environmental measures indicated a general absence of GE correlation between these genetic and environmental factors in our sample. The correlations between the Mathematics score and the environmental measures indicated the general relevance of these environmental measures to mathematics. However, the low magnitude of these correlations suggested that the environmental measures could not be viewed strictly in terms of ‘good’ versus ‘bad’ or ‘low risk’ versus ‘high risk’ environments in relation to mathematics ability.

GE interaction

Figure 1 visualizes the results of our GE interaction investigation. Though in our analyses the environmental measures were considered continuously, in order to convey the interactions graphically we split each sample into low and high groups using the top and bottom thirds of the distribution for each environmental measure. In Fig. 1 the regression (association) between the mathematics SNP-set score and mathematics was plotted separately for the low and high groups. As expected from Table 1, all of the graphs showed an overall regression between the SNP set and the Mathematics score. GE interaction can be seen as different slopes for the low-E versus high-E groups, although the significance of the GE interaction was tested in an analysis of continuous measures of the environment (Table 2). The results presented in Table 2 and Fig. 1 indicated that the association between the SNP set and the mathematics score was greater (i.e., the slope was greater) when teachers were supportive (low Teacher Negativity), when children’s homes were disorganized (high Home Chaos at 12), and when parents were negative (high Parent Negativity at 12).
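The graphical split can be sketched as follows: rank individuals on the environmental measure, take the bottom and top thirds, and fit a separate regression of mathematics on the SNP-set score within each group. How ties at the cut points were handled is not stated, so the simple rank-based split here is an assumption, and the function names and toy data are hypothetical.

```python
def slope(x, y):
    """Least-squares slope of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx

def tertile_slopes(snp, env, maths):
    """Return (slope in lowest-E third, slope in highest-E third)."""
    order = sorted(range(len(env)), key=lambda i: env[i])
    third = len(order) // 3
    low, high = order[:third], order[-third:]
    low_slope = slope([snp[i] for i in low], [maths[i] for i in low])
    high_slope = slope([snp[i] for i in high], [maths[i] for i in high])
    return low_slope, high_slope

# Noise-free toy data with a built-in interaction: the SNP-set slope is
# 0.08 + 0.02 * E, so it is steeper in the high-E group.
snp = [float(g) for _ in range(3) for g in range(8, 15)]
env = [-1.0] * 7 + [0.0] * 7 + [1.0] * 7
maths = [0.08 * g + 0.02 * g * e - 0.1 * e for g, e in zip(snp, env)]
low_slope, high_slope = tertile_slopes(snp, env, maths)
```

Diverging slopes in the two groups are the visual signature of a GE interaction, but, as the text notes, significance is assessed with the continuous interaction term rather than with this split.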

Table 2 GE interaction results: multiple regression analyses predicting 10-year mathematical ability from the 10-SNP set, measured environments and their interaction

The present investigation focused upon proximal environmental measures; however, previous quantitative GE interaction studies of cognition indicate that the more distal measure of SES can moderate heritability (Fischbein 1980; Scarr-Salapatek 1971; Harden et al. 2007). Though in the TEDS sample this has been shown not to be the case for verbal abilities at age 4 (Asbury et al. 2005; Price and Jaffee 2008), Home Chaos, Parent Negativity and Teacher Negativity are all correlated with SES, so it was possible that their significant interactions simply reflected the effect of SES. When SES was controlled for, all three interactions remained significant, and effect sizes were not greatly affected. Furthermore, the 10-SNP-set’s interactions with Home Chaos and Parent Negativity remained significant after Bonferroni correction for the 10 environmental measures tested; however, the interaction with Teacher Negativity did not. As mentioned above, due to missing data differences across the 10 environmental measures, the overlap in the samples used in the GE interaction analyses was not complete. When only those 915 individuals possessing full data across all 10 environmental measures were analysed, the interactions between the 10-SNP-set and Teacher Negativity, Home Chaos and Parent Negativity were weaker, but remained nominally significant (p < 0.05).
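For reference, a Bonferroni correction for 10 tests at a nominal level of 0.05 sets the per-test threshold at 0.05 / 10 = 0.005. A minimal sketch, in which the function names and the example p-values are hypothetical rather than values from Table 2:

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

def survives_correction(p_value, alpha=0.05, n_tests=10):
    """True if a p-value remains significant after Bonferroni correction."""
    return p_value < bonferroni_threshold(alpha, n_tests)

threshold = bonferroni_threshold(0.05, 10)

# Hypothetical p-values for illustration only.
strong_result = survives_correction(0.001)   # below the corrected threshold
nominal_result = survives_correction(0.03)   # nominally significant only
```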

Measures of Home Chaos, Parent Negativity and Harsh Parental Discipline were included in this study at ages 9 and 12 as proxies for these environments at age 10. Composite measures were created for each of these measures across ages 9 and 12 and used in GE interaction analyses, the results of which are included in Table 2. The Home Chaos and Parent Negativity composites significantly moderated the 10-SNP set’s association with age-10 mathematical ability. As might be expected from the analyses at ages 9 and 12 individually, the Harsh Parental Discipline composite did not interact significantly with the 10-SNP set. Parent ratings of Parent Negativity––but not of Home Chaos––were also available for our sample at age 7. In post-hoc analyses, we found that Parent Negativity at age 7 showed a significant interaction with the 10-SNP set in influencing mathematical ability at age 10 (age 7 Parent Negativity*10-SNP-set: β = 0.024; S.E = 0.008; t = 3.247; p-value = 0.001; N = 1810). A Parent Negativity composite score spanning ages 7, 9 and 12 also interacted significantly with the 10-SNP-set (ages 7, 9, 12 Parent Negativity composite*10-SNP-set: β = 0.042; S.E = 0.011; t = 3.783; p-value = 0.0002; N = 1872). Our composite scores of Home Chaos and Parent Negativity over time showed greater interaction effects with the 10-SNP set than the single-age measures of these environments did.

We also repeated the GE interaction analyses for all 10 environmental measures using a SNP set constructed from all 43 of the SNPs individually genotyped in our original study of mathematics (Docherty et al. 2010). Although this 43-SNP set showed a slightly stronger association with mathematical ability than did the 10-SNP set, it interacted significantly only with the measures of Teacher Negativity (Beta = −0.028, S.E = 0.013, t = −2.239, p-value = 0.025) and 12-year Parent Negativity (Beta = 0.006, S.E = 0.003, t = 2.082, p-value = 0.038). These results suggested that including the extra 33 SNPs did not add anything to our GE interaction analyses.

Discussion

In this first GE interaction analysis of mathematics ability, we found significant GE interactions, after Bonferroni correction, between a mathematics 10-SNP set and two measures of the home environment: Home Chaos and Parent Negativity. The association between the SNP set and mathematics was greater when children’s homes were disorganized and when parents were negative. Moreover, for these two environmental measures there was an opportunity to compare results across ages. For both measures, GE interaction was in the same direction at ages 9 and 12 (Fig. 1), although the GE interactions only reached significance at age 12 (Table 2). In addition, composite measures of Home Chaos and Parent Negativity across ages 9 and 12 showed greater interaction effects. Power is likely to have played a role in this result, as the creation of a composite measure, in which even individuals possessing data at only one age were included, increased sample size. Another possibility involves the fact that subjects with missing data on environmental measures had slightly, but significantly, lower mathematics scores than those missing no environmental data. By including those individuals with missing data at some time points, our composite measures thus permitted GE interaction analyses across a greater range of mathematics performance. It may also be the case that the composite scores captured the cumulative effects of Home Chaos and Parent Negativity over time. As continued exposure to environmental pathogens is likely to be important in shaping a phenotype, it has been suggested that GE interaction studies may profit from the use of repeated measures rather than one-time snap-shots (Moffitt et al. 2005). Our results support the use of repeated-measure GE interaction study designs.
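The way a composite can retain individuals assessed at only one age can be sketched as follows. The example assumes that each age-specific score has already been standardized and that the composite is the mean of whichever scores are available. Both points are assumptions made for illustration, as is the function name; the exact construction of our composites is not restated here.

```python
def composite_score(scores_by_age):
    """Mean of the available age-specific scores; None marks a missing assessment."""
    available = [s for s in scores_by_age if s is not None]
    if not available:
        # No assessment at any age: the individual cannot be given a composite.
        return None
    return sum(available) / len(available)
```

Under this scheme an individual with data at age 9 only, or at age 12 only, still contributes to the composite analysis, which is how the composite sample can exceed the sample available at either single age.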

The significant GE interactions we have reported allow us to conclude that the association between the SNP set and mathematics ability in our sample differed as a function of the environment. It is more difficult to interpret the form of the interaction (Manuck 2009). Because the environmental measures are only weakly correlated with mathematics performance, low and high scores could not strictly be construed as good and bad environments, at least specifically in relation to mathematics ability. Nonetheless, the significant GE interactions could be viewed as supporting the diathesis-stress model, in which individuals at genetic risk (diathesis) have worse-than-expected outcomes when subjected to environmental risk (stress) (Asbury et al. 2005). For Home Chaos at age 12, the GE interaction suggests that the genetic effect of a low SNP-set score on mathematics performance was greatest in high-Chaos environments. Similarly, for Parent Negativity at age 12––as well as at age 7 (graph not shown)––the genetic effect of a low SNP-set score on mathematics performance was greatest when parents were negative. These significant GE interactions were in the poor-get-poorer direction of the diathesis-stress model. Diathesis-stress GE interactions suggest a ‘dark side’ to development: Bad environments make bad genotypes worse. However, a more positive way to frame the diathesis-stress model is to say that good environments are good for everyone whereas bad environments are especially bad for children with certain genotypes.

One limitation of our use of SNPs that show ‘main effect’ associations with an outcome is that main effects and interactions are theoretically independent. One would expect ordinal associations (in which associations are stronger in some environments than in others), such as those fitting the diathesis-stress model, to dilute main-effect associations, making them harder to find. Furthermore, disordinal (or cross-over) associations are likely to completely mask any main-effect associations. In the most extreme example, gene-phenotype associations could be in opposite directions in good and bad environments, which would conceal a main-effect association across environments. A new hypothesis about GE interaction suggests that such interactions may be common (Belsky et al. 2009; Belsky and Pluess 2009). This theory posits that individuals are differentially susceptible to environmental influences, both good and bad. Though our study was more likely to identify ordinal GE interactions, like those of the SNP-set with Home Chaos and Parent Negativity at 12, because the main-effect association of the SNP set was so modest it is possible that true disordinal interactions could be found. Although it did not survive Bonferroni correction, Teacher Negativity appeared to show a disordinal (cross-over) interaction that could be interpreted as support for the ‘plasticity’ hypothesis that genes affect sensitivity to both ‘good’ and ‘bad’ environments (Belsky et al. 2009; Belsky and Pluess 2009). Because Teacher Negativity was not significantly correlated with mathematics ability, it was especially difficult to interpret high or low scores as good or bad.
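The distinction between ordinal and disordinal interactions can be illustrated with a regression of the form outcome = b0 + b_G·G + b_E·E + b_GE·G·E. In that model the effect of G at a given environment E is b_G + b_GE·E, which changes sign at E = −b_G/b_GE. The sketch below labels an interaction as disordinal when that crossover falls inside the observed range of E, which is one common convention. The function names are hypothetical and this is not a procedure applied in the present study.

```python
def genetic_effect_at(b_g, b_ge, environment):
    """Effect of G on the outcome (simple slope) at a given value of E."""
    return b_g + b_ge * environment


def classify_interaction(b_g, b_ge, e_min, e_max):
    """Label an interaction by whether the G effect changes sign within (e_min, e_max)."""
    if b_ge == 0:
        return "none"
    crossover = -b_g / b_ge  # value of E at which the effect of G is zero
    if e_min < crossover < e_max:
        return "disordinal"
    return "ordinal"
```

A large main effect relative to the interaction coefficient pushes the crossover outside the observed range of E (ordinal), whereas a small main effect places it within the range (disordinal), which is why a cross-over interaction can coexist with little or no main-effect association.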

The effect sizes of the two GE interactions surviving Bonferroni correction were very small: 10-SNP-set interactions with Home Chaos at age 12 and Parent Negativity at age 12 each explained only 0.49% of the variance in mathematics performance. The joint effect of the two interactions was larger––but still very small––explaining only 0.71% of the variance in mathematics in our sample. This figure suggests that these interactions did not act completely independently and additively, which is not surprising because the environmental variables were correlated. In statistical interaction analyses based on the analysis-of-variance model, variance attributed to interaction is independent of variance attributed to main effects. Thus, in our analyses, variance attributed to GE interaction was independent of the main effects of G (SNP set) and E (environmental measures). Although parsimony favors the statistical model of no interaction in that main effects are more parsimonious than interactions, it could be argued that some of the variance attributed to main effects could be attributed logically to GE interaction (Rutter 2007). The fact that our GE interaction term was independent of the main effects of G and E also indicates that our findings of GE interaction were not caused by GE correlation. Moreover, the results in Table 1 indicate that our mathematics SNP set was uncorrelated with our measures of the environment. Nonetheless, GE correlation is likely to form an important part of GE interplay in the etiology of complex traits such as mathematical ability, and it is possible that in much larger samples significant correlations between the 10-SNP set and the environment may be detected.
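The reasoning behind the joint-effect comparison can be written out as simple arithmetic. It assumes that, were the two interactions fully independent and additive, their individual contributions would simply sum; the difference from the observed joint figure is then a rough indication of the variance they share.

```latex
\begin{align*}
\text{expected if independent and additive} &= 0.49\% + 0.49\% = 0.98\% \\
\text{observed joint effect} &= 0.71\% \\
\text{approximate shared variance} &= 0.98\% - 0.71\% = 0.27\%
\end{align*}
```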

The overlap between the samples used in our GE interaction analyses was incomplete, and these sample differences could explain why the significant GE interactions we reported at age 12 did not reach significance at age 9. For both Parent Negativity and Home Chaos, the individuals unique to ages 9 and 12 did not differ significantly from one another in their mathematics scores. Furthermore, when analysed alone in a linear regression, the 10-SNP set’s association with mathematical ability was not affected by these differences in sample composition. However, it is worth noting that the effect of the SNP set on mathematics does vary in the results presented in Table 2. For example, the association of the SNP set with mathematics was no longer significant when Home Chaos at 12 and a GE interaction term were controlled for in the model. It could follow that much of the observed main effect of the 10-SNP set was tied up in this GE interaction with Home Chaos at 12. However, caution is advised in attempting to interpret these differences, for two reasons: the 10-SNP-set score would be correlated with the GE interaction terms, and the 10-SNP set’s effect was also non-significant in the multiple regression analysis involving Harsh Parental Discipline at 9, in which neither the environment nor the interaction term was significantly associated with mathematics.

For the two significant GE interactions based on the 10-SNP set, we explored GE interactions with each of the 10 SNPs individually. The significant GE interactions were not due to just one SNP––most of the SNPs in the SNP set showed effects in similar directions but we found no systematic patterns of results. However, this exploratory analysis was greatly underpowered: As compared to the SNP set, individual SNPs’ associations with mathematics ability had about one-tenth the effect size but increased multiple testing tenfold. Moreover, the reason for analyzing the individual SNPs would be to look for differential patterns of GE interaction across the SNPs, but the power needed to detect significant differences in GE interaction between SNPs was about four times greater than the power needed to detect significant GE interaction for one SNP without considering the added multiple-testing of comparing 10 SNPs two at a time.
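The multiple-testing burden described above can be made explicit with a short calculation. The sketch assumes a conventional family-wise α of 0.05, which is an illustrative choice, and the function names are hypothetical.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def pairwise_comparisons(n_items):
    """Number of distinct pairs among n_items, i.e. 'two at a time'."""
    return math.comb(n_items, 2)
```

Testing 10 SNPs individually at α = 0.05 implies a per-test threshold of 0.005, and comparing the 10 SNPs two at a time involves 45 pairwise comparisons, each of which would require its own correction.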

Nonetheless, analysis of individual SNPs is relevant to the issue of what it means to find a GE interaction with a SNP set when each of the constituent SNPs is likely to have very different mechanisms. Our view is that the SNP set for mathematics was meant as an index of the heritability of mathematics ability even though our SNP set only indexed a small proportion of the total heritability. In quantitative genetics, GE interaction involves finding that heritability differs as a function of the environment. In our analysis, we examined the extent to which the SNP set’s association with mathematics ability differs as a function of the environment. This hypothesis-free GE interaction approach limited our ability to interpret our results. As we do not know the mechanisms by which any of the 10 SNPs in the SNP set affects mathematics ability, we have no idea how the SNP set for mathematics ability might interact with the environment. Given the general rules of pleiotropy (each gene affects many traits) and polygenicity (each trait is affected by many genes), it is safe to predict that the answer will be complicated (Kovas and Plomin 2006).

Although the strengths of our study included its use of a SNP set and its composite measure of mathematical ability, the study was limited in terms of its sample, its measures of mathematics-relevant environments and its power. Concerning the sample, although the study’s representative sample could be considered as a strength, it might also be a weakness for identifying GE interaction if, as some have suggested, GE interaction is most likely to be found at the extremes of the environment (Caspi et al. 2010). Nonetheless, a counterargument is that in addition to attempting to demonstrate the existence of GE interaction at the extremes of environment, it is also useful to know the extent of GE interaction in the population.

Concerning the measures of the environment, we were limited to measures obtained in TEDS, which was itself limited by the fact that there are few measures of mathematics-relevant environments. Nonetheless, the 10 measures included in our study seemed a reasonable starting point in the search for GE interaction in mathematics. Furthermore, we believe our use of proximal rather than distal measures of the environment to be a strength of the study. Though previous quantitative genetic GE interaction studies of cognition have focused on environmental measures such as parental education (Friend et al. 2008; Kremen et al. 2005), parental employment (Guo and Stearns 2002) and socioeconomic status (SES) (Fischbein 1980; Scarr-Salapatek 1971; Harden et al. 2007), it has been suggested that proximal environments afford more power in GE interaction analyses (Moffitt et al. 2005). Moreover, proximal environments such as Home Chaos may be easier to adjust than distal environments such as SES. The results of GE investigations involving proximal measures should therefore be more readily transferable into practical interventions. Nevertheless, though they were not as all-encompassing as SES, the measures used here were still limited in being fairly general. It has been suggested that, at least in the case of stress, specific stressors yield more replicable GE interaction results than do more general ratings of stressful life events (Caspi et al. 2010). If this suggestion is correct for mathematics-relevant environments, then it introduces a conservative bias into our study, in that it would have made it more difficult for us to show GE interaction given that our environmental measures were general ratings of the home, parents and teachers.

Another possible weakness of the environmental measures is that they were not all obtained at the same age as the mathematics measures. As we analysed a phenotypic outcome at age 10, our use of environmental measures at age 12 is a disadvantage. It is certainly difficult to conclude that significant GE interactions of the 10-SNP set with Home Chaos and Parent Negativity at age 12 have a causal effect on mathematics at age 10. However, as Home Chaos and Parent Negativity are both moderately stable over time (shown in Table 1), we considered both age 9 and age 12 data to be proxies for the environment at age 10. Though age 9 GE interactions of Home Chaos and Parent Negativity did not reach significance, composite measures across ages 9 and 12 did. Furthermore, we have reported a significant interaction between the 10-SNP set and Parent Negativity at age 7. The effect of this interaction can more easily be interpreted as causal in its influence on mathematics performance at age 10. For Home Chaos, on the other hand, one cannot rule out the possible influence of mathematics performance at age 10 on the significant GE interaction reported at age 12.

The major limitation of the present study is power and the need for replication. The original genome-wide association study (Docherty et al. 2010) that identified the 10 SNPs used in the present study was underpowered to detect association effect sizes of the magnitude found for the individual SNPs (i.e., less than 0.6% of the variance), and this set of SNPs has yet to be tested for replication in an independent sample. Even greater power is needed to detect GE interaction, and the small effects reported here were at the limit of our study’s power to detect them. These results must therefore be viewed as preliminary until they are replicated in independent samples. However, there is some weak evidence for replication in the present study in that the environmental measures that were assessed across ages suggested consistent results in terms of direction of effect, as noted above.

A possible limitation of GE interaction studies that is not often appreciated is that the power to detect GE interaction depends on the distribution of the genotypes, environments, and outcomes (Caspi et al. 2010). Our study is less limited by these issues because our measures of environments and outcomes were continuous and representative of the population. SNP sets are also normally distributed, unlike most individual SNPs, and so for this reason, our use of a SNP set was a strength (Plomin et al. 2009). However, the aggregation of 10 SNPs limited our ability to interpret our findings on a biological level. Furthermore, our SNP set was created under the assumption that the 10 SNPs interacted additively, which means epistatic or multiplicative interactions between SNPs would not have been well represented. We adopted this approach because our sample was underpowered to detect epistasis, and because quantitative genetic studies suggest additive genetic influence over mathematics.
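The additive assumption underlying a SNP set can be illustrated with a simple count. The sketch below sums, across SNPs, the number of copies (0, 1 or 2) of one designated allele per SNP. Which allele is counted, and how missing genotypes are handled, are not specified here, so the coding shown is an assumption for illustration and the function name is hypothetical.

```python
def snp_set_score(allele_counts):
    """Additive SNP-set score: the sum of per-SNP allele counts (each 0, 1 or 2)."""
    for count in allele_counts:
        if count not in (0, 1, 2):
            raise ValueError("each allele count must be 0, 1 or 2")
    # No product terms between SNPs are included, so epistasis is not represented.
    return sum(allele_counts)
```

For a 10-SNP set, scores coded in this way range from 0 to 20. Because each SNP contributes independently to the sum, the score tends towards a bell-shaped distribution even though each individual SNP takes only three values.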

Our theory-free genome-wide association approach is complementary to the alternative theory-guided candidate-gene approach that investigates a nomological network of convergent evidence (Caspi et al. 2010). Because there are no candidate-gene studies relevant to mathematical ability we could not focus on candidates but we hope that our nominated GE interactions, after replication, become candidate GE interaction targets in studies that extend the construct validity using a more theory-guided approach.

Research into GE interplay in the etiology of mathematics is only just beginning. Many more GE interactions involving many more genetic and environmental factors are likely to influence this complex trait. Indeed, though we did not find significant GE interaction between the 10-SNP-set and environments such as Classroom Chaos and Harsh Parental Discipline in our sample, we would predict that significant GE interactions with measures such as these will emerge in the future. As neither genetic nor environmental factors act in isolation, GE interplay studies are likely to have important practical implications. Understanding why some individuals suffer or benefit under certain environmental conditions, while others do not, will assist in the development of tailored environmental interventions aimed at improving mathematics and other skills.