Introduction

Autism spectrum disorder (ASD) is an umbrella diagnosis, capturing several previously separate pervasive developmental disorders with various levels of symptom severity, including Autistic Disorder, Asperger’s Syndrome, Childhood Disintegrative Disorder, and Pervasive Developmental Disorder—Not Otherwise Specified (PDD-NOS)1. According to the Diagnostic and Statistical Manual of Mental Disorders (DSM) version 5, diagnosis of ASD requires the presence of at least three symptoms of impaired social communication and at least two symptoms of repetitive behaviors or restricted interests1. ASD has a median prevalence of 1 out of 161 individuals in a study of worldwide data2, with a higher diagnosis rate in some developed countries such as the United States3.

Characterizing the neurobiology of ASD may eventually lead to improved diagnosis and clinical subgrouping, and the development of individually targeted treatment programs4. Although much of the neurobiology of ASD remains unknown, subtle alterations of brain structure appear to be involved (reviewed in ref. 5,6). These include differences in total brain volume (children with ASD have shown a larger average volume7,8,9,10), as well as alterations of the medial and inferior frontal, anterior cingulate, superior temporal, and orbitofrontal cortices, and the caudate nucleus5,6,11. However, the results of structural magnetic resonance imaging (MRI) studies of ASD have often been inconsistent, potentially owing to (1) small study sample sizes in relation to subtle effects, (2) differences across studies in terms of clinical characteristics, age, comorbidity, and medication use, (3) methodological differences between studies, such as differences in hardware, software, and distinct data processing pipelines12, and (4) the etiological and neurobiological heterogeneity of ASD, which exists as a group of different syndromes rather than a single entity13.

In the ENIGMA (Enhancing Neuro-Imaging Genetics through Meta-Analysis) consortium (http://enigma.ini.usc.edu), researchers from around the world collaborate to analyze many separate data sets jointly, and to reduce some of the technical heterogeneity by using harmonized protocols for MRI data processing. A recent study by the ENIGMA consortium’s ASD working group showed small average differences in bilateral cortical and subcortical brain measures between 1571 cases and 1650 healthy controls, in the largest study of brain structure in ASD yet performed14. Relative to controls, ASD patients had significantly lower volumes of several subcortical structures, as well as greater thickness in various cortical regions—mostly in the frontal lobes—and lower thickness of temporal regions. No associations of diagnosis with regional cortical surface areas were found14.

Left–right asymmetry is an important aspect of human brain organization, which may be altered in various psychiatric and neurocognitive conditions, including schizophrenia, dyslexia, and ASD15,16,17. On a functional level, people with ASD demonstrate reduced leftward language lateralization more frequently than controls18,19,20. Resting-state functional MRI data of people with ASD have also shown a generally rightward shift of asymmetry involving various functional networks of brain regions21. In addition, people with ASD have a higher rate of left-handedness than the general population20,22,23. Furthermore, an electroencephalography study reported that infants at high risk for ASD showed more rightward than leftward frontal alpha asymmetry at rest24.

Brain structural imaging studies have also reported altered hemispheric asymmetry in ASD. Diffusion imaging studies indicated reduced asymmetry of a variety of different white matter tract metrics25,26,27, although in one study males with ASD lacked an age-dependent decrease in rightward asymmetry of network global efficiency, compared with controls28. A structural MRI study investigating gray matter reported lower leftward volume asymmetry of language-related cortical regions in ASD (i.e., planum temporale, Heschl’s gyrus, posterior supramarginal gyrus, and parietal operculum), as well as greater rightward asymmetry of the inferior parietal lobule29. The volume and surface area of the fusiform gyrus also showed lower rightward asymmetry in ASD30. However, other studies did not find alterations of gray matter asymmetries in ASD27,31.

Prior studies of structural brain asymmetry in ASD had sample sizes less than 128 cases and 127 controls. The previous ENIGMA consortium study of ASD14 did not perform analyses of brain asymmetry, but reported bilateral effects only as strong as Cohen’s d = −0.21 (for entorhinal thickness bilaterally)14. Comparable bilateral effect sizes were also found in ENIGMA consortium studies of other disorders14,32,33,34,35,36,37,38. If effects on brain asymmetry are similarly subtle, then prior studies of this aspect of brain structure in ASD were likely underpowered. Low power not only reduces the chance of detecting true effects, but also the likelihood that a statistically significant result reflects a true effect39,40. Therefore large-scale analysis is needed to determine whether, and how, structural brain asymmetry might be altered in ASD, to better describe the neurobiology of the condition.

Here, we made use of MRI data from 54 data sets that were collected across the world by members of the ENIGMA consortium’s ASD Working Group, to perform the first highly powered study of structural brain asymmetry in ASD. Using a single, harmonized protocol for image analysis, we derived asymmetry indexes, AI = (Left−Right)/(Left + Right), for multiple brain regional and global hemispheric measures, in up to 1778 individuals with ASD and 1829 typically developing controls. The AI is a widely used index in brain asymmetry studies41,42.
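The index defined above can be illustrated with a minimal Python sketch. The function name and the example thickness values are hypothetical and chosen only for illustration; they are not taken from the study data.

```python
def asymmetry_index(left: float, right: float) -> float:
    """Asymmetry index as defined in the text: AI = (Left - Right) / (Left + Right).

    Positive values indicate leftward asymmetry, negative values rightward
    asymmetry, and zero indicates perfect symmetry.
    """
    total = left + right
    if total == 0:
        raise ValueError("left + right must be non-zero")
    return (left - right) / total


# Hypothetical regional cortical thickness values in mm (illustrative only).
ai_leftward = asymmetry_index(2.60, 2.50)   # left thicker than right: AI > 0
ai_rightward = asymmetry_index(2.50, 2.60)  # right thicker than left: AI < 0
```

Because the difference is divided by the sum, the AI is unitless, so regions and measures of very different absolute size are placed on a common scale.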

Age and sex are known to affect cortical-43 as well as subcortical asymmetries44 in healthy individuals. In addition, a recent structural imaging study of roughly 500 individuals with ASD, and 800 controls, found that case–control differences of bilateral cortical thickness were greater in younger versus older individuals, while also being related to ASD symptom severity, and with larger differences in individuals with lower versus higher full-scale intelligence quotient (IQ) scores45. Other previous case–control MRI findings with respect to these indicators of clinical heterogeneity in ASD are also reviewed in that paper45. In the present study, we therefore carried out secondary analyses in which we tested brain asymmetries in relation to age- or sex-specific effects, IQ, and disorder severity. We also included an exploratory analysis of medication use.

Results

Significant associations of ASD with brain asymmetry

Summary information for the data sets is in Table 1. Out of a total of 78 structural AIs that were investigated (Supplementary Tables 1–3), 10 showed a significant effect of diagnosis, which survived multiple testing correction (Table 2). Among these were seven regional cortical thickness AIs, including frontal regions (superior frontal, rostral middle frontal, medial orbitofrontal), temporal regions (fusiform, inferior temporal), and cingulate regions (rostral anterior, isthmus cingulate). Two cortical regional surface area AIs, namely of the medial- and lateral orbitofrontal cortex, were significantly associated with diagnosis (medial: β = 0.006, t = 3.2, P = 0.0015; lateral: β = −0.005, t = −3.3, P = 0.0010) (Table 2, Supplementary Table 2), as well as one subcortical volume AI, namely that of the putamen (β = 0.00395, t = 3.4, P = 0.00069) (Table 2, Supplementary Table 3).

Table 1 Characteristics of the different data sets of the ENIGMA ASD working group
Table 2 Linear mixed model results for regional AIs that survived multiple comparisons correction in the primary analysis

Nominally significant effects of diagnosis on AIs (i.e., with P < 0.05, but which did not survive multiple comparison correction), were observed for the fusiform surface area AI (β = −0.005, t = −2.56, P = 0.010) (Supplementary Table 2), pars orbitalis thickness AI (β = −0.003, t = −2.26, P = 0.024), posterior cingulate thickness AI (β = −0.003, t = −2.1, P = 0.034), superior temporal thickness (β = −0.002, t = −1.97, P = 0.049), and caudate nucleus volume (β = 0.003, t = 2.24, P = 0.025).
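The distinction between nominal significance and surviving multiple testing correction can be sketched in Python. This section does not state which correction procedure or grouping of tests was used, so the Benjamini-Hochberg step-up procedure below is an assumption, and the P values in the example are hypothetical rather than taken from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where a test survives
    Benjamini-Hochberg false discovery rate control at level alpha.

    Assumption: BH is used here for illustration only; the text does not
    specify the exact correction procedure.
    """
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest P value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank

    # Every test ranked at or below max_rank is declared significant.
    survives = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            survives[idx] = True
    return survives


# Hypothetical P values: the third and fourth are nominally significant
# (P < 0.05) but do not survive the correction.
example = benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.6])
```

In this hypothetical example only the first two tests survive, which mirrors the situation described above, where several AIs reached P < 0.05 without passing the corrected threshold.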

Sensitivity analyses

When we repeated the analysis after winsorizing outliers, the pattern of results remained the same (Supplementary Tables 4–6), except that a small change in P value for the effect of diagnosis on medial orbitofrontal surface area AI meant that it no longer survived false discovery rate (FDR) correction (Supplementary Table 5).

When we added a non-linear effect for age, all of the 10 AIs that had shown significant effects of diagnosis in the primary analysis remained significant (Supplementary Tables 4–6).

When we excluded all individuals below 6 years of age, whose images may have been more difficult for FreeSurfer to segment, all AIs that had shown significant effects of diagnosis in the primary analysis remained significant, except for the isthmus cingulate thickness AI (Supplementary Tables 4–6). In addition, one new association with diagnosis, of the fusiform surface area AI, now surpassed the multiple testing correction threshold. These subtle changes of P values do not necessarily indicate that exclusion of younger ages improved signal to noise in the data.

When excluding all individuals aged 40 years or older, the pattern of significant results stayed the same (Supplementary Tables 4–6).

Finally, when analyzing only the subset of 3 T-acquired data, two of the diagnosis effects from the primary analysis (i.e., inferior temporal- and isthmus cingulate thickness AI) were no longer significant after FDR correction, but three other effects now became significant (i.e., superior temporal thickness AI, fusiform surface area AI, and caudate nucleus AI) (Supplementary Tables 4–6). Again, slight changes in significance levels are expected when changing the sample, and do not necessarily indicate systematic differences of 3 T and 1.5 T data with respect to case–control asymmetry differences.

Magnitudes and directions of asymmetry changes

Cohen’s d effect sizes of the associations between AIs and diagnosis, as derived from the primary analysis, are visualized in Fig. 1. Effect sizes were low, ranging from −0.13 (superior frontal thickness AI) to 0.12 (putamen AI) (Table 2, Supplementary Tables 1–3). All of the cortical AIs with significant effects of diagnosis in the primary analysis showed decreased asymmetry in ASD compared with controls, i.e., the AIs were closer to zero in individuals with ASD than in controls, regardless of whether the region was on average leftward or rightward asymmetrical in controls (Table 3). However, the putamen showed increased asymmetry in ASD (mean AI controls = 0.011, mean AI cases = 0.012) (Table 3).
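To give a feel for how a t statistic of this size relates to a small Cohen’s d, the following Python sketch uses the standard two-group conversion d ≈ t · sqrt(1/n1 + 1/n2). This section does not describe how d was derived from the mixed models, so the conversion is an assumption that ignores covariates and random effects, and the function name is hypothetical.

```python
import math


def cohens_d_from_t(t: float, n_cases: int, n_controls: int) -> float:
    """Approximate Cohen's d from a two-group t statistic.

    Assumption: the simple conversion d = t * sqrt(1/n1 + 1/n2) is used;
    the text does not state the exact formula applied to the mixed-model
    output, so this is only an approximation.
    """
    return t * math.sqrt(1.0 / n_cases + 1.0 / n_controls)


# Putamen AI: t = 3.4 (from the text), with the maximum sample sizes
# reported for the study (1778 cases, 1829 controls).
d_putamen = cohens_d_from_t(3.4, 1778, 1829)  # roughly 0.11
```

The result is close to, but not identical with, the reported value of 0.12. A gap of this kind is expected, because the number of individuals available differs by region and the authors’ exact formula may differ from this approximation.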

Fig. 1

Cohen’s d effect sizes of the associations between diagnosis and AIs. a regional cortical thickness measures, b cortical surface areas, c subcortical volumes. Values are overlaid on left hemisphere inflated brains. Positive Cohen’s d values (yellow) indicate mean shifts towards greater leftward or reduced rightward asymmetry in cases, and negative Cohen’s d values (blue) indicate mean shifts towards greater rightward asymmetry or reduced leftward asymmetry in individuals with ASD

Table 3 Directions of asymmetry changes in cases versus controls

Five of the seven significant changes in regional cortical thickness asymmetry involved left-sided decreases accompanied by right-sided increases of thickness (Table 3). For the other two significant effects on regional thickness asymmetry (the fusiform and inferior temporal cortex), thickness was decreased bilaterally in ASD, but more so in the right than the left hemisphere. For the significant changes in surface area asymmetry (lateral orbitofrontal and medial orbitofrontal cortex), surface area was altered in opposite directions in ASD in the two hemispheres, thus resulting in altered asymmetry (Table 3). Finally, the putamen showed a bilateral decrease in volume in ASD that was more pronounced on the right, resulting in altered asymmetry (Table 3).

Age or sex interaction effects

The distributions of age and sex across all data sets are plotted in Supplementary Fig. 1. In secondary analysis of interaction effects, there was only one significant sex:diagnosis interaction effect after FDR correction, for the rostral anterior cingulate thickness AI (Supplementary Tables 7–9). This AI had shown a significant effect of diagnosis in the primary analysis. In analysis within the sexes separately, this AI was associated with diagnosis in males (P = 1.4 × 10^−5) but not females (P = 0.165) (Supplementary Table 7). For all of the AIs that showed significant effects of diagnosis in the primary analysis, adding sex:diagnosis interaction terms did not change the pattern of significant main effects of diagnosis, after FDR correction (Supplementary Tables 7–9).

There were no significant age:diagnosis interaction effects after FDR correction (Supplementary Tables 10–12). In general, for AIs that showed significant effects of diagnosis in the primary analysis, adding age:diagnosis interaction terms largely reduced the significance of the main effects of diagnosis, even though the age:diagnosis interaction terms were not significant (all P > 0.05) (Supplementary Tables 10–12). However, adding these interaction terms also increased the AIC and BIC scores compared with the primary analysis models without these terms, indicating poorer model fit when including these non-significant interaction terms (Supplementary Tables 10–12).

Exploratory analysis of IQ

The distributions of IQ within individuals with ASD and controls are shown in Supplementary Fig. 2. Out of the 10 AIs that showed significant case–control differences in the primary analysis, only one showed an association with IQ within individuals with ASD (uncorrected P < 0.05; Supplementary Table 13). This was the rostral anterior cingulate thickness AI (β = 0.00019, t = 2.49, P = 0.013). The positive direction of this effect indicates that primarily those ASD individuals with lower IQ show reduced leftward asymmetry of the rostral anterior cingulate thickness. This regional asymmetry had also shown a significant sex:diagnosis interaction (see above). For this specific regional AI, we therefore added age:IQ, sex:IQ and age:sex:IQ interactions to the model, but none of these terms were significant (all uncorrected P > 0.05).

Within controls, only the superior frontal thickness AI was associated with IQ at uncorrected P < 0.05 (Supplementary Table 13) (β = −0.00012, t = −3.41, P = 0.001). This effect suggests that controls with lower IQ show relatively increased asymmetry of superior frontal thickness, although this was a post hoc, exploratory analysis without multiple testing correction.

Analysis of autism diagnostic observation schedule (ADOS) severity scores

The distributions of ADOS severity scores are plotted in Supplementary Fig. 2. Out of the AIs that showed significant case–control differences in the primary analysis, only the isthmus cingulate thickness AI showed an association (uncorrected P < 0.05) with the ADOS severity score (β = 0.0041, t = 2.6, P = 0.011) (Supplementary Table 14). The positive direction of the effect suggests that primarily cases with low ASD severity have reduced leftward asymmetry of this regional thickness.

Medication use

We found no significant effects of medication use (all uncorrected P > 0.05) (Supplementary Table 15).

Discussion

In this, the largest study to date of brain asymmetry in ASD, we mapped differences in brain asymmetry between participants with ASD and controls, in a collection of 54 international data sets via the ENIGMA Consortium. We had 80% statistical power to detect Cohen’s d effect sizes in the range of 0.12–0.13. We found significant alterations of seven regional cortical thickness asymmetries in ASD compared with controls, predominantly involving medial frontal, orbitofrontal, inferior temporal, and cingulate regions. The magnitude of all of these regional thickness asymmetries was decreased in ASD compared with controls, whether this involved reduced leftward, reduced rightward, or reversed average asymmetry. Rightward asymmetry of the medial orbitofrontal surface area was also decreased in individuals with ASD, as was leftward asymmetry of the lateral orbitofrontal surface area. In addition, individuals with ASD showed an increase in leftward asymmetry of putamen volume, compared with controls.
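The relationship between sample size and the smallest detectable effect can be sketched with a normal approximation for a two-sided, two-group comparison. This section does not state how the power figure was computed or which significance level was used, so the formula, the default alpha, and the function name below are all assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_d(n_cases: int, n_controls: int,
                     alpha: float = 0.05, power: float = 0.80) -> float:
    """Smallest Cohen's d detectable with the given power.

    Assumption: normal approximation to a two-sided, two-sample test,
    d = (z_{1 - alpha/2} + z_{power}) * sqrt(1/n1 + 1/n2). This ignores
    covariates and random effects and is not the authors' stated method.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    z_power = z.inv_cdf(power)
    return (z_alpha + z_power) * sqrt(1.0 / n_cases + 1.0 / n_controls)


# Maximum sample sizes reported for the study, uncorrected alpha of 0.05.
d_uncorrected = min_detectable_d(1778, 1829)

# A stricter alpha (hypothetical value, standing in for a
# multiple-testing-adjusted threshold) raises the detectable effect size.
d_strict = min_detectable_d(1778, 1829, alpha=0.001)
```

Under these assumptions, the detectable d at an uncorrected alpha of 0.05 is a little over 0.09, and it grows as alpha is tightened to account for the number of AIs tested.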

Previous MRI studies of cerebral cortical asymmetries in ASD, based on much smaller data sets, and using diverse methods for image analysis, suggested variable case–control differences29,30, or no differences27,31. Our findings partly support a previously reported, generalized reduction of leftward asymmetry29, as six of the nine significantly altered cortical regional asymmetries (thickness or surface area) involved decreased leftward asymmetries. However, three of the nine significantly altered cortical regional asymmetries involved shifts leftwards in ASD, either driven by a more prominent increase on the left side in ASD (i.e., medial orbitofrontal surface area), or by more prominent right- than left-side decreases in ASD (i.e., fusiform- and inferior temporal thickness). Thus, the directional change of asymmetry can depend on the specific region, albeit that the overall magnitude of asymmetry is most likely to be reduced in ASD.

The significant associations of diagnosis with asymmetry in the present study were all weak (Cohen’s d ranging from −0.13 to 0.12), indicating that altered structural brain asymmetry is unlikely to be a useful predictor for ASD. Prior studies using smaller samples were underpowered in this context. However, the effect sizes were comparable to those reported by recent, large-scale studies of bilateral disorder-related changes in brain structure, in which asymmetry was not studied, including for ASD14 as well as attention-deficit hyperactivity disorder (ADHD)38,46, schizophrenia37, obsessive compulsive disorder (OCD)32,33, posttraumatic stress disorder34, and major depressive disorder35,36. It has become increasingly clear that anatomical differences between ASD and control groups are very small relative to the large within-group variability that is observed47.

Our findings may inform understanding of the neurobiology of ASD. Multi-regional reduction of cortical thickness asymmetry in ASD fits with the concept that laterality is an important organizing feature of the healthy human brain for multiple aspects of complex cognition, and is susceptible to disruption in disorders (e.g., 16,48). Left–right asymmetry facilitates the development of localized and specialized modules in the brain, which can then have dominant control of behavior49,50. Notably, some of the cortical regions highlighted here are involved in diverse social cognitive processes, including perceptual processing (fusiform gyri), cognitive and emotional control (anterior cingulate), and reward evaluation (orbitofrontal cortex, ventral striatum)51. However, the roles of these brain structures are by no means restricted to social behavior. As we found altered asymmetry of various additional regions, our findings suggest broader disruption of lateralized neurodevelopment as part of the ASD phenotype. We note that many of the regions that showed significant case–control differences in asymmetry, including medial frontal, anterior cingulate, and inferior temporal regions, overlap with the default mode network (DMN). The DMN comprises various cortical regions located in temporal (medial and lateral), parietal (medial and lateral) and prefrontal (medial) cortices52. DMN network organization has shown evidence for differences in ASD11,53,54,55, including alterations in functional laterality56. Our findings may therefore further support a role of altered lateralization of the DMN in ASD, warranting further investigations in this direction.

The medial orbitofrontal cortex was the only region that showed significantly altered asymmetry of both thickness and surface area in ASD, suggesting that disrupted laterality of this region might be particularly important in ASD. The orbitofrontal cortex may be involved in repetitive and stereotyped behaviors in ASD, owing to its roles in executive functions57. Prior studies have reported lower cortical thickness in the left medial orbitofrontal gyrus in ASD58, altered patterning of gyri and sulci in the right orbitofrontal cortex59, and altered asymmetry in frontal regions globally25,31. These studies were in much smaller sample sizes than used here.

As regards the fusiform cortex, a previous study by Dougherty et al.30 reported an association between higher ASD symptom severity and increased rightward surface area asymmetry, but not thickness asymmetry. The fusiform gyrus is involved in facial perception and memory among other functions, which are important for social interactions60. Here we report an asymmetry change in fusiform thickness in ASD that was significant after multiple testing correction, but there was also a nominally significant rightward change of surface area asymmetry in ASD (i.e., that did not survive multiple testing correction). This underlines that separate analyses of regional cortical thickness and surface area are well motivated, as they can vary relatively independently61.

The altered volume asymmetry of the putamen in ASD may be related to its role in repetitive and restricted behaviors in ASD. One study reported that differences in striatal growth trajectories were correlated with circumscribed interests and insistence on sameness62. The striatum is connected with lateral and orbitofrontal regions of the cortex via lateral–frontal–striatal reward and top-down cognitive control circuitry that might be dysfunctional in ASD63. For example, individuals with ASD have shown decreased activation of the ventral striatum and lateral inferior/orbitofrontal cortex during outcome anticipation, and of dorsal striatum and lateral–frontal regions during sustained attention and inhibitory control, compared with typically developing controls11,55,63.

Although the reasons for asymmetrical alterations in many of the structures implicated here are unclear, our findings suggest altered neurodevelopment affecting these structures in ASD. Further research is necessary to clarify the functional relevance and relationships between altered asymmetry and ASD. The findings we report in this large-scale study sometimes did not concur with prior, smaller studies. This may be owing to limited statistical power in the earlier studies, as low power reduces the likelihood that a statistically significant result reflects a true effect40. However, the cortical atlas that we used did not have perfect equivalents for regions defined in many of the earlier studies, and we did not consider gyral/sulcal patterns, or gray matter volumes as such. Furthermore, discrepancies with earlier studies may be related to age differences, and differences in clinical features of the disorder arising from case recruitment and diagnosis.

We included subjects from the entire ASD severity spectrum, with a broad range of ages, IQs, and of both sexes. Only one effect of diagnosis on regional asymmetry was influenced by sex, i.e., the rostral anterior cingulate thickness asymmetry, which was altered in males but not in females. This same regional asymmetry was primarily altered in lower versus higher IQ cases. This may therefore be an alteration of cortical asymmetry that is relatively specific to an ASD subgroup, i.e., lower-performing males. In controls, a different asymmetry (i.e., superior frontal thickness AI) showed a nominally significant association with IQ, which may point to different brain-IQ associations in ASD and controls. However, we cannot make strong interpretations based on these exploratory, secondary analyses without multiple testing correction.

As regards symptom severity, thickness asymmetry of the isthmus of the cingulate was associated with the ADOS score, such that the lower severity cases tended to have the most altered asymmetry. Again, this post hoc finding remains tentative in the context of multiple testing, and is reported here for descriptive purposes only. It is clear that most of the AIs that showed significant changes in ASD were not correlated with ADOS scores.

We found no evidence that medication use affected any of the asymmetries altered in ASD, although our medication variable was rudimentary. The role of specific medication usage should be investigated in future studies. As described in the Methods, comorbid conditions were recorded as present for only 54 of the ASD subjects, precluding a high-powered analysis of this issue. This is a limitation of the study. We did not analyze handedness in the present study, as this had no significant effect on the same brain asymmetry measures as analyzed here, in studies of healthy individuals comprising more than 15,000 participants43,44.

In contrast to some prior studies of ASD, we did not adjust for IQ as a covariate effect in our main, case–control analysis. Rather, we carried out post hoc analysis of possible associations between IQ and brain asymmetries, separately in cases and controls. This was because lower average IQ was clearly part of the ASD phenotype in our total combined data set (Supplementary Fig. 1D), so that including IQ as a confounding factor in case–control analysis might have reduced the power to detect an association of diagnosis with asymmetry. This would occur if underlying susceptibility factors contribute both to altered asymmetry and reduced IQ, as part of the ASD phenotype.

The Desikan–Killiany atlas64 was derived from manual segmentations of sets of reference brain images. Accordingly, the mean regional asymmetries in our samples partly reflect left–right differences present in the reference data set used to construct the atlas. For detecting cerebral asymmetries with automated methods, some groups have chosen to work from artificially created, left–right symmetrical atlases, e.g., ref. 65. However, our study was focused on comparing relative asymmetry between groups. The use of a ‘real-world’ asymmetrical atlas had the advantage that regional identification was likely to be more accurate for structures that are asymmetrical both in the atlas and, on average, in our data sets. By defining the regions of interest in each hemisphere based on each hemisphere’s own particular features, such as its sulcal and gyral geometry, we could then obtain the corresponding relationships between hemispheres. To this end, we used data from the automated labeling program within FreeSurfer for subdividing the human cerebral cortex. The labeling system incorporates hemisphere-specific information on sulcal and gyral geometry with spatial information regarding the locations of brain structures, and shows a high accuracy when compared with manual labeling results64. Thus, reliable measures of each region can be extracted for each subject, and regional asymmetries then accurately assessed.

Although a single image analysis pipeline was applied to all data sets, heterogeneity of imaging protocols was a feature of this study. There were substantial differences between data sets in the average asymmetry measured for some regions, which may be owing in part to different scanner characteristics, as well as differences in patient profiles. We corrected for ‘data set’ as a random effect in the analysis, and sensitivity analysis based on the subset of 3 T acquired data showed similar results to the primary analysis. However, it is possible that between-data set variability resulted in reduced statistical power, relative to a hypothetical, equally sized, single-center study. In reality, no single center has been able to collect such large samples alone. As long as researchers publish many separate papers based on single data sets, collected in particular ways, the field overall has the same problem. In this case, multi-center studies can better represent the real-world heterogeneity, with more generalizable findings than single-center studies66. The primary purpose of our study, based on 54 data sets that were originally collected as separate studies, was to assess the total combined evidence for effects over all of these data sets, while allowing for heterogeneity between data sets through the use of random intercepts, and finally adding sensitivity and secondary analyses with respect to relevant variables.

The cross-sectional design limits our capacity to make causal inferences between diagnosis and asymmetry. ASD is highly heritable, with meta-analytic heritability estimates ranging from 64 to 91%67. Likewise, some of the brain asymmetry measures examined here have heritabilities as high as roughly 25%43,44. Future studies are required to investigate shared genetic contributions to ASD and variation in brain structural asymmetry. These could help to disentangle cause-effect relations between ASD and brain structural asymmetry. Given the high comorbidity of ASD with other disorders, such as ADHD, OCD, and schizophrenia68, cross-disorder analyses incorporating between-disorder genetic correlations may be informative.

In conclusion, large-scale analysis of brain asymmetry in ASD revealed primarily cortical thickness effects, but also effects on orbitofrontal cortex asymmetry, and putamen asymmetry, which were significant but very small. Our study illustrates how high-powered and systematic studies can yield much needed clarity in human clinical neuroscience, where prior smaller and more methodologically diverse studies were inconclusive.

Methods

Data sets

Structural MRI data were available for 57 different data sets (Table 1). Three data sets comprising either cases only, or controls only, were excluded from this study (Table 1), as our analysis model included random intercepts for ‘data set’ (below), and diagnosis was fully confounded with data set for these three. The remaining 54 data sets comprised 1778 people with ASD (N = 1504 males; median age = 13 years; range = 2–64 years) and 1829 typically developing controls (N = 1400 males; median age = 13 years; range = 2–64 years).

All data sets were collected during the period when DSM-IV and DSM-IV-TR were the common classification systems, between 1994 and 2013, and the clinical diagnosis of ASD was made according to DSM-IV criteria69. The data sets were collected in a variety of different countries, and intended originally as separate studies. Nonetheless, all subjects were diagnosed based on clinical diagnosis by a clinically experienced and board certified physician/psychiatrist/psychologist. This was a criterion for admission of a data set into the ENIGMA-ASD database. For each of the 54 data sets, all relevant ethical regulations were complied with, and appropriate informed consent was obtained for all individuals.

Total scores from the Autism Diagnostic Observation Schedule-Generic (ADOS), a standardized instrument commonly used in autism diagnosis70, were available for roughly half of the cases (N = 878). Cases from the entire ASD spectrum were included, but only 66 cases had IQ below 70 (cases: mean IQ = 104, SD = 19, min = 34, max = 149; see Supplementary Fig. 1D). The presence/absence of comorbid conditions had been recorded for 519 of the cases, but only 54 cases showed at least one comorbid condition (which could be ADHD, OCD, depression, anxiety, and/or Tourette’s syndrome14). Numbers related to DSM-IV subtypes of ASD were not collated by the ENIGMA ASD working group, as this subtyping scheme has been dropped from DSM-5 due to low reliability71.

The assessment and recruitment process for controls was not homogeneous across the 54 data sets, but the overwhelming majority of controls were typically developing/healthy at the time of MRI, and no controls showed features that might have met criteria for a diagnosis of ASD. Only 19 controls had IQ < 70. In these subjects the exclusion of ASD diagnosis was performed by a senior child psychiatrist/physician. Eighteen of these were from the FSM data set and were clinically diagnosed with idiopathic intellectual disability. Amongst all controls the mean IQ was 112 (SD = 15, min = 31, max = 149; see Supplementary Fig. 1D).

Structural MRI

Structural T1-weighted brain MRI scans were acquired at each study site. As shown in Table 1, images were acquired using different field strengths (1.5 T or 3 T) and scanner types. Each site used harmonized protocols from the ENIGMA consortium (http://enigma.ini.usc.edu/protocols/imaging-protocols) for data processing and quality control. The data used in the current study were thickness and surface area measures for each of 34 bilaterally paired cortical regions, as defined with the Desikan–Killiany atlas64, as well as the average cortical thickness and total surface area per entire hemisphere. In addition, left and right volumes of seven bilaterally paired subcortical structures, plus the lateral ventricles, were analyzed. Cortical parcellations and subcortical segmentations were performed with the freely available and validated software FreeSurfer (versions 5.1 or 5.3)72, using the default ‘recon-all’ pipeline, which also incorporates renormalization. Parcellations of cortical grey matter regions and segmentations of subcortical structures were visually inspected following the standardized ENIGMA quality control protocol (http://enigma.ini.usc.edu/protocols/imaging-protocols). Exclusions on the basis of this quality control resulted in the sample sizes mentioned above (see Data sets). In brief, cortical segmentations were overlaid on the T1 image of each subject. Web pages were generated with snapshots from internal slices, as well as external views of the segmentation from different angles. All sites were provided with a manual on how to judge these images, including the most common segmentation errors. For subcortical structures, the protocol again consisted of visually checking the individual images, plotted from a set of internal slices. Volume estimates derived from poorly segmented structures (i.e., where tissue labels were assigned incorrectly) were excluded from each site’s data sets and subsequent analyses.
In addition, any data points exceeding 1.5 times the interquartile range, as defined per site and diagnostic group, were visually inspected in 3D. When identified as errors, all values from the affected regions were excluded from further analysis.
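The flagging step can be illustrated with a short sketch. The text does not state how the 1.5 × interquartile range rule was applied, so this example assumes conventional Tukey fences (below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR), computed within each site and diagnostic group. The function names and record fields (`site`, `diagnosis`, `value`) are hypothetical, and flagged values are only candidates for visual inspection, not automatic exclusions.

```python
from collections import defaultdict
from statistics import quantiles


def flag_iqr_outliers(values, k=1.5):
    """Return indices of values outside the Tukey fences (assumed rule)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < lower or v > upper]


def flag_by_site_and_group(records, k=1.5):
    """Apply the fences separately per (site, diagnosis) cell.

    `records` is a list of dicts with hypothetical keys
    'site', 'diagnosis' and 'value'. Returns sorted record indices
    that would be sent for visual inspection.
    """
    cells = defaultdict(list)
    for idx, rec in enumerate(records):
        cells[(rec["site"], rec["diagnosis"])].append(idx)

    flagged = []
    for indices in cells.values():
        if len(indices) < 4:  # too few points for meaningful quartiles
            continue
        cell_values = [records[i]["value"] for i in indices]
        flagged.extend(indices[j] for j in flag_iqr_outliers(cell_values, k))
    return sorted(flagged)
```

Computing the fences within each cell means that a value is judged against participants scanned at the same site and in the same diagnostic group, rather than against the pooled sample.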

Asymmetry measures

Separately for each structural measure and individual subject, left (L) and right (R) data were used in R (version 3.5.3) to calculate an asymmetry index (AI) with the following formula: AI = (L − R)/(L + R). Distributions of each of the AIs are plotted in Supplementary Fig. 2. Note that AIs do not necessarily scale with L, R, or brain size, owing to their denominators.
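As a minimal illustration of the index, assuming it is the left-right difference divided by the left-right sum, the following sketch computes one AI from a pair of measurements. The function name is hypothetical, and the original calculation was carried out in R rather than Python.

```python
def asymmetry_index(left, right):
    """Asymmetry index: (L - R) / (L + R).

    Positive values indicate a larger left-hemisphere measure,
    negative values a larger right-hemisphere measure.
    """
    total = left + right
    if total == 0:
        raise ValueError("L + R must be non-zero")
    return (left - right) / total
```

Because the difference is divided by the sum, multiplying both L and R by the same factor leaves the AI unchanged, which is why the index does not necessarily scale with overall brain size.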

Linear mixed effects random-intercept model mega-analysis

Model: linear mixed effects models were fitted separately for each regional cortical surface area AI and thickness AI, as well as the total hemispheric surface area and mean thickness AI, and the subcortical volume AIs. This was accomplished by means of mega-analysis incorporating data from all 54 data sets, using the ‘nlme’ package in R73. All models included the same fixed and random effects, and had the following formulation:

$${\mathrm{AI}} = {\mathrm{diagnosis}} + {\mathrm{age}} + {\mathrm{sex}} + {\mathrm{random}}( = {\mathrm{dataset}})$$

where AI reflects the AI of a given brain structure, and diagnosis (‘controls’ (= reference), ‘ASD’), sex (‘males’ (= reference), ‘females’) and data set were coded as factor variables, with data set having 54 different categories. Age was coded as a numeric variable.

The Maximum Likelihood method was used to fit the models. Subjects were omitted if data were missing for any of the predictor variables (na.action = na.omit). The ggplot2 package in R was used to visualize residuals (Supplementary Figs. 3–5). Collinearity of predictor variables was assessed using the ‘usdm’ package in R (version 3.5.3).

Significance

Significance was assessed based on the P values for the effects of diagnosis on AIs. The false discovery rate (FDR)74 was estimated separately for the 35 cortical surface area AIs (i.e., 34 regional AIs and one hemispheric total AI) and the 35 cortical thickness AIs, and again for the seven subcortical structures plus lateral ventricles, each time with an FDR threshold of 0.05. Correlations between AI measures were calculated using Pearson’s R and visualized using the ‘corrplot’ package in R (Supplementary Figs. 6–8). Most pairwise correlations between AIs were low, with only 33/78 pairwise correlations either lower than −0.3 or higher than 0.3, with a minimum R = −0.351 between the inferior parietal surface area AI and supramarginal surface area AI, and maximum R = 0.487 between the cuneus surface area AI and pericalcarine surface area AI.
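The FDR control within one family of tests can be sketched as follows. The specific FDR method is given only by citation, so this example assumes the Benjamini-Hochberg step-up procedure, which is the most common choice. The function name is hypothetical. It would be applied separately to each family (35 surface area AIs, 35 thickness AIs, 8 subcortical AIs).

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure (assumed FDR method).

    Returns a list of booleans, in the input order, marking which
    hypotheses are rejected at FDR level q.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank whose p value is at or below its threshold.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            max_rank = rank

    # Reject every hypothesis ranked at or below that position.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected
```

Running the procedure per family, rather than over all 78 tests at once, means that the number of tests m in the threshold q × rank / m is 35 or 8 depending on the family.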

Cohen’s d effect sizes

The t-statistic for the factor ‘diagnosis’ in each linear mixed effects model was used to calculate Cohen’s d75, with

$$d = \frac{{t \ast ({\mathrm{n}}1 + {\mathrm{n}}2)}}{{\sqrt {({\mathrm{n}}1 \ast {\mathrm{n}}2)} \ast \sqrt {df} }}$$
(1)

where n1 and n2 are the number of cases and controls, and df the degrees of freedom.

The latter was derived from the lme summary table in R, but can also be calculated using df = obs − (x1 + x2), where obs equals the number of observations, x1 the number of groups and x2 the number of factors in the model.

The 95% confidence intervals for Cohen’s d were calculated using 95% CI = d ± 1.96 SE, with the standard error (SE) around Cohen’s d calculated according to:

$${\mathrm{SE}} = \sqrt {\frac{{{\mathrm{n}}1 + {\mathrm{n}}2}}{{{\mathrm{n}}1 \ast {\mathrm{n}}2}} + \frac{{d^2}}{{2 \ast ({\mathrm{n}}1 + {\mathrm{n}}2 - 2)}}}$$
(2)
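Equations (1) and (2), together with the degrees-of-freedom expression and the 95% confidence interval, translate directly into code. The sketch below is only an illustration of those formulas. The function and parameter names are hypothetical, and in the study the t-statistic and df were taken from the fitted model summary.

```python
import math


def residual_df(n_obs, n_groups, n_factors):
    """df = obs - (x1 + x2), with x1 groups and x2 factors in the model."""
    return n_obs - (n_groups + n_factors)


def cohens_d_from_t(t, n1, n2, df):
    """Equation (1): d = t * (n1 + n2) / (sqrt(n1 * n2) * sqrt(df))."""
    return t * (n1 + n2) / (math.sqrt(n1 * n2) * math.sqrt(df))


def cohens_d_se(d, n1, n2):
    """Equation (2): standard error around Cohen's d."""
    return math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2 - 2)))


def cohens_d_ci95(d, n1, n2):
    """95% confidence interval: d +/- 1.96 * SE."""
    se = cohens_d_se(d, n1, n2)
    return d - 1.96 * se, d + 1.96 * se
```

For example, `cohens_d_from_t(t, n1, n2, df)` followed by `cohens_d_ci95(d, n1, n2)` gives the effect size and its interval for one AI, where `n1` and `n2` are the numbers of cases and controls contributing data for that AI.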

For visualization of cerebral cortical results, Cohen’s d values were loaded into Matlab (version R2016a), and 3D images of left hemisphere inflated cortical and subcortical structures were obtained using FreeSurfer-derived ply files.

Power analyses

As each linear model included multiple predictor variables, the power to detect an effect of diagnosis on AI could not be computed exactly, but we obtained an indication of the effect size that would be needed to provide 80% power, had we been using simple t tests and Bonferroni correction for multiple testing, using the ‘pwr’ package in R. For this purpose, a significance level of 0.0014 (i.e., 0.05/35) was set in the context of multiple testing over the regional and total cortical surface area AIs (N = 35) or thickness AIs (N = 35), and 0.00625 (i.e., 0.05/8) for seven subcortical volume plus lateral ventricle AIs (N = 8). This showed that a difference of roughly Cohen’s d = 0.13 would be detectable with 80% power in the cortical analyses, and Cohen’s d = 0.12 in the subcortical analyses.
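The order of magnitude of these detectable effect sizes can be checked with a normal approximation to the two-sample t test. This is not the calculation performed by the ‘pwr’ package, which uses the noncentral t distribution. It is a simplified sketch that assumes the full primary-analysis sample sizes (1778 cases, 1829 controls) and a two-sided test, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def detectable_d(n1, n2, alpha, power=0.80):
    """Approximate smallest Cohen's d detectable with the given power.

    Normal approximation for a two-sided, two-sample comparison:
    d = (z_{1 - alpha/2} + z_{power}) * sqrt(1/n1 + 1/n2).
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    return (z_alpha + z_power) * sqrt(1 / n1 + 1 / n2)


# Bonferroni-adjusted significance levels described in the text.
cortical_d = detectable_d(1778, 1829, alpha=0.05 / 35)
subcortical_d = detectable_d(1778, 1829, alpha=0.05 / 8)
```

With these inputs the approximation gives values that round to 0.13 for the cortical families and 0.12 for the subcortical family, in line with the figures reported above.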

Sensitivity analyses

No outliers were removed for the primary analysis, but to confirm that results were not dependent on outliers, all analyses were repeated after having winsorized using a threshold of k = 3, for each AI measure separately. That is to say, the two highest and two lowest values were assigned the value of the third highest or lowest value, respectively, separately per AI. This threshold was chosen after visual inspection of frequency histograms.
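The winsorization rule described above can be written compactly. The following is a minimal sketch with a hypothetical function name. With k = 3, the two highest values are replaced by the third highest, and the two lowest by the third lowest, while the original ordering of subjects is preserved.

```python
def winsorize_k(values, k=3):
    """Clamp the k-1 most extreme values at each end to the k-th value."""
    if len(values) < 2 * k - 1:
        raise ValueError("not enough values to winsorize with this k")
    ordered = sorted(values)
    low, high = ordered[k - 1], ordered[-k]
    # Clamp each value to [low, high]; input order is preserved.
    return [min(max(v, low), high) for v in values]
```

Applied to one AI measure at a time, this keeps every subject in the analysis while limiting the influence of the most extreme values.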

The relationships between AIs and age showed no overt non-linearity (Supplementary Figs. 9–11), so no polynomials for age were incorporated in the models for primary analysis. However, analyses were repeated using an additional non-linear term for age, to check whether this choice had affected the results. As age and age² are highly correlated, we made use of the poly() function in R for these two predictors, which created a pair of uncorrelated variables to model age effects (so-called orthogonal polynomials)76, where one variable was linear and one non-linear.
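To show what orthogonal polynomials achieve, the sketch below builds a linear and a quadratic age term that are uncorrelated with each other and with the intercept, using Gram-Schmidt orthogonalization. This is an illustration of the idea, not a reimplementation of R's `poly()`. The resulting columns should span the same space as those from `poly(age, 2)`, but their signs may differ, and the function name is hypothetical.

```python
from math import sqrt


def orthogonal_poly2(x):
    """Return (linear, quadratic) terms that are centred, mutually
    orthogonal and scaled to unit length."""
    n = len(x)
    mean_x = sum(x) / n
    lin = [xi - mean_x for xi in x]  # centring removes the intercept component

    sq = [li ** 2 for li in lin]
    mean_sq = sum(sq) / n
    sq_centred = [s - mean_sq for s in sq]

    lin_ss = sum(li * li for li in lin)
    if lin_ss == 0:
        raise ValueError("x must contain at least three distinct values")

    # Remove the part of the quadratic term explained by the linear term.
    beta = sum(s * li for s, li in zip(sq_centred, lin)) / lin_ss
    quad = [s - beta * li for s, li in zip(sq_centred, lin)]

    quad_ss = sum(q * q for q in quad)
    if quad_ss == 0:
        raise ValueError("x must contain at least three distinct values")

    lin_norm, quad_norm = sqrt(lin_ss), sqrt(quad_ss)
    return [li / lin_norm for li in lin], [q / quad_norm for q in quad]
```

Because the two returned columns have zero inner product, their estimated coefficients in a regression are not distorted by the strong correlation between raw age and age squared.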

As our data included participants as young as 1.5 years of age, and segmentation of very young brains might be especially challenging for the FreeSurfer algorithms, we also repeated our primary analysis excluding all individuals aged below 6 years (N = 64 controls, N = 113 cases), to assess whether they might have impacted the findings substantially (although these had passed the same quality control procedures as all other data sets, and FreeSurfer segmentation in preschoolers is generally of good quality, even before visual QC77).

As adults aged over 40 years were relatively sparsely represented, we also repeated the primary analysis after removing any individuals aged ≥ 40, in case modeling of age as a continuous predictor might have been unduly affected by these individuals. (In addition, see below for further analysis of age, for the purposes of subset and interaction analyses).

Finally, we repeated the primary analysis using only the subset of 3 T acquired data (45 out of 54 data sets), to test for possible sensitivity to this technical variable. The sample was reduced from 1778 cases and 1829 controls in the primary analysis to 1467 cases and 1574 controls in the 3T-only analysis.

Directions of asymmetry changes

For any AIs showing significant effects of diagnosis in the primary analysis, linear mixed effects modeling was also performed on the corresponding L and R measures separately, to understand the unilateral changes involved. The models included the same terms as were used in the main analysis of AIs (i.e., diagnosis, age and sex as fixed factors, and data set as random factor). Again, the Cohen’s d effect sizes for diagnosis were calculated based on the t-statistics. The raw mean AI values were calculated separately in controls and cases, to describe the reference direction of healthy asymmetry in controls, and whether cases showed reduced, increased, or reversed asymmetry relative to controls.
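The comparison of raw mean AIs between groups can be summarized with a simple rule, sketched below under stated assumptions. The AI is taken to be positive for leftward asymmetry (L > R), and the labels are defined purely from the sign and magnitude of the two group means. The function names and the exact decision rule are illustrative and are not taken from the original analysis.

```python
def asymmetry_direction(mean_ai):
    """Direction of the average asymmetry, assuming AI > 0 means L > R."""
    if mean_ai > 0:
        return "leftward"
    if mean_ai < 0:
        return "rightward"
    return "symmetric"


def describe_asymmetry_change(mean_ai_controls, mean_ai_cases):
    """Label cases relative to the control (reference) asymmetry."""
    if mean_ai_controls == 0:
        raise ValueError("controls show no mean asymmetry to compare against")
    if mean_ai_controls * mean_ai_cases < 0:
        return "reversed"  # opposite sign to controls
    if abs(mean_ai_cases) < abs(mean_ai_controls):
        return "reduced"  # same direction, smaller magnitude
    if abs(mean_ai_cases) > abs(mean_ai_controls):
        return "increased"  # same direction, larger magnitude
    return "unchanged"
```

Such labels describe only the raw group means. Whether a difference is statistically reliable is determined by the mixed-model analysis of the AI, and the separate L and R models indicate which hemisphere drives it.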

Age- or sex-specific effects

For all AIs, we carried out secondary analyses including age * diagnosis and sex * diagnosis interaction terms, in separate models. The models were as follows: AI = diagnosis + age + sex + age * diagnosis + random (= data set), and AI = diagnosis + age + sex + sex * diagnosis + random (= data set).

In addition, we separated the data into two subsets by age, i.e., children < 18 years and adults ≥ 18 years (using the same criteria as van Rooij et al. 2018), or else by sex (males, females). Models were then fitted separately for each AI within each subset, i.e., within each age subset AI = diagnosis + sex + random (= data set), and within each sex subset AI = diagnosis + age + random (= data set).

Analysis of IQ

For each AI that showed a significant effect of diagnosis in the primary analysis, we carried out exploratory analyses of IQ in cases and controls separately, whereby IQ (as a continuous variable) was considered as a predictor variable for the AI, so that AI = IQ + age + sex + random (= data set). This was done to understand whether individual differences in asymmetry might relate to IQ, and whether such relations might be specific to ASD.

ADOS severity score

For each AI that showed a significant effect of diagnosis in the primary analysis, a within-case-only analysis was performed incorporating symptom severity based on ADOS score as a predictor variable for AI: AI = ADOS + age + sex + random (= data set). This was to understand whether the observed asymmetry changes in cases were dependent on ASD severity. ADOS scores were first adjusted using log10 transformation to reduce skewing.

Analysis of medication use

Data on medication use (i.e., current use of psychiatric treatment drugs prescribed for ASD or comorbid psychiatric conditions) were available for 832 individuals with ASD, of whom 214 were categorized as medication users. For each AI that showed a significant effect of diagnosis in the primary analysis, a linear mixed model analysis was performed within cases only: AI = medication + age + sex + random (= data set). ‘Medication’ was coded as a binary variable (0 = no medication, 1 = medication).

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.