Sleep problems/disorders (SD) are common in autism spectrum disorder (ASD), and up to 80% of individuals diagnosed with ASD will experience an SD within their lifetime (Baker and Richdale 2017; Richdale and Schreck 2009). These SDs regularly include increased sleep onset latency, frequent and prolonged night awakenings (Allik et al. 2008; Krakowiak et al. 2008), insomnia (Cortesi et al. 2010; Richdale and Baglin 2015), and/or early morning rise times (Richdale and Baglin 2015). Although SD and ASD are highly comorbid, relatively little is known about the mechanistic intersections of these two disorder classes. The present study aims to address this gap using a gene enrichment analysis to determine whether genes associated with SD risk and genes associated with ASD risk overlap to a higher degree than expected by chance. This analysis will identify a set of specific risk genes that contribute to both phenotypes, which may help to identify potential mechanistic pathways linking SD and ASD.

In studies of sleep and ASD, genetic mechanisms are often discussed as potentially important (Souders et al. 2017); however, studies directly assessing genes and sleep in ASD are still relatively uncommon. Most often, researchers have assessed genes in the circadian system including CLOCK-related genes (Nicholas et al. 2007), and genes in the melatonin pathway (Jonsson et al. 2010; Veatch et al. 2015). Notably, the steady stream of genetic studies informing our understanding of CLOCK-related or circadian sleep genes in ASD use very small sample sizes (e.g., Ns between 28 and 110) and are thus vastly underpowered to detect the small effect that a specific gene is expected to exert on a complex phenotype. In order to generate new biologically plausible mechanisms to guide future studies, we identified genes that have been empirically linked to both SD and ASD and subjected these genes to an enrichment analysis that uses summarized empirical evidence and accumulated bioinformatics knowledge to understand the downstream impacts of these risk genes.

Methods

Data

This study utilized data from a previously published autism gene database (http://db.cbi.pku.edu.cn/autismkb_v2; Xu et al. 2011) and a hand-curated sleep gene set.

Autism Gene Set

Based on a comprehensive literature review, Xu et al. (2011) created the Autism KB database (newest version, Autism KB 2.0), which includes candidate gene lists curated from several studies along with study details (e.g., clinical features, experimental design). Meta-data from each study regarding the methodological approach were also recorded and scored in order to generate evidence scores for each gene found by the study. Documentation containing a detailed outline of their search and scoring methodology is available online at http://db.cbi.pku.edu.cn/autismkb_v2/document.php. Autism KB 2.0 encompasses multiple evidence-based datasets. For the purposes of the current study, we used the ASD ‘Core Dataset’ (hereafter referred to as the ASD core gene set). The ASD core gene set includes 228 genes (see Supplemental Table 2), each with a total evidence score of 16 or higher (the higher the score, the greater the confidence that the gene is related to ASD). Total evidence scores of 16 or higher are generally considered pathology-linked or significantly associated (Xu et al. 2011). Information regarding database construction and analyses for the ASD core gene set can be found in the online manual for Autism KB 2.0 (http://db.cbi.pku.edu.cn/autismkb_v2/document.php) and in the supplemental online materials of Xu et al. (2011).
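The score-based selection described above can be sketched in a few lines. This is a minimal illustration and not the Autism KB pipeline itself: the function name `select_core_genes`, the dictionary layout, and the placeholder gene names and scores are assumptions made for the example. Only the cutoff of 16 comes from the text.

```python
# Hypothetical sketch: keep genes whose total evidence score meets the cutoff.
CORE_SCORE_CUTOFF = 16  # cutoff stated in the text


def select_core_genes(scores, cutoff=CORE_SCORE_CUTOFF):
    """Return the sorted gene symbols whose total evidence score is >= cutoff."""
    return sorted(gene for gene, score in scores.items() if score >= cutoff)


# Placeholder symbols and scores, NOT values taken from Autism KB 2.0.
toy_scores = {"GENE_A": 31, "GENE_B": 16, "GENE_C": 15, "GENE_D": 4}
core = select_core_genes(toy_scores)
print(core)
```

Applied to the real database export, a filter of this kind would be expected to return the 228-gene core set, provided the export carries the same total evidence scores.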

Sleep Gene Set

We hand-curated a set of genes likely related to sleep problem/disorder phenotypes (henceforth referred to as the sleep phenotype data set) from existing literature. The first step in this process involved identifying genes potentially related to sleep by searching (1) the Genome-Wide Association Study (GWAS) catalog (https://www.ebi.ac.uk/gwas/) using the keyword ‘sleep’, (2) National Center for Biotechnology Information (NCBI) Gene (https://www.ncbi.nlm.nih.gov/gene) using the keyword ‘sleep’, and (3) PubMed and Google Scholar using the search terms ‘sleep’ and ‘genetics’. Searches were conducted between June and September 2017. Inclusion in the final gene set required either a minimum of two peer-reviewed empirical findings (per gene) indicating an association between structural variations in that gene and sleep phenotypes (rather than the sleep phenotype influencing gene expression), or at least one peer-reviewed GWAS finding. The final sleep gene set contained 154 genes (for a complete list see Supplemental Table 1).
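The inclusion rule can be written as a simple predicate. The sketch below is illustrative only: the `SleepEvidence` record, its field names, and the placeholder genes are assumptions, and it treats GWAS findings and other structural-variation findings as separate counts, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SleepEvidence:
    """Hypothetical per-gene tally of peer-reviewed evidence."""
    gene: str
    structural_findings: int  # findings linking structural variation to a sleep phenotype
    gwas_findings: int        # peer-reviewed GWAS findings


def meets_inclusion(evidence):
    """Include a gene with >= 2 structural-variation findings or >= 1 GWAS finding."""
    return evidence.structural_findings >= 2 or evidence.gwas_findings >= 1


# Placeholder records, NOT entries from Supplemental Table 1.
candidates = [
    SleepEvidence("GENE_A", structural_findings=2, gwas_findings=0),
    SleepEvidence("GENE_B", structural_findings=0, gwas_findings=1),
    SleepEvidence("GENE_C", structural_findings=1, gwas_findings=0),
]
sleep_gene_set = [c.gene for c in candidates if meets_inclusion(c)]
print(sleep_gene_set)
```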

Analytic Strategy

Following Marceau and Abel (2018), our analytic strategy comprised two steps. First, we determined the number of genes appearing in both the sleep and ASD core gene sets and calculated whether the ASD gene set was enriched for sleep-related genes (i.e., whether more genes were in both gene sets than expected by chance). Specifically, we compared our final sleep gene set against the published ASD core gene set. We used the Nematode bioinformatics analysis tools and data (Lund, n.d.) to calculate the statistical significance of genetic overlap between the two gene sets. Consistent with current estimates of protein-coding genes in the human genome (Ezkurdia et al. 2014), we assumed a total of 19,000 genes when calculating the number of overlapping genes expected by chance.
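The first step, finding genes present in both sets, amounts to a set intersection once gene symbols are written consistently. The following is a minimal sketch under assumptions: the helper names and the placeholder symbols are hypothetical, and normalizing by trimming whitespace and upper-casing is one simple choice that does not resolve gene aliases.

```python
def normalize(symbols):
    """Trim whitespace and upper-case symbols so trivial formatting differences do not hide a match."""
    return {s.strip().upper() for s in symbols if s.strip()}


def overlapping_genes(set_a, set_b):
    """Return the sorted symbols that appear in both gene sets."""
    return sorted(normalize(set_a) & normalize(set_b))


# Placeholder symbols, NOT the actual sleep or ASD core gene sets.
toy_sleep = ["GENE_A", " gene_b", "GENE_C"]
toy_asd = ["GENE_B", "GENE_C ", "GENE_D"]
shared = overlapping_genes(toy_sleep, toy_asd)
print(shared)
```

In practice, mapping both lists to a common identifier scheme before intersecting would guard against aliases that simple string matching misses.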

Then, to further elucidate the biological function of the genes implicated in both sleep and ASD, we subjected the list of overlapping genes to an over-representation pathway analysis (see Marceau and Abel 2018). Over-representation pathway analysis is a systems biology tool that identifies whether a set of genes is over-represented in any particular biological pathway relative to chance (Kamburov et al. 2009, 2011). A pathway can be defined as a set of “functionally-related” genes (Jin et al. 2014, p. 211), which may be more informative regarding genetically influenced biological mechanisms related to both sleep and ASD than single genes. The over-representation analysis was conducted by uploading the overlapping genes into the online over-representation tool maintained by ConsensusPathDB (CPDB; http://cpdb.molgen.mpg.de/). We retained only pathway-based sets that included at least 25% of the genes on the overlap list, so that the selected pathways would be highly relevant to identifying potential new mechanisms linking ASD and SD.

Results

Enrichment Analyses

There were 15 genes shared between our sleep phenotype gene set and the ASD core gene set, listed in Table 1. The expected number of overlapping genes was (154 [sleep gene set size] × 228 [ASD core gene set size])/19,000 [total genes in genome] = 1.848. The representation factor was therefore 15 [identified genes]/1.848 [expected genes] = 8.12, p < 5.466e−10, indicating significantly more overlap than expected by chance.
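The arithmetic above can be reproduced with a short script. The expected count and representation factor follow directly from the set sizes given in the text. The upper-tail hypergeometric probability is included as one common way to attach a p value to an overlap of this kind; that the Lund tool performs exactly this calculation is an assumption, and the function names are hypothetical.

```python
from math import comb


def expected_overlap(n_a, n_b, n_genome):
    """Number of shared genes expected by chance for two random gene sets."""
    return n_a * n_b / n_genome


def representation_factor(observed, n_a, n_b, n_genome):
    """Observed overlap divided by the overlap expected by chance."""
    return observed / expected_overlap(n_a, n_b, n_genome)


def hypergeom_upper_tail(observed, n_a, n_b, n_genome):
    """P(X >= observed) when n_a genes are drawn at random from a genome
    of n_genome genes, n_b of which belong to the second set."""
    total = comb(n_genome, n_a)
    favorable = sum(
        comb(n_b, k) * comb(n_genome - n_b, n_a - k)
        for k in range(observed, min(n_a, n_b) + 1)
    )
    return favorable / total  # exact integers, converted to float only here


# Set sizes and observed overlap as reported in the text.
N_SLEEP, N_ASD, N_GENOME, OBSERVED = 154, 228, 19_000, 15

expected = expected_overlap(N_SLEEP, N_ASD, N_GENOME)
rf = representation_factor(OBSERVED, N_SLEEP, N_ASD, N_GENOME)
p_value = hypergeom_upper_tail(OBSERVED, N_SLEEP, N_ASD, N_GENOME)
print(f"expected = {expected:.3f}, representation factor = {rf:.2f}, p = {p_value:.3e}")
```

A representation factor above 1 indicates more overlap than chance would produce. Because the result depends on the assumed genome size, rerunning the script with alternative estimates of the number of protein-coding genes is a quick sensitivity check.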

Table 1 List of 15 overlapping genes including function and pathway(s)

Over-representation Pathway Analysis with Core ASD Gene Set

Using a threshold of 25% (a minimum of four genes from the overlap list), the over-representation analysis revealed four enriched pathways: Dopaminergic Synapse, Serotonergic Synapse, MECP2 and Associated Rett Syndrome, and Sudden Infant Death Syndrome Susceptibility Pathways. Table 2 describes each pathway. Table 3 summarizes the genes on the overlap list appearing in each pathway and the affiliated sleep problems/disorders or neurodevelopmental syndrome. Because the WikiPathways database (included in the CPDB search) is community-based without verifiable scientific oversight, we exclude MECP2 and Associated Rett Syndrome and Sudden Infant Death Syndrome Susceptibility Pathways (which were identified only via WikiPathways) from our discussion, as their veracity is less certain.
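The 25% threshold can be expressed as a simple filter over pathway membership. This sketch covers only that filtering step and not the over-representation statistics computed by CPDB. The function names, the dictionary layout, and the placeholder pathways and genes are assumptions for illustration.

```python
from math import ceil


def min_genes_required(n_overlap, fraction=0.25):
    """Smallest whole number of genes that is at least `fraction` of the overlap list."""
    return ceil(fraction * n_overlap)


def select_pathways(pathway_members, overlap_genes, fraction=0.25):
    """Keep pathways containing at least `fraction` of the overlap genes.

    Returns a dict mapping each retained pathway to its sorted shared genes.
    """
    overlap = set(overlap_genes)
    needed = min_genes_required(len(overlap), fraction)
    retained = {}
    for name, members in pathway_members.items():
        shared = overlap & set(members)
        if len(shared) >= needed:
            retained[name] = sorted(shared)
    return retained


# Placeholder data, NOT the pathways or genes reported in Tables 2 and 3.
toy_overlap = [f"GENE_{i}" for i in range(1, 9)]  # 8 genes, so 2 are required
toy_pathways = {
    "Pathway X": ["GENE_1", "GENE_2", "GENE_99"],
    "Pathway Y": ["GENE_3", "GENE_98"],
}
kept = select_pathways(toy_pathways, toy_overlap)
print(min_genes_required(15), kept)
```

For a 15-gene overlap list, 25% corresponds to 3.75 genes, which rounds up to the four-gene minimum stated above.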

Table 2 Enriched pathways for the 15-gene overlap set
Table 3 Candidate genes in each of the primary biological pathways

Discussion

This study demonstrated significant overlap in genes related to both SD and ASD risk, highlighting potential genetic and downstream biological mechanisms that may be involved in increased rates of comorbid SD in ASD samples. Our findings serve as a first, preliminary step in understanding the genetic overlap in ASD and SD, which we hope will stimulate interest in new empirical studies on gene mutations and expression in individuals with ASD and comorbid SD. The field is well poised to conduct such studies given our expanding access to the largest autism genome databanks in history (e.g., MSSNG, SPARK, Simons Simplex Collection).

Unexpectedly, the overlapping genes identified in this study did not include those most commonly studied in relation to sleep in ASD samples (e.g., CLOCK). This finding does not imply that CLOCK-related genes do not influence sleep in ASD, but rather that they may not be part of a shared SD–ASD mechanistic pathway. We did identify one overlapping gene that supports circadian entrainment: CACNA1C. CACNA1C aids in the regulation of calcium channels and has a circadian expression pattern. In mouse models, CACNA1C is associated with deficits in phase shifting (Schmutz et al. 2014) and reduced lower spectral power and impaired REM during recovery sleep (Kumar et al. 2015). In individuals with ASD, some studies have noted circadian-based sleep difficulties and atypical EEG patterns during REM and non-REM sleep phases (Godbout et al. 1998, 2000; Limoges et al. 2005). Notably, these differences are not consistent across the ASD literature, as some studies reported no remarkable EEG differences from comparison groups (e.g., Tani et al. 2003, 2004). Further explorations of CACNA1C and ASD could help to resolve these inconsistencies.

One overlapping gene, ASMT, is involved in melatonin synthesis. Dysregulation of melatonin synthesis is well documented in individuals with ASD (Tordjman et al. 2013; Veatch et al. 2015), and the findings from the present study implicate one of the several genes involved in melatonin synthesis. It is unclear what mechanistic role melatonin synthesis may have in ASD phenotypes, but it is likely a contributing factor in the high rates of comorbidity between SD and ASD (Veatch et al. 2015).

Neurotransmitter dysregulation is commonly documented in ASD, with noted differences in the dopamine, serotonin, glutamate, and GABA systems (Horder et al. 2018; Muller et al. 2016; Paval 2017). Within this exploratory study, dopamine and serotonin systems were implicated in our pathway analysis, including genes that aid in regulating calcium channels (CACNA1C) and sodium channels (SCN1A), mitochondrial enzyme encoding (MAOA), dopamine transmission (SLC6A3), and serotonin receptors and transmission (HTR2A, SLC6A4). Future SD and ASD research can build on this study by giving attention to the dopamine and serotonin systems and how they may impact sleep and ASD phenotypes. From a treatment perspective, pharmacological therapies that promote dopamine or serotonin regulation may have the potential to improve both sleep and ASD-related behaviors. Indeed, several commonly used medications target these systems (e.g., selective serotonin reuptake inhibitors), but additional studies are needed to determine whether upregulation, downregulation, or adjusting interacting mechanisms will best support balance in these systems (e.g., Volkow et al. 2012).

It is important to note that the pathway analysis employed here can be prone to identifying large-scale pathways involved in fundamental biological processes, which therefore may not be specific to ASD and SD. Thus, our analysis may simply reflect that brain-based pathways are necessary for all behavior and higher-order cognitive function. The high percentage of syndromic or neurological conditions associated with ASD in our gene list (10 of the 15 genes) also points to a more general brain-based dysregulation, which may manifest as behavioral difficulties during the day and at night. Although the aim of this study was to narrow potential future targets for SD and ASD research, the identified genes may only influence both daytime (ASD behaviors) and nighttime (sleep) behaviors when large-scale brain-based dysregulation is present.

Additional limitations should be noted. First, the statistical method for assessing gene overlap does not account for gene size, which could lead to increased type I error. Second, GWAS findings are difficult to replicate, and as such the identified genes may include some that are not robustly linked to sleep. Third, many of the large GWAS in the sleep phenotype gene set measured sleep problems using self-report methods. Although polysomnography and actigraphy may provide less biased sleep estimates than sleep diaries or self-report questionnaires, these gold-standard approaches are also costly and difficult to implement on a large scale. Finally, our sleep phenotype data set includes genes implicated in a broad range of SDs, which are each biologically distinct, and some of these SDs are more common in individuals with ASD than others. Thus, future research may benefit from a more focused gene set that is limited to genes related to the SDs with the highest rates of comorbidity in ASD and to genes that are more heavily replicated in the peer-reviewed literature (e.g., requiring two or more GWAS findings and excluding genes with evidence only from candidate gene studies with small samples).

Summary and Future Directions

Our overlapping gene set may serve as a preliminary stepping-stone for new studies investigating genetic components of SD in ASD risk (including new gene sets that may be more specific to ASD), or may be useful in exploring existing autism genome databases to answer critical questions about why children with ASD experience increased rates of SD. Future research exploring (1) potential mutations in genes empirically found to influence SD and ASD and (2) related neural underpinnings of SD in ASD may prove valuable for understanding mechanistic links between SD and ASD. This study should be replicated using other reputable ASD gene databases, such as SFARI, and should be repeated as more reliable gene discovery is completed in both disorders (particularly sleep).