Abstract

Objective:

Autism, schizophrenia, and other clinically distinct neurodevelopmental psychiatric disorders (NPDs) have shared genetic etiologies, including single-gene and multigenic copy number variants (CNVs). Because rare variants are primarily investigated in clinical cohorts, population-based estimates of their prevalence and penetrance are lacking. The authors determined the prevalence, penetrance, and NPD risk of pathogenic single-gene variants in a large health care system population.

Methods:

The authors analyzed linked genomic and electronic health record (EHR) data in a subset of 90,595 participants from Geisinger’s MyCode Community Health Initiative, known as the DiscovEHR cohort. Loss-of-function pathogenic variants in 94 high-confidence NPD genes were identified through exome sequencing, and NPD penetrance was calculated using preselected EHR diagnosis codes. NPD risk was estimated using a case-control comparison of DiscovEHR participants with and without NPD diagnoses. Results from single-gene variant analyses were also compared with those from 31 previously reported pathogenic NPD CNVs.

Results:

Pathogenic variants were identified in 0.34% of the DiscovEHR cohort and demonstrated a 34.3% penetrance for NPDs. Similar to CNVs, sequence variants collectively conferred a substantial risk for several NPD diagnoses, including autism, schizophrenia, and bipolar disorder. Significant NPD risk remained after participants with intellectual disability were excluded from the analysis, confirming the association with major psychiatric disorders in individuals without severe cognitive deficits.

Conclusions:

Collectively, rare single-gene variants and CNVs were found in >1% of individuals in a large health care system population and play an important contributory role in mental health disorders. Diagnostic genetic testing for pathogenic variants among symptomatic individuals with NPDs could improve clinical outcomes through early intervention and anticipatory therapeutic support.

Neurodevelopmental psychiatric disorders (NPDs) are leading contributors to disability in the United States, collectively affecting >20% of the population (1–3). The origins of these conditions are highly complex and include primary genomic underpinnings modulated by environmental and experiential influences throughout the lifespan. Clinically distinct NPDs, including schizophrenia, autism spectrum disorder (ASD), bipolar disorder, and intellectual disability/developmental delay, have shared genetic etiologies, including rare multigenic copy number variants (CNVs) and single-gene sequence variants that confer large deleterious impacts on brain function. The study of rare genetic etiologies has significance for all NPD research, as many distinct rare disorders stem from known disruptions in shared neurobiological pathways affecting a broad range of cognitive and behavioral domains. Genomic research has the potential to transform the overall understanding of NPD prevention and treatment, consistent with the current strategic plan of the National Institute of Mental Health (4).

Exome sequencing with CNV calling is an important clinical test currently used to detect rare genomic variants, particularly in pediatric patients with intellectual disability/developmental delay and ASD (4). In these populations, disease-causing variants have been identified in ∼30% of patients and collectively represent the most common known etiologies of NPDs. Exome sequencing has also been used as a research tool for gene discovery in cohorts of individuals ascertained for specific NPDs, such as ASD (5–7), schizophrenia (8, 9), bipolar disorder (10, 11), intellectual disability/developmental delay (12–14), attention deficit hyperactivity disorder (ADHD) (15), and epilepsy (16, 17), leading to the identification of a large number of candidate genes. There are also ongoing efforts to curate and catalog de novo rare variants in NPD-associated genes reported in the literature to find supporting evidence of a causal relationship (18–21). In our Developmental Brain Disorder Gene Database (https://dbd.geisingeradmi.org), we curate loss-of-function (LOF) and missense variants among individuals with intellectual disability/developmental delay, ASD, ADHD, schizophrenia, bipolar disorder, and/or epilepsy from published case-control and clinical case studies of DNA-sequenced probands (18). Curated genes are ranked into evidence-based tiers according to the number of cases with de novo pathogenic LOF variants. Genes that meet the criteria for disease causality through our curation process are publicly available in the Developmental Brain Disorder Gene Database (18). Beyond these curation efforts, however, many high-confidence NPD genes in our database (18) and others (19–21) remain uncharacterized at the population level. For example, pathogenic variants in the CHD8 gene are among the most commonly reported findings in genetic studies of ASD (22), yet there have been few studies on the prevalence and clinical presentation of individuals with these variants in the general population.

DNA sequencing of large population-based cohorts is beginning to reshape our understanding of the prevalence and phenotypic effects of rare genetic disorders. A genome-first approach to ascertainment, based on an underlying genetic change rather than a phenotypic presentation, can identify unselected variant-positive patient groups and provide an important counterbalance to clinic-based ascertainment. While several large genome-first studies have reported the prevalence and phenotypic outcomes of pathogenic CNVs in unselected populations (23–26), few have investigated single-gene sequence variants. Using a genome-first approach, Ganna et al. (27) analyzed 3,172 LOF-intolerant genes, including those without a prior association with NPDs, in adults from the general population. Rare variants in LOF-intolerant genes were collectively prevalent (11%) and increased the risk of NPDs and other clinical phenotypes. Recently, Rolland et al. (28) reported a 0.61% prevalence of high-confidence rare LOF variants in ASD-associated genes (N=185) in UK Biobank population-based samples. Both studies demonstrated that rare genomic sequence variants are an underappreciated risk factor for NPDs in the general population.

Previously, we investigated 31 recurrent pathogenic CNVs among 90,595 research participants with paired exome and electronic health record (EHR) data in a primarily adult health care system population referred to as the DiscovEHR cohort (29, 30). We identified 708 individuals (0.8%) with a pathogenic CNV and found that they had increased rates of NPDs and congenital anomalies as compared with variant-negative participants (25). Here, we expand our population-based investigation of rare genomic variation and its role in mental health by examining NPD gene sequence variants in the DiscovEHR cohort. Specifically, to examine the impact on NPD phenotypes, we determined the prevalence and penetrance of pathogenic LOF variants in 94 high-confidence NPD genes. We also performed a case-control study to estimate the effects of these variants on NPD risk.

Methods

Participants

The study cohort was identified from Geisinger’s MyCode Community Health Initiative, a research project with a biorepository operating in a Clinical Laboratory Improvement Amendments–certified laboratory environment with >295,000 consented patient-participants unselected for age, gender, or clinical diagnosis (31). We included in this study a subset of MyCode patient-participants, known as the DiscovEHR cohort, with exome sequences linked to EHR data. Exome sequencing was performed in collaboration with the Regeneron Genetics Center for 92,455 DiscovEHR participants, as described previously (25, 29, 30).

Exome Sequencing and Variant Calling

Single-nucleotide variant (SNV), small insertion/deletion (indel), and CNV calling from exome sequencing data have been described previously (25, 29). A detailed description of variant-level quality control (QC) and methods for identification of pathogenic LOF (frameshift, nonsense, essential [+/− 1,2] splice site, stop-loss, and start-loss) variants is provided in the Methods section of the online supplement. The analysis was performed on a subset of DiscovEHR samples (N=90,595) that passed SNV and CNV QC filtering (Figure 1). Samples were sequenced on the VCRome and xGen platforms as described previously (29).

FIGURE 1. Flow of patient-participants in a study of rare pathogenic variants in neurodevelopmental psychiatric genes in a health care system populationᵃ,ᵇ

ᵃ CNV=pathogenic copy number variant; EHR=electronic health record; LOF=pathogenic loss-of-function; NPD=neurodevelopmental psychiatric disorder; QC=quality control; SNV=single-nucleotide variant.

ᵇ Both a variant in an NPD gene and a recurrent CNV were identified in two participants.

Evidence-Based Approach to NPD Gene Selection

Ninety-four high-confidence autosomal dominant NPD genes were selected from the Developmental Brain Disorder Gene Database (18). NPD genes on the X chromosome were excluded from this study. A detailed gene selection workflow and annotations across multiple databases are included in the online supplement (see the Methods section, Figures S1 and S2, and Table S1). The comparison set of recurrent pathogenic CNVs in this study includes the same 31 CNVs investigated previously (25).

LOF Variant Frequency in NPD Genes From the DiscovEHR and gnomAD Non-Finnish European (NFE) Populations

To ensure the robustness of all LOF variants (pathogenic and variants of uncertain significance), we compared their frequency in 94 NPD genes in DiscovEHR with that in the gnomAD NFE reference population (32). Methods for the identification of LOF variants in the gnomAD data set are described in the online supplement. The comparison of LOF variant frequencies in the DiscovEHR and gnomAD NFE populations is shown in Figure S3 in the online supplement.

NPD Diagnosis Extraction

NPD clinical diagnoses were extracted from participants’ EHRs using ICD-9 and ICD-10 codes (see Table S2 in the online supplement). Relevant NPD diagnosis codes used here were collected in December 2020 and have been previously described in the DiscovEHR cohort (25). To more fully capture NPD phenotypes beyond those documented in ICD codes, we performed a manual chart review on a subset of 27 participants with variants in genes expected to be highly penetrant for NPDs but lacking phenotypic evidence from ICD codes (see the Methods section in the online supplement).

Analysis of Prevalence and Penetrance of Pathogenic Variants

The prevalence of rare variants in high-confidence NPD genes was calculated as the percentage of individuals in the study cohort (N=90,595) with an LOF variant that met the American College of Medical Genetics and Genomics–Association for Molecular Pathology (ACMG-AMP) criteria (33–35) for pathogenic or likely pathogenic classification in one of the 94 genes of interest. In this report, our use of the term “variant” implies a pathogenic or likely pathogenic classification. Penetrance of NPDs was calculated as the percentage of individuals with an NPD diagnosis among the total variant-positive subset. All calculations of penetrance excluded congenital anomalies and two common NPD diagnostic categories (depressive disorders and anxiety disorders) unless otherwise noted.
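
The two definitions above reduce to simple proportions. The following is a minimal Python sketch using counts reported elsewhere in this article (312 variant-positive participants among 90,595, of whom 107 had an NPD diagnosis). The function names are illustrative assumptions and are not part of the study's analysis code.

```python
def prevalence_pct(variant_positive: int, cohort_size: int) -> float:
    """Percentage of the cohort carrying a pathogenic or likely pathogenic variant."""
    return 100.0 * variant_positive / cohort_size


def penetrance_pct(variant_positive_with_npd: int, variant_positive: int) -> float:
    """Percentage of variant-positive individuals who have an NPD diagnosis."""
    return 100.0 * variant_positive_with_npd / variant_positive


# Counts as reported in the abstract and Results of this article.
print(f"prevalence: {prevalence_pct(312, 90_595):.2f}%")
print(f"penetrance: {penetrance_pct(107, 312):.1f}%")
```

Note that the denominator differs between the two measures: prevalence is taken over the whole cohort, whereas penetrance is taken over variant-positive individuals only.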

Statistical Analysis

Significance tests of penetrance were performed using chi-square tests, and uncorrected p values are reported. Associations between aggregated single-gene variants and NPD diagnosis were assessed by case-control analysis, comparing the relative frequencies of these variants. Cases were defined as participants who had relevant ICD-9 or ICD-10 diagnostic codes in their EHR for each disorder, whereas controls did not have the diagnostic codes for the specified disorder. We also performed separate analyses (see Tables S18 and S19 in the online supplement) using a healthy subgroup that included individuals without ICD-9 or ICD-10 codes for any of the targeted NPD diagnoses. The statistical significance of the associations between variants and NPD diagnosis was calculated using a Firth logistic regression model adjusting for age, sex, and the first four principal components of ancestry. Odds ratios, corresponding 95% confidence intervals, and p values were derived from the regression model. The reported p values for odds ratio estimates are Bonferroni corrected for the 14 tests of NPD outcomes (see Table S2A in the online supplement) included in the study. We applied the same modeling approach to test for associations between the set of recurrent CNVs and NPD outcomes. We restricted these analyses to a European subset of DiscovEHR (N=81,727) to reduce the impact of population stratification on the results. This European subset was identified by principal component analysis and clustering with the 1000 Genomes reference population using the same protocol applied to the UK Biobank (36). We separately investigated the effects of pathogenic variants in Black/African American and Hispanic/Latino minority populations and conducted sensitivity analyses to account for familial relatedness and age, as described in the Methods section in the online supplement. All statistical analyses were performed using R, version 4.0.1.
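
The Bonferroni step described above amounts to multiplying each uncorrected p value by the number of tests. The sketch below illustrates it in Python with three uncorrected p values from Table 1. It is not the authors' R code, and capping the result at 1 is a standard convention assumed here (the article reports such values as >0.99).

```python
N_TESTS = 14  # number of NPD outcomes tested, per the Statistical Analysis section


def bonferroni(p_uncorrected: float, n_tests: int = N_TESTS) -> float:
    """Bonferroni-corrected p value, capped at 1."""
    return min(1.0, p_uncorrected * n_tests)


# Uncorrected p values taken from Table 1.
uncorrected = {
    "Intellectual disability": 1.36e-9,
    "ASD": 6.85e-3,
    "Cerebral palsy": 0.433,
}
for disorder, p in uncorrected.items():
    print(f"{disorder}: corrected p = {bonferroni(p):.3g}")
```

The ASD row shows why the correction matters: its uncorrected p value is below 0.05, but the corrected value is not.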

Results

Prevalence of NPD-Related Single-Gene Variants

Overall, the frequency of LOF variants across genes in DiscovEHR was comparable to that in the gnomAD NFE population, demonstrating the robustness of the variants included in our study (see Figures S3A and S3B in the online supplement). The demographic characteristics of 90,595 DiscovEHR participants were also consistent with previous studies (see Table S3 in the online supplement).

A total of 312 individuals (0.34%) in DiscovEHR had a pathogenic variant, including 233 unique variants, involving 61 of the 94 high-confidence NPD genes selected for analysis (see Tables S4 and S5 in the online supplement). The genes with the highest frequency of variants were ANK2 (N=26; 0.029%), ASXL3 (N=21; 0.023%), SHANK2 (N=18; 0.020%), TRIO (N=17; 0.019%), WDFY3 (N=16; 0.018%), DSCAM (N=15; 0.017%), and GIGYF1 (N=15; 0.017%). The high variant frequency in ASXL3 was due to a single deletion CNV observed in 16 individuals, at least nine of whom are first- or second-degree relatives. Table S4 in the online supplement shows variant frequency in an unrelated population subset. We confirmed that pathogenic variants in the highest-frequency genes were distributed across samples from both the VCRome and xGen sets (see Figure S4 in the online supplement). We found that two individuals had both a recurrent CNV and a single-gene variant: one with a 1q21.1 deletion and a WDFY3 variant, and the other with a 16p11.2 duplication and an NSD2 variant.

As shown in Figure 2, the prevalence of variants in the 94 NPD genes (0.34%) was less than that previously reported for the 31 recurrent NPD CNVs in DiscovEHR (0.8%) and CNVs reported in other population-based cohorts (23–26). Overall, when we combined the evaluated single-gene variants (312/90,595) with CNVs (708/90,595) from our previous study, counting the two participants who carried both variant types only once, the collective prevalence of all rare NPD genomic variants in the DiscovEHR cohort was 1.1% (1,018/90,595).
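
The combined count follows from inclusion-exclusion: the two participants who carried both a recurrent CNV and a single-gene variant are subtracted once so they are not double counted. A short Python sketch of that arithmetic, with illustrative variable names:

```python
COHORT = 90_595
single_gene = 312  # participants with a pathogenic single-gene variant
cnv = 708          # participants with a recurrent pathogenic CNV
both = 2           # participants carrying both variant types

# Inclusion-exclusion: subtract the overlap so each person is counted once.
carriers = single_gene + cnv - both
prevalence_pct = 100.0 * carriers / COHORT
one_in = COHORT / carriers

print(carriers, f"{prevalence_pct:.1f}%", f"about 1 in {one_in:.0f}")
```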

FIGURE 2. Prevalence of pathogenic variants in 94 NPD genes and 31 NPD-associated recurrent CNVs in DiscovEHR and other population-based cohortsᵃ

ᵃ The prevalence of NPD CNVs in different population-based cohorts has been reported previously (23–26). CNV=copy number variant; NPD=neurodevelopmental psychiatric disorder; EGCUT=Estonian Genome Center of the University of Tartu.

Estimated Penetrance of NPD Gene Variants

We estimated the penetrance of pathogenic variants based on ICD codes for NPDs and congenital anomalies (see Table S2 in the online supplement). We found that 34.3% (107/312) of variant-positive individuals had an NPD diagnosis (see Table S6 in the online supplement), compared with 14.6% (13,105/89,577) of those without a variant (p<0.0001). Among the genes for which more than five individuals had a variant, eight (TRIO, NAA15, ASH1L, ZNF292, IRF2BPL, CLTC, GIGYF1, and POGZ) were highly penetrant, as evidenced by 40%–53% of those with variants having an NPD diagnosis (see Table S6 in the online supplement). When we broadened NPD criteria to include ICD codes for depression and anxiety, 68.6% (214/312) of variant-positive individuals had an NPD diagnosis, compared with 57.4% (51,397/89,577) who were variant negative (p=8.11×10⁻⁵). This difference was attenuated when these two common psychiatric disorders were included because the rate of these phenotypes is high in the studied population. The presence of a congenital anomaly in one of five categories (CNS, cardiac, renal/urinary, genital, cleft lip/palate) was observed in 10.9% (34/312) of the variant-positive group (see Table S7 in the online supplement), compared with 7.9% (7,067/89,577) of those without a variant (p=0.06). Overall, 71.2% (222/312) of variant-positive individuals had NPDs, including depression and anxiety, and/or a congenital anomaly, compared with 60.2% (53,899/89,577) of variant-negative participants (p<0.0001).
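
The penetrance comparisons above are tests on 2×2 tables (variant status by diagnosis status). The following Python sketch shows a Pearson chi-square test for the first comparison, using only the standard library. It is an illustration under stated assumptions rather than the authors' R code: the function name is hypothetical, and because the text does not say whether a continuity correction was applied, the correction is exposed as an optional parameter.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int, yates: bool = False):
    """Pearson chi-square test (1 df) for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value). With yates=True, the continuity
    correction is applied.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)
    stat = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


# Rows: variant-positive, variant-negative; columns: NPD diagnosis, no NPD diagnosis.
pos_npd, pos_total = 107, 312
neg_npd, neg_total = 13_105, 89_577
stat, p = chi_square_2x2(pos_npd, pos_total - pos_npd, neg_npd, neg_total - neg_npd)
print(f"chi-square = {stat:.1f}, p = {p:.2e}")
```

With or without the correction, this comparison yields a p value far below 0.0001, in line with the reported result. For comparisons closer to the significance threshold, the choice of correction can change the reported p value noticeably.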

Several individuals expected to have an NPD phenotype due to a variant in a highly penetrant gene did not have any relevant ICD codes. To further explore this discrepancy, we performed manual chart reviews on a subset of participants (N=27) with selected pathogenic variants known to be highly penetrant. In addition to 10 individuals with an NPD diagnosis recorded with ICD codes, four others had clinical NPD diagnoses documented only in unstructured text (e.g., clinicians’ notes) and not with an ICD code (see Table S8 and the Methods section in the online supplement). The percentage of individuals with an NPD diagnosis increased from 37.0% (10/27) when only ICD codes were used to 51.9% (14/27) on manual chart review, indicating that analysis of ICD codes alone underestimates the rate of NPD diagnosis.

Associations Between Rare Genomic Variants and NPD Diagnosis

To examine the effects of pathogenic variants in high-confidence NPD genes, we estimated the risk of an NPD diagnosis in variant-positive DiscovEHR participants. The presence of a variant significantly increased the risk for NPD diagnoses in the EHR. As shown in Table 1, among the specific NPDs tested, the highest risk was observed for intellectual disability (odds ratio=8.73, 95% CI=4.83, 14.72, p=1.91×10⁻⁸). We also observed increased risk for diagnosis of other neurodevelopmental disorders (odds ratio=5.89, 95% CI=2.68, 11.86, p=6.41×10⁻⁴), specific learning disorder (odds ratio=5.52, 95% CI=2.74, 10.11, p=2.46×10⁻⁴), communication disorders (odds ratio=3.47, 95% CI=1.86, 5.97, p=3.92×10⁻³), epilepsy (odds ratio=2.39, 95% CI=1.56, 3.51, p=2.11×10⁻³), schizophrenia and other psychotic disorders (odds ratio=2.88, 95% CI=1.75, 4.45, p=1.47×10⁻³), bipolar disorder (odds ratio=2.32, 95% CI=1.54, 3.39, p=2.04×10⁻³), ADHD (odds ratio=2.89, 95% CI=1.83, 4.40, p=2.57×10⁻⁴), and depressive disorder (odds ratio=1.47, 95% CI=1.15, 1.87, p=2.58×10⁻²). An increased risk for ASD was observed (odds ratio=4.56, 95% CI=1.58, 11.35, p=0.10), although it was not statistically significant after adjusting for multiple testing. Furthermore, the odds ratios were consistent with higher risk for several other phenotypes, including motor disorder, obsessive-compulsive disorder, and cerebral palsy, but they did not reach statistical significance. When pathogenic variants in NPD genes and recurrent CNVs were compared, both variant types demonstrated consistent risk patterns across disorders (Figure 3).

TABLE 1. Association between single-gene pathogenic variants and NPD diagnosesᵃ

| Disorder | Odds Ratio | 95% CI | Corrected p | Uncorrected p | Cases With a Variant (N) | Total Cases (N) | Controls With a Variant (N) | Total Controls (N) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Intellectual disability | 8.73 | 4.83, 14.72 | 1.91×10⁻⁸ | 1.36×10⁻⁹ | 15 | 423 | 260 | 81,304 |
| ASD | 4.56 | 1.58, 11.35 | 0.10 | 6.85×10⁻³ | 6 | 229 | 269 | 81,498 |
| ADHD | 2.89 | 1.83, 4.40 | 2.57×10⁻⁴ | 1.84×10⁻⁵ | 27 | 2,397 | 248 | 79,330 |
| Epilepsy | 2.39 | 1.56, 3.51 | 2.11×10⁻³ | 1.51×10⁻⁴ | 26 | 3,311 | 249 | 78,416 |
| Schizophrenia and related disorders | 2.88 | 1.75, 4.45 | 1.47×10⁻³ | 1.05×10⁻⁴ | 19 | 2,234 | 256 | 79,493 |
| Bipolar disorder | 2.32 | 1.54, 3.39 | 2.04×10⁻³ | 1.45×10⁻⁴ | 29 | 3,636 | 246 | 78,091 |
| Communication disorder | 3.47 | 1.86, 5.97 | 3.92×10⁻³ | 2.80×10⁻⁴ | 13 | 901 | 262 | 80,826 |
| Specific learning disorder | 5.52 | 2.74, 10.11 | 2.46×10⁻⁴ | 1.76×10⁻⁵ | 11 | 447 | 264 | 81,280 |
| Cerebral palsy | 2.06 | 0.23, 7.53 | >0.99 | 0.433 | 1 | 161 | 274 | 81,566 |
| OCD | 1.47 | 0.55, 3.11 | >0.99 | 0.406 | 5 | 966 | 270 | 80,761 |
| Other neurodevelopmental disorder | 5.89 | 2.68, 11.86 | 6.41×10⁻⁴ | 4.58×10⁻⁵ | 11 | 406 | 264 | 81,321 |
| Motor disorder | 2.00 | 0.93, 3.73 | >0.99 | 0.074 | 8 | 1,196 | 267 | 80,531 |
| Anxiety | 1.14 | 0.89, 1.46 | >0.99 | 0.286 | 126 | 34,611 | 149 | 47,116 |
| Depressive disorder | 1.47 | 1.15, 1.87 | 2.58×10⁻² | 1.84×10⁻³ | 133 | 32,057 | 142 | 49,670 |

ᵃ Both Bonferroni-corrected and uncorrected p values are presented. ADHD=attention deficit hyperactivity disorder; ASD=autism spectrum disorder; OCD=obsessive-compulsive disorder.

FIGURE 3. Association between pathogenic variants or recurrent CNVs and NPD diagnosesᵃ

ᵃ ADHD=attention deficit hyperactivity disorder; ASD=autism spectrum disorder; CNV=copy number variant; NPD=neurodevelopmental psychiatric disorder; OCD=obsessive-compulsive disorder. Associations between pathogenic variants in NPD genes and each disorder are statistically significant (Bonferroni-corrected p values <0.05), except for ASD, cerebral palsy, OCD, motor disorder, and anxiety. Pathogenic NPD CNVs’ associations with each disorder are statistically significant, except for ADHD, schizophrenia and related disorders, bipolar disorder, cerebral palsy, OCD, anxiety, and depressive disorder. Error bars indicate 95% confidence intervals. The p values for association between NPD genes or NPD CNVs and each disorder are listed in Table 1 and in Table S9 in the online supplement, respectively.

We next compared the penetrance of single-gene variants with that of CNVs and found no significant difference for overall NPDs (penetrance, 34.3% [107/312] vs. 30.1% [213/708]; p=0.18) or for the 12 individual diagnoses tested. Among NPD phenotypes, intellectual disability had the strongest association for both CNVs (odds ratio=8.77, 95% CI=6.05, 12.38, p<0.0001) and single-gene variants (odds ratio=8.73, 95% CI=4.83, 14.72, p<0.0001) (see Table 1 and Table S9 in the online supplement). While the risk estimates for CNVs and single-gene variants were consistent across disorders, after adjusting for multiple testing, only CNVs showed a significant risk for motor disorder and ASD. Whereas the risk of single-gene variants for ADHD, schizophrenia, and bipolar disorder was significant, the risk of CNVs was not, after controlling for multiple testing (see Table 1 and Table S9 in the online supplement).
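
The odds ratios reported here come from Firth logistic regression adjusted for age, sex, and ancestry, and they cannot be reproduced from published counts alone. As a rough cross-check only, an unadjusted odds ratio with a Woolf (log-scale Wald) confidence interval can be computed from the 2×2 counts in Table 1. The Python sketch below does this for the intellectual disability row. It assumes that the "Total Cases" and "Total Controls" columns include the variant carriers, and the function name is hypothetical. Because it omits the covariates and the Firth penalty, its output differs from the adjusted estimate in Table 1.

```python
import math


def crude_odds_ratio(cases_var, cases_total, controls_var, controls_total, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-scale Wald) 95% CI."""
    a, b = cases_var, cases_total - cases_var            # cases: with / without variant
    c, d = controls_var, controls_total - controls_var   # controls: with / without variant
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(odds_ratio) - z * se_log)
    hi = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lo, hi


# Intellectual disability row of Table 1 (assumption: totals include carriers).
or_, lo, hi = crude_odds_ratio(15, 423, 260, 81_304)
print(f"crude OR = {or_:.2f} (95% CI {lo:.2f}, {hi:.2f})")
```

The gap between a crude estimate of this kind and the adjusted value illustrates why the regression model, rather than raw counts, is the basis for the reported risks.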

Sensitivity Analyses

Odds ratio estimates from samples stratified by exome capture platform (VCRome or xGen) were consistent with estimates from the combined samples (see Tables S10 and S11 in the online supplement). The association between pathogenic variants and NPD diagnoses was also tested separately in unrelated and adult subsets of the population, and the odds ratios across NPD outcomes did not change significantly (see Tables S12 and S13 in the online supplement). Except for ASD, the association between variant status and NPD outcomes remained statistically significant and was not substantially attenuated when individuals with intellectual disability, the diagnostic outcome with the largest odds ratio, were excluded from the analysis (see Table S14 in the online supplement).

We also repeated the association analysis in the full cohort, including European and non-European minority populations, to increase the sample size. The results in the full cohort were consistent with those of the European subset (see Table S15 in the online supplement). One notable exception was that the association between variants and motor disorder reached significance in the full cohort, whereas it was nominally significant when only European samples were analyzed. We also performed association analysis separately in EHR-identified Black/African American or Hispanic/Latino minority populations. However, sample sizes in these groups were too small for reliable tests of association (see Tables S16 and S17 in the online supplement).

In addition, we separately analyzed NPD diagnosis risk conferred by pathogenic variants based on comparisons with a healthy control cohort of individuals who were not diagnosed with any of the 14 relevant NPDs (see Tables S18 and S19 in the online supplement). Because the controls for each test excluded those with known NPDs, the odds ratios in this separate association analysis were higher across disorders, representing the upper-bound risk estimates in this study.

Discussion

In this population-based study of a primarily adult health care cohort, we expanded on an earlier investigation of CNVs in DiscovEHR to investigate the prevalence and penetrance of rare pathogenic variants in 94 high-confidence NPD genes. We also established risk estimates for a range of NPD diagnoses based on the presence of a pathogenic NPD gene variant in participants, all of whom received care in a large health care system but were not specifically ascertained through clinical phenotypes.

Our exome analysis identified an NPD gene variant in 312 of 90,595 (0.34%) study participants, adding to our previous finding of recurrent NPD CNVs in 708 participants in this sample (0.8%). Taken together, the overall prevalence of rare NPD variants is 1,018 of 90,595 (1.1%) individuals, representing 1 in 89 DiscovEHR participants. This prevalence should be considered a lower-bound estimate for NPD variants, as we limited our analyses to a conservative list of 31 pathogenic recurrent CNVs and pathogenic or likely pathogenic LOF variants in 94 high-confidence NPD genes, which did not include missense variants.

Although significant, our estimated prevalence (0.34%) is lower than that recently reported by Rolland et al. (28), who found high-confidence LOF variants in ASD-associated genes in 0.61% of population-based samples. One explanation is that Rolland et al. included 185 genes, compared with the 94 NPD genes we examined, with 75 genes overlapping between the two sets. We also based our selection of variants on ACMG-AMP pathogenicity criteria, in contrast to Rolland et al., who used an annotation-based variant triage strategy to identify high-confidence variants in SPARK genes. Our conservative approach to gene curation and variant classification may have also enriched our cohort for variants that carry a higher NPD risk. Interestingly, both studies identified individuals with variants in genes strongly associated with severe NPDs (e.g., CHD8) who did not present with an expected phenotype, confirming the variable expressivity of these variants and the value of population studies for detecting milder manifestations (28, 37–39).

Penetrance estimates of NPD gene variants were similar to those of CNVs in our study but, as expected, lower than the high rates of phenotypic effects typically reported in clinically ascertained cohorts. As DiscovEHR primarily includes older adults, this finding could reflect a survivor bias, given the increased mortality due to congenital anomalies and severe medical phenotypes associated with some genetic disorders (e.g., 22q11.2 deletions) (40). In addition, DiscovEHR represents a general health care population and is more likely to include mildly affected or unaffected variant-positive individuals who would otherwise escape inclusion in a clinically ascertained cohort (41). Furthermore, our previous study of recurrent CNVs revealed that for many individuals with self-reported mental health issues and/or psychiatric care, a diagnosis of an NPD was not always recorded in their EHR, demonstrating the need for self-reported medical history collection to supplement electronic data capture (25, 42). Therefore, our penetrance estimate informs the lower bound of the true penetrance of these genetic disorders. Population-based rare variant studies, such as this one, are therefore critical in establishing minimum penetrance estimates for expected NPD phenotypes (41).

Notably, even with manual chart review, only half of the study participants with known highly penetrant variants had an NPD diagnosis documented in the EHR. This likely reflects the limitations of using categorical NPD diagnoses for capturing mild, subclinical phenotypes. The large effects of rare pathogenic variants are modulated by other genomic and environmental factors, such as polygenic variation and behavioral intervention, resulting in a wide range of clinical presentations. For example, depending on family genomic background, a particular variant may significantly impact an individual’s social interaction skills without necessarily reaching the threshold for a clinical diagnosis of ASD (37, 38, 43). Given the continuous nature of human behavioral and cognitive traits, penetrance estimates based on categorical NPD diagnoses can only provide a baseline estimate of a variant’s true impact on brain function.

We found that pathogenic variants in high-confidence NPD genes were associated with significantly increased risk for several NPD diagnoses, particularly intellectual disability. When we excluded participants with intellectual disability from the analysis, the effect sizes for each disorder were largely unattenuated, confirming an association between rare variants and psychiatric disorders among individuals without severe cognitive deficits (44). This finding adds to a growing body of evidence implicating the significant etiological role of rare genomic variants in mental health phenotypes, including schizophrenia and bipolar disorder (11, 27), separate from the comorbid presence of intellectual disability.

Conclusions

Our results demonstrate that at least 1% of individuals in a large health care system population have a rare penetrant single-gene or copy number variant conferring significant risk for NPDs. This study confirms the important contributory role of rare genomic variants in NPDs and highlights the potential for scientific advancement using genetics-based research approaches. Similar precision medicine strategies have dramatically hastened breakthroughs in other common disorders, such as cancer and cardiovascular disease, and may one day lead to the discovery of effective targeted treatments for NPDs. In the interim, clinically available diagnostic testing to identify rare pathogenic variants should be offered to symptomatic individuals with NPDs, given the significant personal utility of a genetic diagnosis and the potential to improve outcomes through anticipatory medical monitoring.

Autism and Developmental Medicine Institute, Geisinger, Lewisburg, Pa. (all authors).
Send correspondence to Dr. Martin ().

Dr. Shimelis and Dr. Oetjens share first authorship.

Preliminary work from this study was presented as a poster at the American Society of Human Genetics 2020 virtual meeting, October 27–30, 2020.

Supported by NIMH grants R01MH074090 and U01MH119705 (to Drs. Martin and Ledbetter).

Ms. Wain is an employee of GeneDx. Dr. Myers has received grant support from the AllOne Foundation. Dr. Ledbetter is an employee of Unified Patient Network and has received personal fees from Natera, MyOme, Seven Bridges, and X-Therma. Dr. Martin has received grant support from the Simons Foundation. The other authors report no financial relationships with commercial interests.

The authors thank all patient-participants engaged in the MyCode Community Health Initiative and the MyCode Research Team. They also acknowledge the members of the Geisinger-Regeneron DiscovEHR Collaboration for their critical contributions to the generation of the data used for this study, and they thank the clinical and research teams at Geisinger’s Autism and Developmental Medicine Institute. They also thank Alexander Berry, Ph.D. (Geisinger) for his critical review of the manuscript.

References

1. Zablotsky B, Black LI, Maenner MJ, et al.: Prevalence and trends of developmental disabilities among children in the United States: 2009–2017. Pediatrics 2019; 144:e20190811

2. Okoro CA, Hollis ND, Cyrus AC, et al.: Prevalence of disabilities and health care access by disability status and type among adults: United States, 2016. MMWR Morb Mortal Wkly Rep 2018; 67:882–887

3. National Institute of Mental Health: Prevalence of any mental illness 2020. https://www.nimh.nih.gov/health/statistics/mental-illness

4. Finucane BM, Ledbetter DH, Vorstman JA: Diagnostic genetic testing for neurodevelopmental psychiatric disorders: closing the gap between recommendation and clinical implementation. Curr Opin Genet Dev 2021; 68:1–8

5. Iossifov I, O’Roak BJ, Sanders SJ, et al.: The contribution of de novo coding mutations to autism spectrum disorder. Nature 2014; 515:216–221

6. Guo H, Duyzend MH, Coe BP, et al.: Genome sequencing identifies multiple deleterious variants in autism patients with more severe phenotypes. Genet Med 2019; 21:1611–1620

7. Ruzzo EK, Pérez-Cano L, Jung J-Y, et al.: Inherited and de novo genetic risk for autism impacts shared networks. Cell 2019; 178:850–866.e26

8. Nguyen HT, Bryois J, Kim A, et al.: Integrated Bayesian analysis of rare exonic variants to identify risk genes for schizophrenia and neurodevelopmental disorders. Genome Med 2017; 9:114

9. Howrigan DP, Rose SA, Samocha KE, et al.: Exome sequencing in schizophrenia-affected parent–offspring trios reveals risk conferred by protein-coding de novo mutations. Nat Neurosci 2020; 23:185–193

10. Kataoka M, Matoba N, Sawada T, et al.: Exome sequencing for bipolar disorder points to roles of de novo loss-of-function and protein-altering mutations. Mol Psychiatry 2016; 21:885–893

11. Toma C, Shaw AD, Overs BJ, et al.: De novo gene variants and familial bipolar disorder. JAMA Netw Open 2020; 3:e203382

12. Lelieveld SH, Reijnders MRF, Pfundt R, et al.: Meta-analysis of 2,104 trios provides support for 10 new genes for intellectual disability. Nat Neurosci 2016; 19:1194–1196

13. Bowling KM, Thompson ML, Amaral MD, et al.: Genomic diagnosis for children with intellectual disability and/or developmental delay. Genome Med 2017; 9:43

14. Deciphering Developmental Disorders Study: Prevalence and architecture of de novo mutations in developmental disorders. Nature 2017; 542:433–438

15. Wang W, Corominas R, Lin GN: De novo mutations from whole exome sequencing in neurodevelopmental and psychiatric disorders: from discovery to application. Front Genet 2019; 10:258

16. Zhu X, Padmanabhan R, Copeland B, et al.: A case-control collapsing analysis identifies epilepsy genes implicated in trio sequencing studies focused on de novo mutations. PLoS Genet 2017; 13:e1007104

17. Epi25 Collaborative: Ultra-rare genetic variation in the epilepsies: a whole-exome sequencing study of 17,606 individuals. Am J Hum Genet 2019; 105:267–282

18. Gonzalez-Mantilla AJ, Moreno-De-Luca A, Ledbetter DH, et al.: A cross-disorder method to identify novel candidate genes for developmental brain disorders. JAMA Psychiatry 2016; 73:275–283

19. Abrahams BS, Arking DE, Campbell DB, et al.: SFARI Gene 2.0: a community-driven knowledgebase for the autism spectrum disorders (ASDs). Mol Autism 2013; 4:36

20. Wright CF, Fitzgerald TW, Jones WD, et al.: Genetic diagnosis of developmental disorders in the DDD study: a scalable analysis of genome-wide research data. Lancet 2015; 385:1305–1314

21. Belmadani M, Jacobson M, Holmes N, et al.: VariCarta: a comprehensive database of harmonized genomic variants found in autism spectrum disorder sequencing studies. Autism Res 2019; 12:1728–1736

22. Bernier R, Golzio C, Xiong B, et al.: Disruptive CHD8 mutations define a subtype of autism early in development. Cell 2014; 158:263–276

23. Männik K, Mägi R, Macé A, et al.: Copy number variations and cognitive phenotypes in unselected populations. JAMA 2015; 313:2044–2054

24. Stefansson H, Meyer-Lindenberg A, Steinberg S, et al.: CNVs conferring risk of autism or schizophrenia affect cognition in controls. Nature 2014; 505:361–366

25. Martin CL, Wain KE, Oetjens MT, et al.: Identification of neuropsychiatric copy number variants in a health care system population. JAMA Psychiatry 2020; 77:1276–1285

26. Crawford K, Bracher-Smith M, Owen D, et al.: Medical consequences of pathogenic CNVs in adults: analysis of the UK Biobank. J Med Genet 2019; 56:131–138

27. Ganna A, Satterstrom FK, Zekavat SM, et al.: Quantifying the impact of rare and ultra-rare coding variation across the phenotypic spectrum. Am J Hum Genet 2018; 102:1204–1211

28. Rolland T, Cliquet F, Anney RJL, et al.: Sub-diagnostic effects of genetic variants associated with autism. medRxiv, April 11, 2022. https://www.medrxiv.org/content/10.1101/2021.02.12.21251621v3

29. Staples J, Maxwell EK, Gosalia N, et al.: Profiling and leveraging relatedness in a precision medicine cohort of 92,455 exomes. Am J Hum Genet 2018; 102:874–889

30. Dewey FE, Murray MF, Overton JD, et al.: Distribution and clinical impact of functional variants in 50,726 whole-exome sequences from the DiscovEHR study. Science 2016; 354:aaf6814

31. Carey DJ, Fetterolf SN, Davis FD, et al.: The Geisinger MyCode community health initiative: an electronic health record–linked biobank for precision medicine research. Genet Med 2016; 18:906–913

32. Karczewski KJ, Francioli LC, Tiao G, et al.: The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 2020; 581:434–443

33. Richards S, Aziz N, Bale S, et al.: Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med 2015; 17:405–424

34. Tayoun AA, Pesaran T, DiStefano M, et al.: Recommendations for interpreting the loss of function PVS1 ACMG/AMP variant criteria. Hum Mutat 2018; 39:1517–1524

35. Riggs ER, Andersen EF, Cherry AM, et al.: Technical standards for the interpretation and reporting of constitutional copy-number variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics (ACMG) and the Clinical Genome Resource (ClinGen). Genet Med 2020; 22:245–257

36. Bycroft C, Freeman C, Petkova D, et al.: The UK Biobank resource with deep phenotyping and genomic data. Nature 2018; 562:203–209

37. Moreno-De-Luca A, Myers SM, Challman TD, et al.: Developmental brain dysfunction: revival and expansion of old concepts based on new genetic evidence. Lancet Neurol 2013; 12:406–414

38. Finucane B, Challman TD, Martin CL, et al.: Shift happens: family background influences clinical variability in genetic neurodevelopmental disorders. Genet Med 2016; 18:302–304

39. Mitchell KJ: The genetic architecture of neurodevelopmental disorders. bioRxiv, September 19, 2014. https://www.biorxiv.org/content/10.1101/009449v1

40. Finucane B, Oetjens MT, Johns A, et al.: Medical manifestations and health care utilization among adult MyCode participants with neurodevelopmental psychiatric copy number variants. Genet Med 2022; 24:703–711

41. Wright CF, West B, Tuke M, et al.: Assessing the pathogenicity, penetrance, and expressivity of putative disease-causing variants in a population setting. Am J Hum Genet 2019; 104:275–286

42. Wain KE, Tolwinski K, Palen E, et al.: Population genomic screening for genetic etiologies of neurodevelopmental/psychiatric disorders demonstrates personal utility and positive participant responses. J Pers Med 2021; 11:365

43. Moreno-De-Luca A, Evans DW, Boomer KB, et al.: The role of parental cognitive, behavioral, and motor profiles in clinical variability in individuals with chromosome 16p11.2 deletions. JAMA Psychiatry 2015; 72:119–126

44. Singh T, Walters JTR, Johnstone M, et al.: The contribution of rare variants to risk of schizophrenia in individuals with and without intellectual disability. Nat Genet 2017; 49:1167–1173