Introduction

Intellectual disability (ID) affects approximately 1–2% of the population in Western countries and represents a significant health burden on affected individuals and families. The majority of cases with severe ID are caused by genetic mutations [1,2,3], but the diagnosis of individual cases remains challenging due to the large genotypic and phenotypic heterogeneity. Over the past years, whole exome and whole genome sequencing studies have proven to be valuable discovery tools to identify novel candidate genes in a broad spectrum of neurodevelopmental disorders [3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19]. However, finding sufficient evidence for causality after the initial candidate locus or gene discovery can prove difficult. In addition to functional analyses of mutations, increasing the number of individuals with a (de novo) mutation in the same gene and performing deep-phenotyping in the respective individuals is not only essential to establish causality, but also to gain insight into the phenotypic spectrum related to the mutated gene. This approach is called reverse phenotyping and has been shown to be a highly effective strategy [20,21,22,23]. Moreover, it allows for quantification of the phenotypic information which helps to identify novel disease genes [24]. To increase the number of individuals with a mutation in the same gene, fast and affordable sequencing of candidate genes in a large cohort is required. Recently, molecular inversion probes (MIPs) have been optimized to fulfill these criteria [25], and have been used to identify and confirm genes for their involvement in other similarly heterogeneous phenotypes, including autism spectrum disorder [25,26,27,28], epilepsy [29, 30] and oligodontia [31].

Here we applied a genotype-first approach, in which we screened a large cohort of 3,275 individuals with ID for mutations in the coding sequence of 24 previously reported candidate ID genes [5, 6]. These genes were previously denoted as candidate ID genes based on the identification of a de novo mutation in an individual with severe ID, for which mutation severity was predicted to be (likely) pathogenic [5, 6]. The goal of this study was to identify additional individuals with ID and a de novo mutation in these genes, thereby confirming their implication in disease, and to determine whether particular genotypes manifest as clinically recognizable phenotypic entities.

Materials and methods

We performed targeted re-sequencing of the coding sequence of 24 candidate ID genes (Table 1) as described previously [25, 26]. A set of 1,490 MIPs was designed, and DNA from 3,275 individuals with ID was collected through an international collaboration (Nijmegen, The Netherlands; Troina, Italy; Seattle, United States). MIP libraries of a maximum of 192 samples each were pooled and sequenced on one lane of a HiSeq2000 instrument (Illumina), resulting in an average coverage of >200-fold per individual. Sequence reads were mapped to the UCSC human reference genome assembly hg19, and variants were called as described previously [25]. Variant annotation was performed with an in-house pipeline [3], providing information on the variant effect at the cDNA and protein level, evolutionary conservation scores, and frequency information based on >15,000 in-house exomes and public databases such as ExAC and GoNL [32,33,34].

Table 1 Overview of 24 candidate ID genes with all (de novo) LoF variants in ID/DD identified by MIPs and publicly available large-scale trio-based sequencing studies

Results and discussion

First, we focused on all likely loss-of-function (LoF) variants, defined as stop-gain, frameshift and canonical splice site mutations, predicted to result in haploinsufficiency, as the majority of genes implicated in neurodevelopmental disorders (such as ID) known to date have haploinsufficiency as their underlying pathophysiological mechanism (Supplemental Table 1). We identified 40 individuals with ID who carried a LoF variant in 14 different candidate ID genes (Table 1, Supplemental Table 2). Notably, for 7 of these 14 genes, we observed more LoF variants in our cohort of individuals with ID/DD than expected when compared to population frequencies using ExAC (Table 1). Interestingly, for some genes, such as MIB1, we identified more LoF variants in our cohort than expected, even though population constraint metrics indicate that the gene is tolerant to such loss-of-function variants (pLI = 0; Table 1). In line with this observation, MIB1 LoF variants could be low-penetrance or oligogenic risk factors for nonsyndromic ID. One would hypothesize that phenotypic comparison of individuals with de novo LoF variants in these genes would, apart from the ID resulting from inclusion bias, not yield overlap. Indeed, when individuals with these LoF mutations were invited back to the clinic and a thorough clinical evaluation was performed, no consistent phenotype other than ID was observed. Moreover, the majority of the variants were inherited from a healthy parent (Supplemental Table 2). Of note, Individual 3 carried de novo LoF mutations in both MIB1 and PHIP, and the phenotype of this individual was not remarkably different from that of other PHIP mutation carriers.
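The LoF definition used above (stop-gain, frameshift and canonical splice site variants) can be illustrated with a minimal sketch. The consequence labels below follow Sequence Ontology-style terms and are an assumption; the in-house annotation pipeline may use different labels. The variant list is hypothetical toy data, not data from this study.

```python
# Assumed consequence labels (Sequence Ontology-style); the actual
# in-house pipeline may name these categories differently.
LOF_CONSEQUENCES = {
    "stop_gained",
    "frameshift_variant",
    "splice_donor_variant",
    "splice_acceptor_variant",
}


def is_likely_lof(consequence: str) -> bool:
    """Return True if the consequence falls under the LoF definition above."""
    return consequence in LOF_CONSEQUENCES


def count_lof_per_gene(variants):
    """Count likely LoF variants per gene from (gene, consequence) pairs."""
    counts = {}
    for gene, consequence in variants:
        if is_likely_lof(consequence):
            counts[gene] = counts.get(gene, 0) + 1
    return counts


# Hypothetical toy input for illustration only.
toy_variants = [
    ("PHIP", "frameshift_variant"),
    ("PHIP", "missense_variant"),
    ("MIB1", "stop_gained"),
    ("PHIP", "splice_donor_variant"),
]
lof_counts = count_lof_per_gene(toy_variants)
```

Missense variants are deliberately excluded here, in line with the focus on variants predicted to cause haploinsufficiency.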

For 19 of the 40 individuals with LoF variants in these 14 candidate ID genes, parental DNA was available to establish the inheritance pattern: 10 variants had occurred de novo and 9 were inherited from a parent (Table 1, Supplemental Table 2). We next compared the number of de novo LoF mutations identified in our cohort to the expected number based on the gene-specific de novo mutation rate, as described before [10, 35]. These analyses showed an enrichment of de novo LoF mutations in MYT1L (p = 7.59e-03), TRIO (p = 2.63e-03), WAC (p = 4.71e-03) and PHIP (p = 0.043) in our cohort of 3,275 individuals with ID/DD (Table 1). To increase power, we selected from denovo-db [36] all de novo mutations in individuals with a neurodevelopmental disorder (ID/DD; n = 5,164 individuals) that had occurred in any of the 24 candidate ID genes, excluding those identified in the initial discovery cohort [6]. This selection yielded 98 de novo mutations (Supplemental Table 3). Assessment for enrichment of de novo mutations (DNMs) in our 24 candidate ID genes identified 13 genes (54%) with enrichment for DNMs, suggesting that these genes may be involved in the etiology of ID/DD. For 10 of these 13 genes, previous detailed genotype-phenotype studies conclusively linked them to ID/DD pathology (Table 1) [28, 37,38,39,40,41,42,43,44,45,46,47,48,49,50]. Focusing on the identification of genes exerting their effects through haploinsufficiency, we next statistically combined the data obtained through MIPs with the published data (Table 1). In the total cohort of 8,439 individuals with ID/DD, this analysis showed enrichment for LoF mutations in the same four genes as in our targeted assay: MYT1L, WAC, TRIO, and PHIP (Table 1). As genotype-phenotype studies for MYT1L, TRIO, and WAC had already been performed [45, 48,49,50], including the individuals described in Table 1, we focused on PHIP.
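The comparison of observed and expected de novo LoF counts can be sketched as a one-sided Poisson test. This is a minimal illustration of one common formulation, not necessarily the exact procedure of [10, 35]: the expected count is taken as twice the cohort size times a per-chromosome, gene-specific LoF mutation rate. The rate and the observed count below are hypothetical placeholders, not values from this study.

```python
import math


def expected_de_novo(per_chromosome_rate: float, n_individuals: int) -> float:
    """Expected number of de novo events (two chromosomes per individual)."""
    return 2 * n_individuals * per_chromosome_rate


def poisson_upper_tail(observed: int, expected: float) -> float:
    """P(X >= observed) for X ~ Poisson(expected)."""
    if observed <= 0:
        return 1.0
    # Sum P(X = k) for k = 0 .. observed - 1, then take the complement.
    cdf_below = sum(
        math.exp(-expected) * expected**k / math.factorial(k)
        for k in range(observed)
    )
    return max(0.0, 1.0 - cdf_below)


# Hypothetical inputs: an assumed per-chromosome LoF rate for one gene
# and an assumed number of observed de novo LoF mutations.
assumed_lof_rate = 5e-6
cohort_size = 3275  # cohort size reported in the text
assumed_observed = 2

expected = expected_de_novo(assumed_lof_rate, cohort_size)
p_value = poisson_upper_tail(assumed_observed, expected)
```

With real gene-specific rates, the same calculation can be repeated per gene and followed by a multiple-testing correction across the 24 genes.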

Our MIP screen identified five individuals with LoF mutations in PHIP, encoding Pleckstrin Homology Domain-Interacting Protein [MIM: 612870]. We used a previously published statistical model to determine whether these 5 individuals were phenotypically more similar to each other than would be expected for any 5 random individuals within the cohort of 3,275 individuals with ID [24]. Based on 26 commonly assessed phenotypic features, we found that the phenotypic similarity between the 5 individuals with a LoF mutation in PHIP was much greater than expected by chance (p = 6.25e-05, permutation test, Supplemental Fig. 1). In addition, we screened all private missense variants, yielding one additional de novo mutation. Next, we reached out to colleagues and data-sharing resources to identify additional individuals with de novo mutations in PHIP. These efforts collectively identified 23 individuals with DD/ID and a pathogenic PHIP mutation (Fig. 1, Supplemental Table 4). For fourteen mutations, de novo origin could be confirmed by testing parental DNA samples. Interestingly, we identified 2 sib-pairs with PHIP mutations. For one of the sib-pairs (Individuals 18 and 19), we could prove inheritance of the mutation from their father, who himself also experienced learning difficulties. For the other sib-pair (Individuals 6 and 7), the mutation could not be detected in DNA from the mother and paternal DNA was unavailable for testing, leaving both paternal inheritance and germline mosaicism as possible explanations. Two individuals did not have pathogenic PHIP point mutations but gross chromosomal abnormalities, either disrupting the coding sequence of PHIP through translocation (Individual 13) or deleting PHIP entirely (Individual 14; Supplemental Fig. 2); both abnormalities had occurred de novo.
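The logic of such a permutation test can be sketched as follows. This is a simplified stand-in, not the published model of [24]: phenotypic similarity is approximated here by the mean pairwise Jaccard index over sets of binary features, which is an assumption made for illustration. The cohort is synthetic toy data, and the function names are hypothetical.

```python
import random
from itertools import combinations


def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard index of two feature sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_pairwise_similarity(group) -> float:
    """Mean Jaccard index over all pairs of individuals in the group."""
    pairs = list(combinations(group, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def permutation_p_value(cohort, carrier_idx, n_perm=2000, seed=0) -> float:
    """One-sided p-value: are carriers more similar than random groups?"""
    rng = random.Random(seed)
    observed = mean_pairwise_similarity([cohort[i] for i in carrier_idx])
    k = len(carrier_idx)
    hits = 0
    for _ in range(n_perm):
        # Draw k individuals at random from the whole cohort.
        if mean_pairwise_similarity(rng.sample(cohort, k)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Synthetic toy cohort: 200 background individuals with 5 random features
# out of 26, plus 5 "carriers" sharing 4 features each.
data_rng = random.Random(1)
background = [frozenset(data_rng.sample(range(26), 5)) for _ in range(200)]
carriers = [frozenset({0, 1, 2, 3, 4 + i}) for i in range(5)]
toy_cohort = background + carriers
carrier_indices = list(range(200, 205))

p_perm = permutation_p_value(toy_cohort, carrier_indices)
```

The smallest p-value this estimator can return is 1/(n_perm + 1), so the number of permutations bounds the reportable significance.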

Fig. 1

Photographs of individuals with a mutation in PHIP, overview of the PHIP gene with the 2 isoforms and location of mutations. a Photographs of 14 individuals with a mutation in PHIP of whom Individuals 18 and 19 were sisters. Shared facial features include a high forehead, full eyebrows/synophrys, characteristic upturned nose with thick alae nasi, a long philtrum and thin lips. Photographs are shown with informed consent of the individual or his/her parents. b Schematic representation of the PHIP gene (NM_017934.6) and the two protein isoforms PHIP/DCAF14 and NDRP. The point mutations are scattered throughout the coding sequence. The structural variants identified in Individuals 13 and 14 are depicted in detail in Supplemental Fig. 2. IRS, insulin receptor substrate domain; PH, pleckstrin homology domain

Detailed clinical information on the 23 individuals with a mutation in, or deletion of, PHIP is provided in Supplemental Table 4 and the Supplemental Note (Case reports). For all photographs shown (Fig. 1, Supplemental Fig. 3), informed consent was obtained. An overview of the main clinical features is shown in Table 2.

Table 2 Summary of clinical features of individuals with a mutation in PHIP

Most individuals had mild to severe ID, whereas the individuals who did not have ID had speech problems, global developmental delay in early childhood, or learning difficulties. Behavioral problems were observed in 18 individuals (78%) and consisted of ADHD, features of autism, problems with impulse control, aggressive behavior, mood disorder, and anxiety. In addition to DD/ID, the most striking shared feature in these individuals was overweight, which was observed in 17 individuals (74%), of whom 9 were obese.

Shared dysmorphic facial features comprised a high forehead (67%), full eyebrows/synophrys (59%), a characteristic upturned nose with thick alae nasi (15 individuals; 68%), a long philtrum (45%) and thin lips (36%) (Fig. 1). Fourteen individuals had large ears (64%) and 11 individuals had thick helices and earlobes (50%) (Supplemental Fig. 3). Tapering fingers (76%) and bilateral clinodactyly of the fifth finger (64%) were common, and some individuals had bilateral cutaneous syndactyly of the second and third toes (30%) (Supplemental Fig. 3). Other shared features included urogenital problems (22%), vision problems (65%) and hypo/hyperpigmentation (35%).

In addition to our first report in 2012 [6], mutations in PHIP have recently been reported in two individuals from an ASD cohort [28] and in two individuals with ID, behavioral problems, overweight and characteristic facial features [51]. These reports confirm our hypothesis that mutations in PHIP lead to an ID-overweight syndrome with behavioral problems.

PHIP is translated into two different protein isoforms: the longer isoform, Pleckstrin Homology Domain-Interacting Protein/DDB1- and CUL4-associated factor 14 (PHIP/DCAF14), and the shorter isoform, Neuronal Differentiation-Related Protein (NDRP), which lacks the C-terminal part of the longer isoform (Fig. 1). The full-length PHIP/DCAF14 protein contains (i) eight N-terminal WD40 repeats, forming an eight-bladed beta-propeller for protein-protein interactions, (ii) an insulin receptor substrate-1 (IRS-1) domain, (iii) two bromodomains, for protein-protein interaction and complex assembly involved in transcriptional activation, and (iv) a nuclear localization signal for import into the nucleus. In addition to these well-defined domains, it also contains a region that is involved in pleckstrin homology (PH) domain binding (Fig. 1).

Ndrp is expressed in neural tissues of mouse embryos, suggesting an important role in normal neurodevelopment [52]. Depending on the developmental stage, it is localized to either the nucleus or the cytosol [52]. However, there is a paucity of studies on this particular isoform.

The presence of two different PHIP isoforms also invites speculation about possible genotype-phenotype correlations. Insight into this matter is provided by Individual 13, who carried a translocation disrupting PHIP. Quantitative PCR on blood-derived RNA from this individual was performed with two different primer sets, one targeting the exon-exon boundary of exons 10 and 11 (set 1) and one targeting the boundary of exons 35 and 36 (set 2). Set 1 showed normal expression levels when compared to unaffected controls, whereas set 2 showed reduced expression consistent with haploinsufficiency (Supplemental Fig. 2). This combination can only be explained by normal (and stable) expression of the shorter NDRP isoform together with abrogated expression of the full-length PHIP/DCAF14. Given that Individual 13 is phenotypically indistinguishable from the individuals with disruptive mutations in the more C- and N-terminal parts of the protein, it seems unlikely that NDRP is responsible for the phenotype observed; rather, disruption of the full-length PHIP/DCAF14 appears to drive the observed phenotypic spectrum.

PHIP/DCAF14 has been shown to play an important role in several processes linked to neurodevelopment. First, PHIP/DCAF14 is one of the multiple substrate receptors of the proteolytic CUL4-DDB1 complex, a cullin-RING finger ligase complex that binds target proteins for ubiquitination, which results in degradation of the protein [53]. Mutations in the X-linked CUL4B component of the complex lead to an ID syndrome in males, with macrocephaly, epilepsy and cerebral malformations [54,55,56]. Second, PHIP/DCAF14 is considered a regulator of the insulin and insulin-like growth factor signaling pathways via specific binding of the PH domain of IRS-1 [57, 58]. Insulin receptor (IR) signaling plays an important role in the brain, since IRs are found in synapses and dendrites of neurons and insulin stimulates the formation of dendritic spines and synapses via the PI3K/Akt/mTOR and Rac1 pathways [10, 59, 60]. Although the name and function might suggest a role in the development of diabetes, only 1 individual in our cohort (Individual 22) had type 2 diabetes, which could also be caused by her obesity.

Third, PHIP/DCAF14 belongs to bromodomain family III, which includes EP300 (E1A-binding protein p300 [MIM: 602700]), CREBBP (CREB-binding protein [MIM: 600140]) and BRWD3 (Bromodomain- and WD repeat-containing protein 3 [MIM: 300553]) [61]. Mutations in EP300 and CREBBP are known causes of Rubinstein-Taybi syndrome [OMIM #180849, #613684], whereas BRWD3 mutations result in an X-linked form of ID (MRX93) [62]. Although individuals with a mutation in PHIP are facially distinct from individuals with Rubinstein-Taybi syndrome, their facial features do resemble those of individuals with a mutation in BRWD3, such as the high forehead, synophrys and large ears. Notably, a substantial fraction of the individuals with mutations in EP300, CREBBP, or BRWD3 are overweight [62,63,64]. PHIP itself seems to have a positive effect on cell growth [65]. Mice deficient in Phip show postnatal growth retardation that is similar to that of IGF-1 null mice [66]. Although individuals with ID have a higher risk of being overweight in general [67], it is well known that some specific ID syndromes are associated with overweight/obesity [68], and based on our data, it appears that mutations in PHIP can be added to the list of ID-overweight syndromes.

In summary, we present a large genotype-first, targeted re-sequencing effort, using MIPs for 24 candidate ID genes in 3,275 individuals with ID, aimed at the identification of additional mutations in these individuals. This screen, in combination with database resources and reverse phenotyping, verified the true nature of 11 candidate ID genes, and identified an ID-overweight syndrome caused by PHIP haploinsufficiency. The latter provides novel insights into the genetic basis and physiology not only of intellectual disability but also of overweight.

Web resources

ExAC Browser, http://exac.broadinstitute.org/

GeneMatcher, https://genematcher.org/

GoNL, http://www.nlgenome.nl/

OMIM, http://www.omim.org/