Introduction

Cohesin is a multisubunit protein complex consisting of four core proteins: structural maintenance of chromosome 1 (SMC1), structural maintenance of chromosome 3 (SMC3), RAD21 cohesin complex component (RAD21), and stromal antigen (STAG)1. One of the cohesin subunits STAG1, STAG2, or STAG3 attaches directly to a tripartite ring (comprising SMC1, SMC3, and RAD21) to entrap chromatids1. Other interacting proteins, such as the cohesin loader NIPBL, also regulate the biological functions of cohesin1.

Cohesin is involved in a range of important functions, including sister chromatid cohesion, DNA repair, transcriptional regulation, and genome architecture1,2. Hence, germline pathogenic variants of genes encoding cohesin subunits and their interacting proteins, such as NIPBL, SMC1A, SMC3, and RAD21, are known to cause developmental disorders referred to as cohesinopathies3, which are characterized by intellectual disability (ID), growth retardation, and limb abnormalities4.

Recently, STAG2 was added to the list of genes mutated in cohesinopathies5,6. As STAG2 is essential for DNA replication fork progression, STAG2 defects may disrupt the interaction between the cohesin ring and the replication machinery, resulting in replication fork stalling and collapse, as previously described7. To date, 16 pathogenic variants of STAG2 have been reported, including seven nonsense, four missense, one splicing, and four frameshift variants5,8,9,10,11,12,13. Notably, seven male patients in three families harbored missense variants. In one family, five affected males showed ID and congenital abnormalities11, and two other sporadic males were reported to have dysmorphic features, short stature, hypotonia, developmental delay (DD), and ID9,10. Female patients had truncating and missense variants5,8,10,12,13. Here, we describe the genetic and clinical features of two female cases with de novo nonsense variants of STAG2.

Case 1 was the second conceptus of healthy, nonconsanguineous Japanese parents (a 35-year-old mother and a 37-year-old father). At 15 gestational weeks, holoprosencephaly, cleft palate, cleft lip, blepharophimosis, nasal bone absence, and hypoplastic left heart were noted by ultrasonography. The fetal karyotype determined by amniocentesis at 18 gestational weeks was normal (46,XX). The pregnancy was terminated at 21 gestational weeks because of multiple fetal abnormalities.

Case 2 was a 7-year-old girl who was born as the second child to healthy nonconsanguineous parents. She was born uneventfully at full term. Her birth weight was 2734 g (–1.3 SD). A cleft palate was noted at birth and surgically repaired at 1 year. She presented with mild dysmorphic features, including a long philtrum. At 8 months, she developed afebrile convulsions for which carbamazepine was effective. Anticonvulsants were discontinued at 4 years with no subsequent attacks. She acquired independent gait at 2 years and spoke only a few words at 7 years. Brain magnetic resonance imaging at 7 years revealed white matter hypoplasia. She currently has mild DD, ID, sensorineural hearing loss, and amblyopia with no neurologic abnormalities. She attends a school for hearing-impaired children.

This study was approved by the institutional review board of Yokohama City University School of Medicine. Whole-exome sequencing (WES) was performed in the two cases (Cases 1 and 2) and their parents. After informed consent was obtained, blood leukocytes were collected from the patient (Case 2) and the parents (Cases 1 and 2), and umbilical cord was collected for Case 1. Exome data acquisition, processing, variant calling, annotation, and filtering were performed as previously described14. Possible pathogenic variants were evaluated based on mutational type (nonsense, missense, frameshift, or splice site) and on scores from SIFT (http://sift.jcvi.org/), Polyphen-2 (http://genetics.bwh.harvard.edu/pph2/), Mutation Taster (http://MutationTaster.org/), and CADD (https://cadd.gs.washington.edu/). Possible pathogenic variants were validated by Sanger sequencing. Parentage was confirmed using 12 microsatellite markers with Gene Mapper software v4.1.1 (Life Technologies Inc., Carlsbad, CA).

Total RNA was extracted from lymphoblastoid cell lines (LCLs) with the RNeasy Plus Mini Kit (Qiagen, Germany) and reverse-transcribed to cDNA with the Super Script First Strand Synthesis System (Takara, Japan). The cDNA was used as the template for RT-PCR, and the PCR amplicons were subjected to Sanger sequencing.

Copy number variations (CNVs) were examined in the WES data using two algorithms: the eXome Hidden Markov Model15 and a program based on relative depth-of-coverage ratios developed by Nord et al.16.
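The idea behind a depth-of-coverage ratio can be illustrated with a toy calculation. The following is only a minimal sketch of the general principle, not the program of Nord et al. or the eXome Hidden Markov Model: the function names, the log2 thresholds (-0.6 and 0.4), and the depth values are all hypothetical, and a real analysis would need many more targets and reference samples for the normalization to be stable.

```python
import math
from statistics import median


def normalize(depths):
    """Scale per-target read depths so they sum to 1 within a sample."""
    total = sum(depths)
    if total <= 0:
        raise ValueError("total depth must be positive")
    return [d / total for d in depths]


def log2_depth_ratios(sample_depths, reference_depths, pseudo=1e-9):
    """log2 of the sample's normalized depth over the reference median, per target."""
    sample = normalize(sample_depths)
    refs = [normalize(r) for r in reference_depths]
    ratios = []
    for i, value in enumerate(sample):
        ref_median = median(r[i] for r in refs)
        ratios.append(math.log2((value + pseudo) / (ref_median + pseudo)))
    return ratios


def call_targets(ratios, loss=-0.6, gain=0.4):
    """Label each target using hypothetical log2-ratio thresholds."""
    calls = []
    for r in ratios:
        if r <= loss:
            calls.append("loss")
        elif r >= gain:
            calls.append("gain")
        else:
            calls.append("neutral")
    return calls


# Toy data: four targets, the second covered at half depth in the sample.
reference = [[100, 100, 100, 100]] * 3
sample = [100, 50, 100, 100]
calls = call_targets(log2_depth_ratios(sample, reference))
print(calls)
```

With so few targets, the single low-coverage target also shifts the normalized values of the others, which is one reason real tools rely on large target sets and more careful normalization.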

X chromosome inactivation was determined using the human androgen receptor gene. X-inactivation ratios (expressed arbitrarily as the ratio of the smaller allele to the larger allele) were calculated twice and judged according to published criteria: <20:80 (random), >20:80 (skewed), and >10:90 (highly skewed)17.
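These criteria amount to a classification on the share of the less represented allele. The sketch below is one possible reading, offered as an assumption rather than the authors' procedure: a minor-allele share above 20% is treated as random, 20% or below as skewed, and 10% or below as highly skewed. The function name and the handling of the exact boundary values are hypothetical.

```python
def classify_x_inactivation(allele_a: float, allele_b: float) -> str:
    """Classify an X-inactivation ratio from the two allele signals.

    Assumed reading of the criteria: the share of the minor allele decides
    the class (boundary values are assigned to the more skewed class).
    """
    total = allele_a + allele_b
    if total <= 0:
        raise ValueError("allele signals must sum to a positive value")
    minor_share = 100.0 * min(allele_a, allele_b) / total
    if minor_share <= 10:
        return "highly skewed"
    if minor_share <= 20:
        return "skewed"
    return "random"


# Ratios reported for the two cases in this study.
print(classify_x_inactivation(93, 7))
print(classify_x_inactivation(96, 4))
```

Under this reading, both ratios reported in this study (93:7 and 96:4) fall in the highly skewed class.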

Ten micrograms of sheared DNA was subjected to library preparation using a SMRTbell Express Template Prep Kit 2.0 (Pacific Biosciences, 100-938-900) and a SMRTbell Enzyme Cleanup Kit (Pacific Biosciences, 101-746-400) in accordance with the manufacturer’s instructions (Procedure & Checklist - Preparing HiFi SMRTbell® Libraries using SMRTbell Express Template Prep Kit 2.0, Pacific Biosciences). One single-molecule real-time (SMRT) cell was used for the patient (Case 1). Secondary analysis of the base-called data was performed using SMRT analysis v8.0 (Pacific Biosciences). Circular consensus sequencing (CCS) reads were generated from single molecules and mapped to the hg19 human reference genome using the CCS with Mapping application provided by SMRT analysis, with the default settings. The aligned CCS BAM data from this application were used as input to DeepVariant 0.9.0 (https://github.com/google/deepvariant) to detect SNVs and indels. DeepVariant was run with the model trained for PacBio CCS (--model_type=PACBIO) using the prebuilt Docker image from the DeepVariant public repository. Small variant calls from DeepVariant were phased into haplotypes using WhatsHap 0.18 (https://whatshap.readthedocs.io/en/latest/).
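For orientation, the variant calling and phasing steps could be assembled as command lines along the following lines. This is a hedged sketch only: apart from --model_type=PACBIO and the tool versions, nothing here is taken from the study. The Docker image tag, the run_deepvariant path, the remaining flags, the shard count, and all file paths are assumptions based on the tools' public documentation, and the commands are only built and printed, not executed.

```python
import shlex


def deepvariant_command(ref, bam, out_vcf, work_dir, shards=8, version="0.9.0"):
    """Build a DeepVariant call for PacBio CCS reads (paths are hypothetical)."""
    return [
        "docker", "run",
        "-v", f"{work_dir}:{work_dir}",          # mount the working directory
        f"google/deepvariant:{version}",          # assumed image name and tag
        "/opt/deepvariant/bin/run_deepvariant",
        "--model_type=PACBIO",                    # model named in the text
        f"--ref={ref}",
        f"--reads={bam}",
        f"--output_vcf={out_vcf}",
        f"--num_shards={shards}",
    ]


def whatshap_command(ref, vcf, bam, phased_vcf):
    """Build a WhatsHap read-based phasing call (paths are hypothetical)."""
    return ["whatshap", "phase", "--reference", ref, "-o", phased_vcf, vcf, bam]


# Hypothetical file locations used only for illustration.
work = "/data/case1"
dv = deepvariant_command(f"{work}/hg19.fa", f"{work}/ccs.aligned.bam",
                         f"{work}/calls.vcf.gz", work)
wh = whatshap_command(f"{work}/hg19.fa", f"{work}/calls.vcf.gz",
                      f"{work}/ccs.aligned.bam", f"{work}/phased.vcf")
print(shlex.join(dv))
print(shlex.join(wh))
```

Building the argument lists separately from running them makes it easy to log the exact commands alongside the results.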

We first performed WES in Case 1. Case 1 had no pathogenic variants in 14 genes known to be mutated in holoprosencephaly, namely, SHH, ZIC2, SIX3, TGIF1, GLI2, PTCH1, DISP1, FGF8, FOXH1, NODAL, TDGF1, GAS1, DLL1, and CDON. Moreover, no pathogenic CNVs were identified by exome-based CNV analysis. Trio-based WES analysis identified three de novo variants (Supplementary Table S1), two of which were missense variants that were likely benign based on computational predictions. The remaining de novo nonsense variant [c.3097 C>T, p.(Arg1033*)] of STAG2 was confirmed by Sanger sequencing (Fig. 1a) and was likely causative. X inactivation was highly skewed (93:7), and the paternal X chromosome was inactivated (Supplementary Fig. S1). Unfortunately, living cells from Case 1 could not be obtained for further mRNA analysis. Using HiFi long-read genome sequencing and haplotype phasing with informative variants, we constructed haplotypes in the vicinity of STAG2 and confirmed that the STAG2 variant occurred de novo on the paternal chromosome in Case 1 (Fig. 2). As the paternal X chromosome is mostly inactivated in blood leukocytes, the X inactivation pattern should be favorable in Case 1.

We also identified another STAG2 nonsense variant [c.2229 G>A, p.(Trp743*)] occurring de novo in Case 2 (Fig. 1a, Table 1). X inactivation was highly skewed (96:4), and the maternal X chromosome was inactivated (Supplementary Fig. S1). RT-PCR indicated that only the wild-type allele was expressed in LCLs of Case 2 (Supplementary Fig. S2). Even after cycloheximide treatment, the mutant allele was completely undetectable, suggesting that it was transcriptionally repressed (through favorably skewed X inactivation) rather than posttranscriptionally diminished (through nonsense-mediated mRNA decay) in cultured LCLs.

Although the X inactivation pattern was favorable in both cases, Case 1 was clinically much more severe than Case 2. Therefore, it is difficult to relate phenotype severity to the X inactivation pattern.

Fig. 1: Summary of pathogenic variants of STAG2.

a Familial pedigrees and electropherograms of STAG2 variants [Case 1: c.3097 C>T, p.(Arg1033*); Case 2: c.2229 G>A, p.(Trp743*)]. The arrow indicates a heterozygous variant. wt, wild-type; mut, mutation. b Functional domain of the STAG2 protein and pathogenic variants. Truncating and missense variants are shown above and below the protein, respectively. Our cases are shown in bold. The STAG domain predicted by Pfam is shown (http://pfam.xfam.org).

Fig. 2: Confirmation of the pathogenic STAG2 variant in the paternal chromosome in Case 1.

a The c.3097 C>T variant was successfully mapped within the 450-kb phased haplotype block in Case 1 using HiFi sequencing and haplotype phasing. b SNP typing confirmed that the mutation occurred on the paternal chromosome (Allele 2, the brown haplotype block in Fig. 2a). POS: position of sequence, Pt: patient, Fa: father, Mo: mother.
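The logic of assigning a parent of origin from a phased block can be sketched as follows. This is an illustrative simplification and not the analysis used for Fig. 2: it assumes an X-linked locus in a female patient, so the father contributes a single allele per site, and it uses only sites where the patient's two haplotypes differ. The function names, positions, and alleles are hypothetical toy values.

```python
from typing import Dict, Optional

Haplotype = Dict[int, str]  # genomic position -> allele


def paternal_haplotype_index(hap1: Haplotype, hap2: Haplotype,
                             father_x: Haplotype) -> Optional[int]:
    """Return 1 or 2 for the haplotype matching the father's X, else None."""
    informative = [pos for pos in hap1
                   if pos in hap2 and pos in father_x and hap1[pos] != hap2[pos]]
    if not informative:
        return None  # no site can distinguish the two haplotypes
    hap1_matches = all(hap1[p] == father_x[p] for p in informative)
    hap2_matches = all(hap2[p] == father_x[p] for p in informative)
    if hap1_matches == hap2_matches:
        return None  # inconsistent data, e.g. genotyping or phasing errors
    return 1 if hap1_matches else 2


def variant_origin(variant_hap: int, hap1: Haplotype, hap2: Haplotype,
                   father_x: Haplotype) -> str:
    """Report the parental origin of the haplotype carrying the variant."""
    paternal = paternal_haplotype_index(hap1, hap2, father_x)
    if paternal is None:
        return "undetermined"
    return "paternal" if paternal == variant_hap else "maternal"


# Toy data: the variant sits on haplotype 2 of the phased block.
hap1 = {100: "A", 200: "C", 300: "G"}
hap2 = {100: "G", 200: "C", 300: "T"}
father = {100: "G", 200: "C", 300: "T"}
origin = variant_origin(2, hap1, hap2, father)
print(origin)
```

Requiring agreement at every informative site is a deliberately strict choice; a tolerant version would count matches and allow for occasional genotyping errors.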

Table 1 Clinical features of patients with STAG2 variants.

In addition to the two variants in our cases, a total of 16 pathogenic variants of STAG2 have been reported in unrelated families (Table 1), including 12 truncating variants [p.(Arg69*), p.(Gln140*), p.(Arg146*), p.(Arg259*), p.(Cys535*), p.(Val547Cysfs*29), p.(Lys553Ilefs*6), p.(Arg614*), p.(Ala638Valfs*10), p.(Glu968Serfs*), p.(Arg1012*), and c.2533+1 G] and four missense variants [p.(Tyr159Cys), p.(Ser327Asn), p.(Arg604Gln), and p.(Lys1009Asn)]5,8,9,10,11,12,13.
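The breakdown of these 16 variants by class can be checked with a short tally. The sketch below classifies the variant descriptions exactly as they are listed above, using simple pattern rules that are an assumption made for illustration rather than a full HGVS parser; the function name is hypothetical.

```python
import re
from collections import Counter

REPORTED = [
    "p.(Arg69*)", "p.(Gln140*)", "p.(Arg146*)", "p.(Arg259*)", "p.(Cys535*)",
    "p.(Val547Cysfs*29)", "p.(Lys553Ilefs*6)", "p.(Arg614*)",
    "p.(Ala638Valfs*10)", "p.(Glu968Serfs*)", "p.(Arg1012*)", "c.2533+1 G",
    "p.(Tyr159Cys)", "p.(Ser327Asn)", "p.(Arg604Gln)", "p.(Lys1009Asn)",
]


def classify_variant(description: str) -> str:
    """Assign a coarse variant class from the written description."""
    if "fs" in description:
        return "frameshift"
    if re.fullmatch(r"p\.\([A-Z][a-z]{2}\d+\*\)", description):
        return "nonsense"
    if re.fullmatch(r"p\.\([A-Z][a-z]{2}\d+[A-Z][a-z]{2}\)", description):
        return "missense"
    if re.match(r"c\.\d+[+-]\s*\d+", description):
        return "splicing"  # intronic offset next to an exon boundary
    return "other"


counts = Counter(classify_variant(v) for v in REPORTED)
truncating = counts["nonsense"] + counts["frameshift"] + counts["splicing"]
print(dict(counts), "truncating:", truncating)
```

The tally gives seven nonsense, four frameshift, one splicing, and four missense variants, that is, 12 truncating variants out of 16.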

For a female patient with p.(Ala638Valfs*10), no detailed phenotype was provided in the DECIPHER database; therefore, this patient was omitted from further comparison of clinical features8. The twelve cases with STAG2 truncating variants reported in the literature were all females, and one missense variant was reported in a female patient. The 12 female cases with STAG2 truncation shared microcephaly (10/12), abnormal brain MRI findings (10/12, including holoprosencephaly in 7/12), thoracic vertebral anomalies (6/12), DD (9/12), and ID (4/12). Case 1 showed severe clinical features, such as holoprosencephaly and hypoplastic left heart, similar to those in previous reports, while Case 2’s clinical features were relatively mild. p.(Arg69*) was recurrent in two unrelated patients, both showing middle brain anomalies8,12.

Our two cases showed highly skewed X inactivation (93:7 in Case 1 and 96:4 in Case 2) (Supplementary Fig. S1). To date, X inactivation analysis has been reported in only two cases, one with skewed X inactivation and another with random X inactivation, but the exact ratios were not reported10,12. Interestingly, one splicing variant (c.2533+1 G) was inherited from the mother, but unfortunately, an X inactivation study was not conducted12.

In contrast, null STAG2 variants in males have never been reported. We speculate that hemizygous truncating STAG2 variants in males are lethal or lead to severe fetal outcomes. Interestingly, one missense variant [p.(Ser327Asn)] was transmitted in an X-linked recessive manner in a family with five affected males and two healthy carrier females11. These five males showed ID (5/5), several facial dysmorphisms [large nose (5/5), prominent ears (5/5), frontal baldness (4/5)], hearing loss (3/5), short stature (5/5), and cleft palate (1/5). An additional two hemizygous missense variants [p.(Tyr159Cys) and p.(Lys1009Asn)] were recently reported in two unrelated males9,10. They showed facial dysmorphisms (2/2), cleft lip and palate (1/2), pituitary gland abnormality (1/2), patent foramen ovale (1/2), hypotonia (2/2), DD (2/2), and ID (2/2), as seen in the above family. As expected, missense variants in male patients had milder effects than truncating variants.

In conclusion, of the two female patients with STAG2 variants, one showed a severe prenatal phenotype, while the other showed a mild pediatric phenotype. X inactivation was highly skewed in both cases. This phenotypic difference might depend on another factor, such as a modifier, that is yet to be found.