Combining patient data from five institutions, publicly available databases, and published literature, we established a dataset of 2,288 patients with SF3B1-mutant MDS or AML in which exons 13 through 16 had been sequenced. We first determined how SF3B1 mutations partitioned into WHO 2016 classifications, as these data were available for virtually all patients. This distribution showed several asymmetries (Fig. 1 and Supplementary Fig. 1). Consistent with our previous report [14], K666N was enriched in higher-risk disease types: only 2.1% (28/1327) in MDS-RS vs 8.7% (21/242) in MDS-SLD/MLD, 17.3% (46/266) in MDS-EB, and 25.8% (103/399) in AML (p < 0.0001 for all). With the inclusion of cases in which exon 13 had been sequenced, the distribution also revealed enrichment of a variant in this exon, E592K, in higher-risk disease: only 0.08% (1/1327) in MDS-RS vs 3.7% (9/242) in MDS-SLD/MLD, 3.4% (9/266) in MDS-EB, and 2.5% (10/399) in AML (p < 0.0001 for all). Conversely, E622D was decreased in higher-risk disease: 6.9% (92/1327) in MDS-RS vs 2.9% (7/242) in MDS-SLD/MLD (p < 0.05), 0.4% (1/266) in MDS-EB (p < 0.0001), and 0.5% (2/399) in AML (p < 0.0001). K666R was also decreased: 5.4% (72/1327) in MDS-RS vs 1.2% (3/242) in MDS-SLD/MLD, 1.1% (3/266) in MDS-EB, and 1.8% (7/399) in AML (p < 0.01 for all). These data confirm and extend the scope of asymmetrical partitioning of distinct SF3B1 hotspots in low- and high-risk disease.
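Each enrichment comparison above reduces to a 2x2 table of counts (patients with a given variant vs patients with other SF3B1 mutations, in MDS-RS vs another WHO category). The text does not state which test produced the reported p-values, so the following is only a minimal sketch that assumes a two-sided Fisher's exact test. The function name is hypothetical; the counts are the K666N counts quoted in the text for MDS-RS and AML.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))


# K666N counts from the text: 28 of 1327 in MDS-RS, 103 of 399 in AML.
# Assumption: the comparison group is "all other SF3B1 mutations" in each category.
p_k666n = fisher_exact_two_sided(28, 1327 - 28, 103, 399 - 103)
print(f"K666N, MDS-RS vs AML: p = {p_k666n:.3g}")
```

The same function can be applied to the other variant and category pairs by substituting the corresponding counts. It does not adjust for multiple comparisons.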
We next examined RNA splicing events produced by those SF3B1 mutations with the most significant asymmetric partitioning. We expressed FLAG-tagged codon-optimized constructs with the E592K, E622D, K666N, and K666R mutations in HEK293 cells, along with wild-type SF3B1 and the dominant K700E mutation. For additional comparison, we expressed variants from solid tumors that are rare (R625H, K741N) or absent (E902K) in myeloid malignancies [22]. These transfections produced a comparable level of endogenous and exogenous SF3B1 protein (Fig. 2). Upon expression of most hotspots, we observed missplicing in junctions known to be affected by SF3B1 mutations, including SLTM, ZDHHC16 and MAP3K7 (Fig. 2) [2,6]. This included MDS mutations K700E, E622D, K666R, and K666N but also solid tumor hotspots R625H and K741N, the last of which produced lower-magnitude missplicing, as recently reported in the context of uveal melanoma [23]. In contrast, these junctions were not misspliced by E902K, which showed its own unique missplicing pattern, as was noted in TCGA E902K bladder cancer samples (Supplementary Fig. 2) [22]. We also found that two other mutant SF3B1-associated junctions, in DLST and UQCC1, were misspliced to high magnitude by most mutations but were unaffected by K666N, confirming that the pattern of missplicing by K666N is different from that of other MDS/AML-associated hotspots [14,24]. Notably, this pattern occurred with K666N but not K666R, demonstrating that these variants are not functionally equivalent even though they affect the same starting amino acid, a phenomenon also observed with variants in the yeast homologue of SF3B1 [25]. Finally, conspicuously absent was any missplicing of these events by E592K. Coupled with its enrichment in high-risk disease, this distinct missplicing pattern motivated us to investigate E592K further.
We characterized the clinical features of E592K patients in more detail. The hotspot-agnostic collection of all patients with exon 13–16 sequencing had produced 29 cases of E592K. By specifically seeking out additional cases from multiple institutions, we gathered a total of 39 patients with E592K-mutated myeloid neoplasms, 35 of whom had MDS or AML (Supplementary Table 2). This expanded E592K cohort also showed enrichment in higher-risk 2016 WHO classifications, and these patients had higher IPSS-R scores and lower platelets than cases with exon 14–16 mutations (Fig. 3A-C). Hemoglobin and WBC were not significantly different (Fig. 3D-E). E592K cases also had a notable lack of RS (Fig. 3F), with only one instance of low-blast E592K MDS reporting any RS, at 8%, making this the only E592K patient to meet WHO 2016 criteria for MDS-RS (> 5% RS if SF3B1 mutation present). By contrast, for low-blast MDS with exon 14–16 mutations in which exact RS percentages were available, the average was 37%, consistent with the known pathological and mechanistic links between mutant SF3B1 and RS in MDS (Fig. 3G) [5,26]. E592K MDS/AML also had a different co-mutation profile (Fig. 4 and Supplementary Fig. 3–5). A striking distinction was the nearly ubiquitous co-mutation of ASXL1 (83%) in E592K cases, compared to only 9% in exon 14–16 disease (p < 0.0001). High ASXL1 co-mutation characterized both low- and high-blast E592K cases, suggesting this relationship occurs early in disease evolution (Supplementary Fig. 3–5). Also notable was the near absence of DNMT3A co-mutations (2.9%) with E592K, despite DNMT3A being the second most commonly co-mutated gene (21%) in exon 14–16 disease (p < 0.01). In fact, the one DNMT3A mutation in E592K MDS had a VAF of 2.3% in a sample in which the VAFs for the E592K and ASXL1 mutations were 20%, raising the possibility that the DNMT3A mutation was in a separate clone with wild-type SF3B1. Furthermore, E592K patients had increased co-mutations of RUNX1 (51% vs 11%) and STAG2 (29% vs 3%) (p < 0.0001 for both). Finally, consistent with these high-risk MDS clinical features, both overall survival and leukemia-free survival were markedly shorter in E592K patients (Fig. 5A-B), including two patients (#9 and #1) who progressed from MDS to AML at 6 and 9 months, respectively, despite lower-risk prognostication scores from both the IPSS-M and SEX-GSS (Fig. 5C-D) [10,11].
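The clonal inference drawn from the DNMT3A VAF can be made explicit with a short calculation. This is a minimal sketch under assumptions that the text does not state: each mutation is heterozygous at a copy-neutral locus and sample purity is ignored, so the fraction of cells carrying a mutation is roughly twice its VAF. The function names are hypothetical.

```python
def cell_fraction(vaf):
    """Approximate fraction of cells carrying a heterozygous, copy-neutral mutation."""
    return min(1.0, 2.0 * vaf)


def must_co_occur(vaf_a, vaf_b):
    """Pigeonhole check: True when two mutations must share at least some cells.

    If the two cell fractions sum to more than 100%, the mutations cannot
    lie entirely in separate clones. If not, VAFs alone cannot decide.
    """
    return cell_fraction(vaf_a) + cell_fraction(vaf_b) > 1.0


# VAFs quoted in the text: E592K (and ASXL1) at 20%, DNMT3A at 2.3%.
e592k_cells = cell_fraction(0.20)
dnmt3a_cells = cell_fraction(0.023)
print(f"E592K cell fraction:  {e592k_cells:.1%}")
print(f"DNMT3A cell fraction: {dnmt3a_cells:.1%}")
print("Must share cells:", must_co_occur(0.20, 0.023))
```

Under these assumptions the two cell fractions sum to well under 100%, so the VAFs do not force the DNMT3A mutation into the E592K clone. A separate clone with wild-type SF3B1 is therefore possible, although a small subclone nested within the E592K clone would produce the same VAFs.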
Because our initial splicing analysis merely showed the absence of known SF3B1-mutant events in E592K cells, we next sought missplicing events specifically induced by this variant. To do so, we stably expressed the WT, K700E, and E592K mutations through lentiviral delivery in TF1 cells, isolated multiple independent clones for each genotype, and performed RNA-seq to quantify percent spliced in (PSI) values for all splice junctions (Fig. 6A). K700E produced a characteristic pattern of increased expression of junctions using alternative 3’ acceptors, with high-magnitude missplicing of genes like MAP3K7 and ZDHHC16, as expected. In contrast, the E592K mutation produced a fundamentally different pattern of missplicing, with its affected genes nonoverlapping with those misspliced by K700E, and vice versa. Western blotting showed that the level of exogenous FLAG-tagged SF3B1 was equal to or lower than that of endogenous SF3B1, not overexpressed above it (Fig. 6B). We then validated several missplicing events with endpoint and quantitative PCR assays. For K700E-specific junctions, this included missplicing of ZDHHC16, TMEM14C, and ABCB7 (Fig. 6B-C). For E592K-specific junctions, this included RAVER2, CEP43, NUTM2A-AS1, and EZH2 (Fig. 6D). Interestingly, the EZH2 missplicing event was an alternate acceptor in intron 12, creating a premature termination codon predicted to activate nonsense-mediated decay (Fig. 6E). We further validated the hotspot specificity of these events in two other cell contexts: transiently transfected HEK293T and stably transduced K562 cells (Supplementary Fig. 6). In all cases, K700E-dependent and E592K-dependent missplicing events were present and distinct. Of note, missplicing of TMEM14C and ABCB7 was recently shown to drive ring sideroblast formation in iPSCs derived from SF3B1-mutant MDS [5]. Both genes were clearly misspliced by K700E, but not by E592K, consistent with the lack of sideroblastic anemia in E592K patients. Together, these data show that the E592K variant of SF3B1 has a unique pattern of RNA missplicing.
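For readers less familiar with the PSI metric used above, the following minimal sketch shows one common junction-read definition for an alternative 3’ acceptor event: reads supporting the alternative junction divided by reads supporting either junction. The text does not describe the exact quantification pipeline, so this definition, the function names, and the read counts (made-up placeholders, not data from this study) are all assumptions for illustration.

```python
from statistics import mean


def psi(alt_reads, canonical_reads):
    """Percent spliced in (0-100) for the alternative junction.

    Returns None when no reads cover either junction.
    """
    total = alt_reads + canonical_reads
    if total == 0:
        return None
    return 100.0 * alt_reads / total


def delta_psi(mutant_clones, wt_clones):
    """Mean PSI across mutant clones minus mean PSI across wild-type clones.

    Each clone is an (alt_reads, canonical_reads) pair; clones with no
    coverage of the event are skipped.
    """
    def mean_psi(clones):
        values = [v for v in (psi(a, c) for a, c in clones) if v is not None]
        return mean(values)

    return mean_psi(mutant_clones) - mean_psi(wt_clones)


# Placeholder read counts for one junction in independent clones.
mutant = [(30, 70), (50, 50)]
wild_type = [(0, 100), (10, 90)]
print(f"delta PSI = {delta_psi(mutant, wild_type):.1f} percentage points")
```

A hotspot-specific event in this framing is one with a large delta PSI for one mutant genotype relative to wild type and a delta PSI near zero for the other.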
In addition to shared RNA missplicing events, the most well-studied SF3B1 hotspot mutations also share a specific biochemical defect: disruption of the interaction between SF3B1 and SUGP1 [17]. This disruption is not incidental to missplicing but directly mediates it; inactivation of SUGP1 recapitulates SF3B1-mutant missplicing and overexpression of SUGP1 partially rescues it [17]. We therefore asked whether E592K, with its nonoverlapping missplicing events, might preserve the interaction of SF3B1 with SUGP1. Indeed, when His6-FLAG-SF3B1 variants were introduced into HEK293T cells and affinity purified using anti-DYKDDDDK (FLAG) antibody and cobalt beads, the association of His6-FLAG-SF3B1 with endogenous SUGP1 was disrupted by K700E but not by wild-type or E592K SF3B1 (Fig. 7). We then asked whether E592K might instead disrupt the interaction between SF3B1 and PHF5A, given that E592 is at the interface with PHF5A [27]. However, this interaction was preserved by each His6-FLAG-SF3B1 variant (Fig. 7). We also observed these effects in a second cell context, TF1 cells expressing FLAG-SF3B1 variants (Supplementary Fig. 7). Together, these data indicate that, consistent with its induction of unique RNA missplicing events, the E592K variant does not participate in the disruption of the SF3B1-SUGP1 interaction that drives the cryptic splicing of other SF3B1 mutations.
Finally, we identified and analyzed RNA-seq from two patients with the E592K mutation, and compared it with RNA-seq from patients with other SF3B1 mutations, in the recent MLL cohort of spliceosome-mutant myeloid malignancy patients [28]. Inspection of the junctions validated in our cell models demonstrated the same specificity of missplicing in these primary patient samples when compared to other SF3B1 mutations (Fig. 8A). In addition, endpoint PCR validation of TMEM14C and RAVER2 from a third primary E592K sample at a separate institution demonstrated the same hotspot specificity of RNA missplicing (Fig. 8B), validating the findings from our cell models.