Histone H3, lysine 4 methylation in animals

Despite the discovery of histone proteins in the 1800s, we have only recently begun to appreciate the immense regulatory role that they play in DNA replication, repair, nuclear organization, and gene expression. Post-translational modifications of all four core histones, H2A, H2B, H3, and H4, are a major determinant of gene expression. Furthermore, post-translational modifications are dynamic and connected to signaling pathways that respond to intracellular and extracellular cues. In this review, we focus on histone H3 lysine 4 methylation (H3K4me) in hematopoietic lineage cells, both in normal physiologic settings and in hematologic malignancies.

What is the function of this particular H3 modification? First, H3K4 can be mono-, di-, or tri-methylated, with all three modifications co-existing and distributed distinctly, not just representing transition states. H3K4me3/2/1 is associated with euchromatin rather than heterochromatin but plays several roles in active chromatin. Genome-wide chromatin immunoprecipitation-sequencing experiments (ChIP-seq) have illustrated that H3K4me3 is enriched at transcription start sites (TSS) of active genes (Fig. 1) in an asymmetric manner [1–3]. Recent studies examining the breadth of the H3K4me3 enrichment (rather than the absolute enrichment) across the TSS of genes have indicated that transcriptional fidelity and elongation may be enhanced at these genes [4, 5].
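The distinction between the breadth and the absolute level of H3K4me3 enrichment can be made concrete with a minimal sketch. It assumes a binned signal track around a TSS; the function name `peak_height_and_breadth`, the threshold, and the example values are hypothetical illustrations, not data or methods from the cited studies.

```python
from typing import Sequence, Tuple


def peak_height_and_breadth(signal: Sequence[float], threshold: float) -> Tuple[float, int]:
    """Return (maximum signal, length in bins of the longest contiguous run above threshold)."""
    height = max(signal) if signal else 0.0
    longest = current = 0
    for value in signal:
        if value > threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return height, longest


# Hypothetical per-bin H3K4me3 signal around two TSSs (arbitrary units).
sharp_peak = [0, 1, 9, 10, 9, 1, 0, 0, 0, 0]
broad_peak = [0, 4, 5, 6, 6, 5, 5, 4, 4, 0]

sharp = peak_height_and_breadth(sharp_peak, threshold=3)
broad = peak_height_and_breadth(broad_peak, threshold=3)
print("sharp:", sharp)
print("broad:", broad)
```

In this toy example, the first locus has the higher absolute enrichment while the second has the broader domain, so the two measures rank the loci differently.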

Fig. 1

H3K4 methylation patterns at four distinct categories of promoters. “Active expressed” gene promoters are enriched for H3K4me3 modification adjacent to the TSS, with the dip at the start site reflecting a nucleosome free region. “Tissue-specific expressed” genes are enriched for H3K4me2 from the promoter region into the gene body, and typically have an active expressed gene pattern of H3K4me3 as well. “Repressed/poised” gene promoters are associated with broad H3K4me1 across the entire promoter along with the H3K27me3 repressive mark; upon activation, H3K4me1 is converted to H3K4me3 and the repressive mark is replaced. “Bivalent” promoters are characterized by the coexistence of H3K27me3 and H3K4me3 and are primed for lineage-restricted expression or permanent repression. The dashed arrow at the TSS indicates no/low transcriptional output

H3K4me2 also associates with the TSS, but is distributed distinctly in the gene body depending on the class of genes assessed. For example, a meta-analysis of ChIP-seq data identified three classes of genes based on H3K4me2 distribution, with a particular gene-body-biased distribution identifying highly regulated tissue-specific genes (Fig. 1; [6]). In contrast, H3K4me1 is enriched at enhancer elements distant from the promoter, representing a defining feature of these regulatory elements [7].

In addition, recent studies have illuminated a new class of poised, tissue-specific genes harboring H3K4me1 that extends broadly into the gene body (Fig. 1). These promoters are also enriched for H3K27me3 (a modification associated with silenced genes), but lack H3K4me3. These genes initially are not transcriptionally active, but undergo a conversion of H3K4 to the tri-methylated state upon differentiation and induction. This conversion then allows recruitment of H3K4me2/3-binding proteins that enhance gene expression, such as ING proteins [8].

Beyond the correlation with euchromatin, H3K4me3 is a defining feature of a regulatory state of a promoter termed “bivalent”. Originally defined in embryonic stem (ES) cells by the co-occurrence of H3K4me3 and H3K27me3 at histones surrounding the TSS, this combination of modifications reflects a state in which a gene is poised to become rapidly transcribed upon a lineage choice, but is currently not expressed or expressed at a low level [9]. The above two poised transcriptional states differ by the cell type in which they have been described as well as the enzymes and reader proteins involved; however, both may represent similar transitional states.
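The four promoter categories described above and in Fig. 1 can be summarized as a small decision rule. The sketch below is a deliberately simplified, presence/absence reading of those descriptions; the mark labels (such as `H3K4me1_broad`) and the function name are hypothetical, and real ChIP-seq data would require quantitative thresholds that are not specified here.

```python
from typing import Set


def classify_promoter(marks: Set[str]) -> str:
    """Assign a promoter to one of the four Fig. 1 categories from the marks present."""
    has_me3 = "H3K4me3" in marks
    has_k27 = "H3K27me3" in marks
    if has_me3 and has_k27:
        # Co-occurrence of the active and repressive marks at the TSS.
        return "bivalent"
    if has_k27 and "H3K4me1_broad" in marks:
        # Broad H3K4me1 with H3K27me3, lacking H3K4me3.
        return "repressed/poised"
    if has_me3 and "H3K4me2_gene_body" in marks:
        # H3K4me2 extending into the gene body on top of TSS H3K4me3.
        return "tissue-specific expressed"
    if has_me3:
        return "active expressed"
    return "unclassified"


examples = {
    "gene_a": {"H3K4me3"},
    "gene_b": {"H3K4me3", "H3K4me2_gene_body"},
    "gene_c": {"H3K4me1_broad", "H3K27me3"},
    "gene_d": {"H3K4me3", "H3K27me3"},
}
for name, marks in examples.items():
    print(name, "->", classify_promoter(marks))
```

The order of the checks matters: the bivalent test comes first because H3K4me3 alone would otherwise match the active category.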

What do these “flavors” of chromatin do to gene regulation or other aspects of chromatin organization? Since H3K4me modification does not affect the charge or overall structure of the histone octamer, proteins that recognize the various methylation states in the histone tail have been sought (H3K4 “readers”). Many proteins that recognize each of the H3K4 methylation states have been identified, leading to a clearer picture of the variety of roles for H3K4 modifications [10]. For example, the plant homeodomain (PHD) finger of TAF3 (an integral subunit of the TFIID Pol II transcription initiation complex) binds H3K4me3 with high affinity, helping to recruit the pre-initiation complex to previously active TSSs [11]. In addition, a link between splicing and transcription has been attributed to the histone reader CHD1, which binds H3K4me3 via its tandem chromo-domains. This interaction bridges key spliceosomal components to chromatin to regulate pre-mRNA splicing [12]. In the tissue-specific repressed/poised promoters cited above [8], H3K4me1 prevents ING protein binding, but upon differentiation of myoblasts, conversion to H3K4me2/3 at these promoters allows for the binding of ING protein complexes that enable transcriptional activation. Finally, a non-transcriptional function of H3K4me3-enriched histone octamers is that B- and T-cell antigen-receptor recombination requires H3K4me3 recognition by the PHD domain of the RAG2 protein [13]. Binding to the antigen-receptor loci then enables gene rearrangement. The topic of chromatin readers interacting with H3K4 in its unmodified and methylated forms has been reviewed extensively [10, 14, 15], but these few examples illustrate that the distinct H3K4 modifications can recruit distinct activities that alter transcriptional status, chromatin access, transcriptional fidelity, or splicing outcome.

SET/MLL family histone methyltransferases share common enzymatic components but interact with distinct targeting components

The conserved Su(var)3-9, Ezh2, Trithorax (SET) domain in this family of proteins is similar to that of S. cerevisiae SET1 [16, 17]. SET1 is an H3K4 methyltransferase and is responsible for all three methylation states (mono-, di-, and tri-) of H3K4 [17–19]. There are three pairs of SET/MLL family members in mammals (Fig. 2): SETd1A/SETd1B, MLL1/MLL2, and MLL3/MLL4 [20]. Although the nomenclature has been confused between MLL2 and MLL4, we use here MLL2 for the protein encoded by the Wbp7 or Kmt2b gene. MLL5, although harboring a SET domain, is enzymatically inactive [21] and so is not included in this discussion.

Fig. 2

SET/MLL family structure and components. Motifs and interacting partners unique to the paralog groups or those shared with all members are identified in the legend. The general structure of each paralog group is diagrammed with the members indicated at the left of each diagram

Members of this SET/MLL family exhibit relatively moderate-to-weak enzymatic activity as recombinant domains in vitro, but this activity is markedly and variably enhanced by association with core subunits shared by the entire family. The minimal subunit complex is composed of WDR5, RbBP5, ASH2L, and DPY30 (WRAD, Fig. 2), which can enhance the histone methyltransferase (HMT) enzymatic activity of recombinant SET domains of all six family members [22–26]. Of note, most groups find a significant H3K4 mono- and di-methyl activity using recombinant proteins and histone peptide substrates, but some discrepancies exist when tri-methylation is assessed; these may be due to the use of bacterial versus eukaryote-produced complexes or due to the use of distinct substrates (peptides versus nucleosomes) in the assays. For example, the Cosgrove group tested each bacterially produced core SET/MLL complex (with WRAD) in vitro and found that MLL1/2 are predominantly mono- and di-methyltransferases and MLL3/4 act mostly as mono-methyltransferases, whereas SETd1A/B can catalyze all three methylation states, similar to the yeast ortholog [25]. In addition to allosteric effects of WRAD on methyltransferase activity, the WDR5 component may help target the complex to chromatin. WDR5 interacts with histones [27] and lncRNAs [28], and thus is likely involved in stabilizing chromatin interactions. In summary, interaction with these core complex components is clearly involved in determining the specificity of the individual enzymes, but how relative dependencies or modulations of the complex components in vivo translate to the distinct activities on target genes is a challenging question not easily answered with recombinant proteins.

Beyond the shared components, each of the SET/MLL members also participates in unique protein interactions. For example, the N-terminus of both MLL1 and MLL2 binds to a Menin/LEDGF subcomplex (Fig. 2). This subcomplex plays a major role in chromatin targeting through recognition of H3K36me2 by LEDGF (Fig. 2) [29–31]. MLL3 and MLL4, on the other hand, were originally purified in the Activating Signal Cointegrator Complex (ASCOM), and in addition to not binding Menin/LEDGF, they bind to PTIP, UTX, PA1, and NCOA6 [32–35]. SETd1A and SETd1B have the unique partners WDR82 and CFP1, which do not interact with other SET/MLL enzymes (Fig. 2). WDR82 bridges the SET complexes to Ser5-phosphorylated Polymerase II, and thus to actively transcribed genes, and also contributes to the stability of the protein complex [36]. Interestingly, MLL1 and MLL2 have a domain (CXXC, Fig. 2) that binds to non-methylated CpGs, whereas CFP1 accomplishes a similar targeting function in trans for the SET complexes. These examples of unique components for the paralog groups are important, as targeting these protein interactions has been pursued pharmacologically and through knockout approaches (discussed below). Many additional protein interaction partners have been described, which are unique to individual members or paralog groups, but these are either tissue-specific or not integral components of the corresponding complexes.
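The shared and unique complex components described in this section can be captured in a small lookup structure, which makes it easy to ask which enzymes would be touched by perturbing a given subunit. This is only an organizational sketch of the partners named in the text; the variable and function names are hypothetical, and the partner lists are not exhaustive.

```python
# Core subunits shared by all six family members (WRAD).
CORE_WRAD = frozenset({"WDR5", "RbBP5", "ASH2L", "DPY30"})

# Partners described in the text as unique to each paralog group.
UNIQUE_PARTNERS = {
    ("SETd1A", "SETd1B"): frozenset({"WDR82", "CFP1"}),
    ("MLL1", "MLL2"): frozenset({"Menin", "LEDGF"}),
    ("MLL3", "MLL4"): frozenset({"PTIP", "UTX", "PA1", "NCOA6"}),
}


def complex_subunits(enzyme: str) -> frozenset:
    """Return the core plus paralog-specific subunits for one enzyme."""
    for members, partners in UNIQUE_PARTNERS.items():
        if enzyme in members:
            return CORE_WRAD | partners
    raise KeyError(f"unknown SET/MLL family member: {enzyme}")


def enzymes_containing(subunit: str) -> list:
    """List the family members whose complex includes the given subunit."""
    return sorted(
        enzyme
        for members in UNIQUE_PARTNERS
        for enzyme in members
        if subunit in complex_subunits(enzyme)
    )


print(enzymes_containing("DPY30"))
print(enzymes_containing("CFP1"))
```

Querying a core subunit such as DPY30 returns all six enzymes, whereas a group-specific partner such as CFP1 returns only the SETd1A/SETd1B pair, which mirrors the logic behind the knockout comparisons discussed later.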

Distinct functions of SET/MLL family methyltransferases in hematopoiesis and leukemia

MLL1 and MLL2 functions in hematopoiesis

Loss-of-function studies in the mouse were initiated before it was even clear how many family members existed or what biochemical activities these genes encoded. For example, the first mouse knockouts disrupting Mll1 were published from 1995 to 2001 [37–39], followed by the generation of conditional knockout alleles due to the early lethality of the germline knockouts [40–42].

Germline knockout studies established the following principles: first, Mll1 was indeed functionally conserved with its Drosophila counterpart, Trithorax, in that it maintained proper expression domains of several clustered homeodomain (Hox) genes [37]. This functional similarity also placed MLL1 in a well-described genetic network in which Polycomb orthologs were predicted to oppose its function [20]. Homozygous knockout embryos from the different laboratories exhibited different penetrance, with lethality occurring anywhere from pre-implantation to E12.5, but all groups noted homeotic transformation of the axial skeleton and mis-expression as well as reduced expression of Hox cluster genes in mutant embryos or mice [37–39]. Second, reduced production of blood lineages was noted in these germline knockouts, mainly manifesting as small fetal livers, reduced colony forming units (CFU) from fetal liver and yolk sac cells, and reduced engraftment activity from cells of the aorta-gonad-mesonephros (AGM) region of knockout embryos, where the earliest definitive hematopoietic stem cells (HSCs) arise [38, 43, 44]. However, none of these studies addressed whether reduced hematopoiesis reflected a cell-intrinsic function of MLL1, since knockout HSCs developed in a mutant environment in germline knockout embryos. Years later, newly engineered mouse models addressed these questions and more precisely defined tissue-specific functions.

The use of conditional deletion for Mll1 naturally first focused on hematopoietic-specific functions [40, 41, 45]. The number and variety of fetal hematopoietic populations in Vav-Cre-mediated (developmentally regulated, hematopoietic-specific) knockout fetal liver cells implied that defective endothelial cells or other support cells contributed to the severity of hematopoietic phenotypes observed in the germline knockouts [38, 41, 43, 45]. Specifically, hematopoietic stem and progenitor cells (HSPCs) are numerically and phenotypically normal in Vav-Cre-mediated Mll1−/− mid-gestation fetal livers, whereas the corresponding assays using germline knockout fetal liver cells showed a significant reduction in HSPC numbers and function [38]. Nonetheless, even these fetal liver Mll1−/− HSPCs fail completely to engraft adult recipient animals [41, 45]. Thus, these studies demonstrated that Mll1 was required in a hematopoietic-intrinsic manner for the development of adult-transplantable HSCs [41, 45].

Further studies using inducible knockout strategies revealed that Mll1−/− HSCs exhibited many specific defects, including failure to maintain G0, lack of self-renewal without loss of differentiation, widespread transcriptional deregulation, and complete failure to engraft recipients. Interestingly, distinct alleles produced milder results (no reduction in steady-state adult bone-marrow populations), suggesting a difference in severity of the alleles [40, 41].

The role of the HMT activity in these severe Mll1−/− hematopoietic phenotypes was addressed by deleting the SET domain harboring the catalytic site. These knockout animals survive as homozygous mutants, and the initial report showed altered Hox gene expression in embryos and some gene-specific reductions in H3K4me1 [46]. Remarkably, adult homozygous mutant animals exhibited no overt hematopoietic or developmental phenotypes. In the hematopoietic system, no changes in gene expression, histone modifications, or function of HSCs could be demonstrated in animals lacking the MLL1 HMT activity [47]. These data revealed that although MLL1 plays an important and non-redundant role in maintaining hematopoietic stem and progenitor cells, it does so with activities other than its HMT activity.

Although Mll1 does not appear to be involved in the homeostasis of mature T cells, several lines of evidence show that it plays a role in immunity and T-cell function. Mll1 heterozygous animals on a BALB/c background exhibited initially normal effector TH2 responses and gene expression, but failed to maintain expression of TH2 cytokines and Gata3 upon antigen recall [48]. Furthermore, animals in which Mll1 is deleted only in CD4+ T cells fail to control helminth infections and fail to maintain IL-4 expression (Colby Zaph, Monash University, personal communication). The precise genes and mechanisms controlling these responses in T cells have not yet been elucidated. Given the wealth of recent information regarding epigenetic regulation of T-cell differentiation [49], this avenue of investigation promises to yield important mechanistic insights.

MLL2, like MLL1, is expressed broadly. The two genes are paralogs likely originating from gene duplication, and both are considered Trithorax orthologs due to the overall similarity and domain structure [50]. In contrast to MLL1, MLL2 is never involved in chromosomal translocations, and in fact, the N-terminus of MLL2 cannot substitute for MLL1 in leukemia oncoproteins, likely due to differences in the CXXC domain [51–53]. Germline deletions in Mll2 result in delayed development from at least E8 and lethality by E10.5 with neural tube defects and widespread apoptosis. MLL2 expression increases during oocyte maturation, and is in fact required for oocyte survival; remarkably, MLL2-deficient oocytes exhibit global loss of H3K4me3 and H3K4me2 and a slight increase in H3K4me1 [54]. This is the first tissue in which MLL2 was shown to be the dominant H3K4me2/3 methyltransferase; thus, co-expression of other family members in oocytes apparently cannot compensate for loss of MLL2. Despite this major role in oocytes, complete deletion of Mll2 after E10.5 had no effect on the hematopoietic system or other organ homeostasis, as determined using a tamoxifen-inducible Rosa-ERT2-Cre system [50].

One additional hematopoietic cell type provided a significant mechanistic insight into how Mll2 functions in vivo and how it regulates target genes. Using the Rosa-ERT2-Cre animals described above, Mll2−/− bone marrow was used to differentiate macrophages in vitro, which were phenotypically indistinguishable from their wild-type counterparts. When challenged with lipopolysaccharide (LPS), which activates TLR4 signaling, these cells exhibited a specific defect in the induction of NFκB target genes. Surprisingly, this was not due to a direct effect on NFκB-dependent genes, but rather to the specific underexpression of Pigp, a gene required for the complex that adds glycosylphosphatidylinositol (GPI) anchors to transmembrane proteins. Without this enzyme, CD14 was reduced at the cell surface, and without its function as an LPS co-receptor, TLR4 signaling was greatly reduced [55]. Pigp and several other MLL2-regulated genes were identified in this study, and all exhibited reduced H3K4me3 enrichment around their promoters, but a global reduction in H3K4me3 was not reported. Interestingly, most of these promoters also exhibited an increase in H3K27me3, a mark associated with repressed and bivalent genes, suggesting that one role of the H3K4me3 enrichment is to resist Polycomb complex-mediated repression. On the other hand, many other genes that were found to be hypo-methylated at H3K4 were not changed in expression, suggesting that the role of MLL2 in maintaining its target genes is in some cases HMT-dependent and in other cases mediated through other transcriptional effects of MLL2. Alternatively, the expression of some genes may simply not be as sensitive to H3K4me3 depletion at the promoter. This latter concept is supported by the fact that 90 % of the genes exhibiting reduced H3K4me3 were not altered in transcript level [55].

MLL1 and MLL2 functions in leukemia

As discussed above, despite the similarity in gene structure and overlapping gene expression patterns, MLL1 and MLL2 have diversified to function differently (with respect to the requirement for HMT activity) and do regulate distinct sets of genes. These differences may underlie the fact that only MLL1 is involved in chromosomal translocations in leukemia. The mechanisms by which MLL-fusion oncoproteins transform cells have been reviewed extensively [56–59], but the contribution of the non-translocated, wild-type allele remains controversial. Several early studies suggested that the remaining wild-type MLL1 molecule was required for maintaining MLL-AF9 initiated leukemia or that targeting the HMT activity of MLL1 would be therapeutically beneficial [60, 61]. However, studies using the HMT-inactive SET domain-deleted Mll1 bone-marrow cells demonstrated that leukemia initiated by MLL-fusion oncoproteins was absolutely unimpeded in primary or secondary recipients of leukemia cells [47]. Furthermore, CRISPR-Cas9-mediated deletion of the SET domain in transformed cells did not impair their growth, suggesting that the HMT activity or possibly any activity of endogenous MLL1 is not needed for the fusion oncoproteins to transform cells [62]. The role of MLL2 in leukemia thus far is unknown, as is the issue of redundancy with MLL1 in MLL-AF9 leukemia.

MLL3 and MLL4 in hematopoiesis, leukemia, and lymphoma

MLL3 and MLL4 in mammals were identified as HMTs in the ASCOM complex, which is a coactivator complex associated with nuclear receptors [32], but have since been demonstrated to interact with a wide variety of transcription factors in a similar complex (reviewed in [63]). Knockout of either gene is embryonic or perinatal lethal, but there is also evidence for redundancy between these genes from studies of adipogenesis and myogenesis, since co-deletion of both was required to observe a global decrease in H3K4me1. Thus, in preadipocytes, MLL3 and MLL4 were responsible for the majority of H3K4me1/2 but not me3 [34]. It is not yet clear if both enzymes are collectively responsible for all H3K4me1 at enhancers in all tissues. Finally, their enzymatic activity has also been associated with the repressive H3K4me1 at promoters discussed above [8].

MLL3 was also identified within a region deleted in 7q− acute myelogenous leukemia (AML) [64], and interest in MLL3/MLL4 roles in hematopoiesis was revived when MLL4 was discovered to be mutated in 30–90 % of diffuse large B-cell lymphoma (DLBCL) and follicular lymphoma (FL) [65, 66]. Most genomic alterations were predicted to be loss-of-function mutations, with about half mono-allelic and half bi-allelic. Therefore, the role of MLL4 as a tumor suppressor was explicitly tested in several murine lymphoma models. First, knockdown of MLL4 in the context of a Vav-Bcl2 transgene or deletion of MLL4 in an activation-induced cytidine deaminase (AID) overexpression model accelerated lymphomagenesis, and even B-cell specific deletion of MLL4 alone resulted in lymphoma development [67, 68]. In normal B-cell differentiation, loss of MLL4 increased steady-state transitional B-cell numbers, enhanced germinal center formation, and enhanced proliferation in response to CD40 stimulation [66, 68]. Furthermore, in B-cells, the deletion of MLL4 is sufficient to observe a global reduction in H3K4me3/2/1 [68]. Therefore, MLL4 functions in a non-redundant manner to suppress mature B-cell proliferation and acts as a tumor suppressor in FL and DLBCL.

The roles of the MLL3 and MLL4 proteins in hematopoiesis appear generally similar, as either knockdown of MLL3 or knockout of MLL4 in hematopoietic stem/multipotent progenitor cells (HSPCs) results in increased HSPC numbers [69, 70]. However, MLL4−/− HSPCs exhibit reduced engraftment in secondary recipients, accompanied by high levels of reactive oxygen species [69]. These defects were attributed to the loss of MLL4-dependent genes that regulate and protect from oxidative stress. In the case of AML driven by an MLL-fusion oncoprotein, MLL4 loss limited leukemogenesis [69], in contrast to the tumor-suppressive role of MLL3 in the 7q− AML model [70] or the role of MLL4 in FL and DLBCL discussed above. These distinct tumor-enabling or tumor-suppressive roles likely reflect the different cellular contexts and target genes in AML versus lymphoma.

SETd1A/SETd1B in hematopoiesis and leukemia

Despite the fact that SETd1A and SETd1B are likely to be the most broadly acting, transcription-coupled H3K4 methyltransferases, very little was known until recently about the in vivo functions of these enzymes compared to their paralog MLL1. Both SETd1A and SETd1B knockout embryos develop to blastocysts and implant, but SETd1A blastocysts exhibit a defective inner cell mass and fail to produce ES cell clones. On the other hand, SETd1B null embryos exhibit growth defects from E7.5 and die before E11.5. In ES cells engineered with inducible (Rosa-ERT2) Cre, SETd1A deletion also produced a more severe phenotype, with growth arrest and cell death occurring as soon as protein reduction was observed. Commensurate with the reduction in SETd1A protein, a global reduction in H3K4me3/2/1 was observed in ES cells, but the mark was not completely depleted, suggesting that the remaining H3K4 methylation in ES cells is due to SETd1B or other methyltransferases. However, parallel deletion of SETd1B in the same study did not result in perceivable global H3K4 methylation changes [71]. These data illustrated that the SET proteins certainly play distinct roles in gene expression, but may also exhibit some redundancy in maintaining global H3K4me3 levels.

The Huang group has focused on the role of SETd1A in hematopoietic cells using an independently produced conditional knockout allele. Using the Mx1-Cre strain to inducibly delete SETd1A in bone-marrow cells, they noted reduced B-cell numbers, reflecting in particular a partial block in differentiation at the pro-B to pre-B-cell transition. It is not clear from these studies whether H3K4 methylation was globally reduced, but reduction of H3K4me3 at important B-cell regulators and the immunoglobulin heavy chain locus was suggested to underlie the block in differentiation [72]. Similarly, erythroid-specific SETd1A knockout partially blocked erythropoiesis, resulting in a mild anemia. Reduced expression of lineage regulators, such as Gata1 and Tal1, was noted, as was reduced H3K4me3 at the promoters of these genes. Although several sequence-specific transcriptional regulators have been implicated in the recruitment of SETd1A, including USF1 in this study, the rules for targeting this complex to specific lineage target genes are not yet clear [73].

Knockouts of several unique and shared SET/MLL complex components have been insightful, particularly in comparison with knockouts of the core HMTs. For example, upon knockout of CFP1 (also called CXXC protein; binds non-methylated CpGs) in murine ES cells, H3K4me3 peak reduction was observed at about 50 % of TSSs. Furthermore, H3K4me3 is acquired in nuclear territories and regions of the genome in which it does not appear in wild-type cells, resulting in ectopic transcription [74, 75]. Therefore, CFP1 is critical for directing H3K4 methylation activity to proper genomic locations. In addition, a global reduction in DNA methylation was also observed, most likely due to reduced expression of DNMT1 [76]. Differentiation cannot occur in these CFP1−/− ES cells, but rescue of the knockout ES cells with different CFP1 mutants indicates that either the PHD domain binding to H3K4me1/2/3 or the CpG binding activity is required to restore H3K4me distribution and ES differentiation [75]. One of the remarkable observations from CFP1−/− ES cells is that although the most highly expressed genes are those that suffer the greatest H3K4me3 loss, significant loss of transcription at those loci did not occur [74]. Thus, proper targeting of SETd1A/B complexes is required for differentiation and embryo development, but the relationship between transcription and H3K4me3 is not a simple one.

Specifically, within the hematopoietic system, knockout of CFP1 using the Mx1-Cre inducible transgene resulted in bone-marrow failure and animal lethality within 2 weeks, with loss of lineage-restricted progenitors and mature cells. Hematopoietic stem cells and primitive progenitors, however, expand in the absence of CFP1 [77]. Interestingly, deletion of SETd1A in HSPCs revealed a similar phenotype. Upon SETd1A deletion, an accumulation of short-term HSCs and multipotent progenitors (MPPs) was observed, but these cells were not able to engraft secondary recipients. Cell-cycle analysis of SETd1A−/− HSCs suggested an accumulation in the G1 phase of the cell cycle, consistent with the overall reduction in proliferation observed by Chun et al. ([72] and Claudia Waskow, Technical University of Dresden, personal communication).

Conditional knockout of the WRAD component, DPY30, resulted in both shared and distinct phenotypes compared with the SETd1A and CFP1 models above, likely due to the fact that DPY30 interacts with all six MLL/SET family members. In contrast to CFP1 loss, the overall levels of H3K4me3/2/1 were significantly reduced in bone-marrow HSPCs. However, Mx1-Cre-mediated deletion resulted in bone-marrow failure and animal death in the same time frame as CFP1 loss and a significant accumulation of LT-HSCs and ST-HSCs in Dpy30−/− bone marrow or cells in primary recipient bone marrow [78]. In this conditional knockout, however, neither abnormal cell proliferation nor decreased HSPC apoptosis was observed. Rather, the authors observed a block in HSC differentiation based on the inability of Dpy30−/− bone marrow to yield differentiated colonies in vitro or to differentiate post-transplant in vivo. Furthermore, despite the accumulation of HSCs in conditionally deleted Dpy30−/− bone marrow, in a chimeric setting, these cells fail to contribute long term to even the HSPC pool [78]. Gene expression studies reveal downregulation of HSC genes by gene-set enrichment analysis, but the identity of those genes is very distinct from those deregulated in the Mll1 knockout, which is the most comparably analyzed SET/MLL family member. The fact that loss of CFP1, SETd1A, or DPY30 each produces an increase in HSPCs distinguishes all three from the Mll1 or Mll2 hematopoietic knockouts, as does the limited gene expression data. Despite the overlapping enzymatic activity, it is very clear that the regulation and gene-specific action of these two paralog groups are very distinct from each other.

Discussion and future perspectives

Despite similar enzymatic activity, C-terminal cofactors, overall domain structure, and widespread expression, it is now clear that each of the SET/MLL family members performs unique biological functions in vivo. While this may not be surprising, it does present challenges if pharmacologic inhibition of a unique family member is desired. Although subtle features of the HMT activity requirements may be exploited for relatively selective targeting, it is still not clear in many cases whether the HMT activity of each of the family members is the key activity that should be targeted. Nonetheless, the powerful combination of biochemical/structural analyses and in vivo manipulation will likely clarify which activities of these protein complexes are critical in the particular biological setting. However, given the large size and wide variety of protein interactions of all of these complexes nucleated by the SET/MLL family member, this process will likely require significant investment.

Gene-targeting studies of some of the associated WRAD and N-terminal-interacting proteins have provided evidence that not all in vivo functions necessarily depend on these interacting partners. For example, a comparison of the CFP1, DPY30, and SETd1A loss-of-function models discussed above suggests that the loss of any of these proteins results in overlapping phenotypes in HSPCs. One might have predicted that Dpy30−/− bone marrow would exhibit a phenotype representing the loss of all six SET/MLL family members, but instead this knockout most resembles the SETd1A knockout ([40, 72, 78], Claudia Waskow, personal communication). These observations are consistent with the fact that the HMT activity of MLL1 is not required for its role in HSPCs, and thus this role may not be affected by DPY30 loss [47]. Similarly, hematopoietic-specific Menin knockout is not as severe as the corresponding Mll1 knockout [79], suggesting that this physical interaction is not required for all in vivo functions.

With respect to gene regulation, some mechanistic generalities can be made. Most evidence from multiple tissues, including the hematopoietic system, implicates the MLL3/MLL4 proteins in depositing H3K4me1 globally, a mark predominantly associated with enhancer function as well as with the recently described developmentally poised gene class (Fig. 1). In contrast, the MLL1/MLL2 and SETd1A/SETd1B families have been associated with H3K4me2/3 around TSSs, with the individual protein playing the dominant role highly dependent on the tissue surveyed. However, even global and gene-specific histone modification changes in the various knockout settings should be interpreted with caution. For example, the presence or absence of H3K4 demethylases in the individual tissues surveyed may have an enormous impact on the final H3K4me status. In addition, if H3K4 mono-methyltransferases are deleted, the observation of H3K4me2/me3 reduction may simply reflect the lack of an appropriate substrate for the di- and tri-methyltransferases to act upon; thus, an indirect effect may be misinterpreted as a direct effect on H3K4me2/3 activity.
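The substrate-availability caveat can be illustrated with a toy kinetic model. The sketch below assumes, purely for illustration, a strictly sequential scheme (me0 → me1 → me2 → me3) in which each step is catalyzed by a separate activity and a single demethylation rate reverses every step; the function name and all rate constants are arbitrary and hypothetical, not measured values.

```python
def simulate_h3k4(k_mono, k_di, k_tri, k_demeth=0.1, steps=2000, dt=0.01):
    """Euler integration of a sequential methylation model.

    Returns the final fractions (me0, me1, me2, me3), starting from
    fully unmethylated H3K4.
    """
    me0, me1, me2, me3 = 1.0, 0.0, 0.0, 0.0
    for _ in range(steps):
        # Forward (methylation) and backward (demethylation) fluxes.
        f1, f2, f3 = k_mono * me0, k_di * me1, k_tri * me2
        b1, b2, b3 = k_demeth * me1, k_demeth * me2, k_demeth * me3
        me0 += dt * (b1 - f1)
        me1 += dt * (f1 - b1 - f2 + b2)
        me2 += dt * (f2 - b2 - f3 + b3)
        me3 += dt * (f3 - b3)
    return me0, me1, me2, me3


# "Wild type": all three activities present (arbitrary rates).
wild_type = simulate_h3k4(k_mono=1.0, k_di=1.0, k_tri=1.0)
# "Mono-methyltransferase knockout": di- and tri-methylation activities are untouched.
mono_ko = simulate_h3k4(k_mono=0.0, k_di=1.0, k_tri=1.0)

print("wild type me3 fraction:", round(wild_type[3], 3))
print("mono-KO   me3 fraction:", round(mono_ko[3], 3))
```

In this toy setting, H3K4me3 never accumulates in the knockout even though the tri-methylation rate is unchanged, which is the indirect effect described above. Changing `k_demeth` also shifts the final distribution, in line with the point about demethylases.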

One important question to answer in the future is how each of these complexes is targeted to act at specific loci in the genome, and whether there are scenarios in which generic maintenance of the histone mark is important for maintaining gene expression and can be accomplished by multiple enzymes. Although a few cases of tissue-specific transcriptional regulators recruiting individual complexes to particular promoter/enhancer elements have been reported, this aspect of SET/MLL protein function remains an area of active investigation. Given the number of chromatin-binding motifs in all of the family members (Fig. 2), the targeting of SET/MLL complexes even within a particular cell type is likely combinatorial between recruitment by tissue-specific transcriptional regulators and localized chromatin modifications, as exemplified by the non-methyl CpG binding activities.

With respect to the individual roles of these family members in hematologic malignancies, MLL1 and MLL3/MLL4 are the members for which a significant body of literature documents involvement: MLL1 is disrupted by chromosomal translocations, MLL3 is deleted in 7q− AML, and MLL4 acts as a tumor suppressor in lymphoma. Nonetheless, it is likely that all family members contribute to other hematologic malignancies, but the specific roles are just beginning to be worked out using the genetic models discussed here.