Regular article
Comparative and functional analyses of LYL1 loci establish marsupial sequences as a model for phylogenetic footprinting☆
Introduction
As genome sequencing projects multiply and the data arising from them continue to grow, developmental and evolutionary biology are becoming increasingly integrated, and the opportunities for comparing DNA sequence between species are expanding. Different combinations of species are being used in comparative sequence analysis to answer various biological questions. Examples include the use of pufferfish sequence to identify human genes [1], the identification and functional characterization of Drosophila counterparts to human disease genes [2], and the elucidation of transcriptional networks from the examination of genomic sequences from two related sea urchin species [3]. The utility of each approach largely depends on the phylogenetic distances between the species involved.
It has been suggested that differences in gene regulation make a major contribution to biological diversity between organisms, in some circumstances playing an even greater role than coding region differences [4]. However, certain biological processes, e.g., hematopoiesis, have been markedly preserved throughout evolution and share similar regulatory mechanisms. An important means of gene regulation is at the level of transcription. This is largely encoded in the primary sequence, in the form of regulatory motifs comprising, for example, transcription factor binding sites, matrix attachment regions, and insulator regions. Our ability to identify these regions and predict their function from the primary sequence is extremely limited [5]. One possible solution to the problem is to analyze the noncoding regions of genes using comparative sequence analysis to identify putative regulatory elements of genes with conserved function [6], [7], which can then be subjected to confirmatory functional testing.
We have studied the regulation of the stem cell leukemia (SCL) family of genes. These encode basic helix–loop–helix (bHLH) transcription factors and were identified on the basis of translocations in T cell acute leukemia [8], [9], [10], [11], [12]. The best studied of these is SCL, which functions as a critical regulator of hematopoiesis and vascular development in a highly conserved manner in all species studied, from mammals to teleost fish [reviewed in 13]. We have cloned SCL loci from five vertebrate species, plus a new family member, SLP1, from the pufferfish, Fugu rubripes [14], [15]. Long-range sequence comparisons demonstrated multiple peaks of human/mouse homology, a subset of which corresponded precisely with known SCL enhancers [16]. By contrast, comparisons between mammalian and chicken sequences identified some, but not all, SCL enhancers [16]. Thus, human/mouse comparisons appear to produce “false positives,” whereas human/chicken comparisons produce “false negatives.” As the evolutionary distance across which comparisons are made is increased, the ability to identify known regulatory regions deteriorates [14].
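The trade-off described above can be made concrete with a minimal sketch in Python, which is not part of the original analysis. It assumes that homology peaks and known enhancers are both available as half-open coordinate intervals on one reference sequence; the function names `overlaps` and `score_peaks` and all coordinates are hypothetical and serve only as illustration.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one reference sequence


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def score_peaks(peaks: List[Interval], enhancers: List[Interval]):
    """Split peaks and enhancers into hits, false positives and false negatives."""
    # Peaks that coincide with a known enhancer.
    true_pos = [p for p in peaks if any(overlaps(p, e) for e in enhancers)]
    # Peaks with no known enhancer underneath them.
    false_pos = [p for p in peaks if p not in true_pos]
    # Known enhancers that no peak detects.
    false_neg = [e for e in enhancers if not any(overlaps(p, e) for p in peaks)]
    return true_pos, false_pos, false_neg


# Hypothetical coordinates, for illustration only (not data from the study).
enhancers = [(100, 200), (500, 600), (900, 1000)]
close_species_peaks = [(90, 210), (300, 350), (480, 620), (700, 750), (890, 1010)]
distant_species_peaks = [(110, 190)]

for label, peaks in [("close", close_species_peaks), ("distant", distant_species_peaks)]:
    tp, fp, fn = score_peaks(peaks, enhancers)
    print(label, "hits:", len(tp), "false positives:", len(fp), "false negatives:", len(fn))
```

In this toy setting the closely related comparison recovers every enhancer but also reports extra peaks, whereas the distant comparison reports no extra peaks but misses enhancers, which mirrors the pattern reported for human/mouse and human/chicken comparisons.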
The evolutionary divergence of humans and mice is thought to have occurred around 80 million years (MYr) ago [17]. This compares to a phylogenetic distance of about 300 MYr between humans and chickens. We therefore reasoned that it might be informative to analyze across an intermediate evolutionary distance. Marsupials diverged from the eutheria approximately 130–160 MYr ago [17], [18], and a member of this infraclass would thus be a good candidate for an intermediate species.
In the current article, we describe the screening of a bacterial artificial chromosome (BAC) library of the species Sminthopsis macroura, the stripe-faced dunnart. This is a small, Australian, Dasyurid marsupial. The Dasyurids are thought to have diverged from the Macropods (kangaroos and wallabies) about 50 MYr ago [18]. We further describe the identification, sequencing, and comparative analysis of a clone containing the genomic locus of the lymphoblastic leukemia-1 (LYL1) gene, a member of the SCL gene family. We demonstrate that, between human, mouse, and dunnart, gene and exon/intron structures are largely conserved and that protein sequences are remarkably similar. Noncoding homology is, as predicted, reduced in a human/dunnart sequence alignment compared to a human/mouse alignment. We propose that human/marsupial comparisons are indeed of use to identify regulatory sequences and illustrate this by a three-way local sequence alignment of the LYL1 promoter. This permits phylogenetic footprinting [19], [20], [21], [22], [23], which identifies a number of potential transcription factor binding sites, in a pattern intriguingly reminiscent of that seen in the SCL promoters and stem cell enhancer. Finally, we show that the mouse LYL1 promoter is highly active in myeloid progenitor cells and is bound in vivo by Fli1, Elf1, and Gata2. All these transcription factors have previously been shown to interact with the SCL stem cell enhancer.
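The idea behind phylogenetic footprinting on a three-way alignment can be sketched as follows. This is a simplified illustration in Python, not the method used in the article: it assumes the alignment is already available as equal-length rows with `-` for gaps, and it reports only ungapped runs that are identical in every species. The function name `conserved_blocks`, the `min_len` threshold, and the toy sequences are all hypothetical.

```python
from typing import List, Tuple


def conserved_blocks(aligned: List[str], min_len: int = 6) -> List[Tuple[int, int, str]]:
    """Return (start, end, motif) for ungapped runs identical in every row.

    `aligned` holds equal-length rows of a multiple alignment, '-' marks a gap.
    Coordinates are half-open alignment columns.
    """
    if len({len(row) for row in aligned}) != 1:
        raise ValueError("all aligned rows must have the same length")
    rows = [row.upper() for row in aligned]
    length = len(rows[0])
    blocks = []
    start = None
    # Iterate one column past the end so that a run reaching the end is closed.
    for col in range(length + 1):
        column = {row[col] for row in rows} if col < length else set()
        identical = len(column) == 1 and "-" not in column
        if identical and start is None:
            start = col
        elif not identical and start is not None:
            if col - start >= min_len:
                blocks.append((start, col, rows[0][start:col]))
            start = None
    return blocks


# Toy aligned sequences (invented, not LYL1 promoter sequence). They contain a
# GATA-like core and a TTCC core (reverse complement of the Ets GGAA core).
human = "AAGATAAGGTC-ACTTCCTGTA"
mouse = "TAGATAAGCTCGACTTCCTGAA"
dunnart = "CAGATAAGGACGACTTCCTGCA"

for start, end, motif in conserved_blocks([human, mouse, dunnart]):
    print(start, end, motif)
```

Blocks found this way are only candidates: real footprinting would tolerate some mismatches, and any motif identified still requires the kind of functional testing described in this article.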
Results
We set out to assess the value of marsupial genomic sequence for the purposes of identifying candidate regulatory elements. Our goal was thus to identify marsupial members of the SCL gene family and perform comparative analysis with their human and mouse orthologues.
Discussion
Comparative sequence analysis is a powerful tool for identifying critical regulatory motifs within genomic DNA sequences on the basis that evolutionary selective pressure leads to the preferential conservation of functional elements [6], [7]. An important, unresolved issue that remains, however, is the phylogenetic distance across which to compare sequences. If the species are too closely related, there is a risk of loss of specificity, and it can be difficult to prioritize confirmatory
Library preparation
We prepared DNA from the liver of a 20-week-old male stripe-faced dunnart (obtained from The Melbourne Royal Zoological Gardens under the La Trobe University Animal Ethics Committee Permit No. RP96/4/V6) and extracted it in agarose plugs as previously described [50]. This method avoids degradation of the DNA and maintains a large fragment size. We sent the DNA to the Ressourcenzentrum für Genomforschung (RZPD; Germany) where it was used to generate a BAC library by cloning into the NotI sites of
Acknowledgements
Work in the authors’ laboratories is supported by the Wellcome Trust, the Leukaemia Research Fund, and the British Heart Foundation. We thank Katrin Welzel at RZPD for her liaison regarding library construction, Bernhard Herrmann for the in situ hybridization pictures in Fig. 4B, Kirsten McLay for carrying out the shotgun sequencing, Elizabeth Gibson for creating the transposon libraries used to close the most difficult gaps in the sequences, and Sequencing Teams 40 and 47 at the Wellcome Trust
References (53)
- et al. Searching for regulatory elements in human noncoding sequences. Curr. Opin. Struct. Biol. (1997)
- Conserved noncoding sequences are reliable guides to regulatory elements. Trends Genet. (2000)
- et al. The SCL gene: from case report to critical hematopoietic regulator. Blood (1999)
- The pufferfish SLP-1 gene, a new member of the SCL/TAL-1 family of transcription factors. Genomics (1998)
- Phylogenetic footprinting of hypersensitive site 3 of the beta-globin locus control region. Blood (1997)
- Embryonic epsilon and gamma globin genes of a prosimian primate (Galago crassicaudatus). Nucleotide and amino acid sequences, developmental regulation and phylogenetic footprints. J. Mol. Biol. (1988)
- et al. Basic local alignment search tool. J. Mol. Biol. (1990)
- Differences in expression, actions and cocaine regulation of two isoforms for the brain transcriptional regulator NAC1. Neuroscience (2002)
- Expression of a novel immediate early gene during 12-O-tetradecanoylphorbol-13-acetate-induced macrophagic differentiation of HL-60 cells. J. Biol. Chem. (1991)
- et al. Syntaxin 10: a member of the syntaxin family localized to the trans-Golgi network. Biochem. Biophys. Res. Commun. (1998)
- The evolution of human chromosome 21: evidence from in situ hybridization in marsupials and a monotreme. Genomics
- Lineage-restricted regulation of the murine SCL/TAL-1 promoter. Blood
- Distinct mechanisms direct SCL/tal-1 expression in erythroid cells and CD34 positive primitive myeloid cells. J. Biol. Chem.
- Transcriptional regulation of the stem cell leukemia gene by PU.1 and Elf-1. J. Biol. Chem.
- Distinct 5′ SCL enhancers direct transcription to developing brain, spinal cord, and endothelium: neural expression is mediated by GATA factor binding sites. Dev. Biol.
- An improved approach for construction of bacterial artificial chromosome libraries. Genomics
- Estimate of human gene number provided by genome-wide analysis using Tetraodon nigroviridis DNA sequence. Nat. Genet.
- Comparative genomics of the eukaryotes. Science
- A genomic regulatory network for development. Science
- Intra- and interspecific variation in primate gene expression patterns. Science
- Genomic strategies to identify mammalian regulatory sequences. Nat. Rev. Genet.
- Chromosomal translocation in a human leukemic stem-cell line disrupts the T-cell antigen receptor delta-chain diversity region and results in a previously unreported fusion transcript. Proc. Natl. Acad. Sci. USA
- Two distinct mechanisms for the SCL gene activation in the t(1;14) translocation of T-cell leukemias. Genes Chromosomes Cancer
- The tal gene undergoes chromosome translocation in T cell leukemia and potentially encodes a helix–loop–helix protein. EMBO J.
- Chromosomal translocation involving the beta T cell receptor gene in acute leukemia. J. Exp. Med.
- TAL2, a helix–loop–helix gene activated by the (7;9) (q34;q32) translocation in human T-cell leukemia. Proc. Natl. Acad. Sci. USA
- ☆ Sequence data from this article have been deposited with the DDBJ/EMBL/GenBank Data Libraries under Accession No. AL731834.
- 1 Current address: Department of Medicine and Therapeutics, University of Glasgow, Glasgow G11 6NT, UK.