Elsevier

Genomics

Volume 81, Issue 3, March 2003, Pages 249-259

Regular article
Comparative and functional analyses of LYL1 loci establish marsupial sequences as a model for phylogenetic footprinting

https://doi.org/10.1016/S0888-7543(03)00005-3

Abstract

Comparative genomic sequence analysis is a powerful technique for identifying regulatory regions in genomic DNA. However, its utility largely depends on the evolutionary distances between the species involved. Here we describe the screening of a genomic BAC library from the stripe-faced dunnart, Sminthopsis macroura, formerly known as the narrow-footed marsupial mouse. We isolated a clone containing the LYL1 locus, completely sequenced the 60.6-kb insert, and compared it with orthologous human and mouse sequences. Noncoding homology was substantially reduced in the human/dunnart analysis compared with human/mouse, yet we could readily identify all promoters and exons. Human/mouse/dunnart alignments of the LYL1 candidate promoter allowed us to identify putative transcription factor binding sites, revealing a pattern highly reminiscent of critical regulatory regions of the LYL1 paralogue, SCL. This newly identified LYL1 promoter showed strong activity in myeloid progenitor cells and was bound in vivo by Fli1, Elf1, and Gata2—transcription factors all previously shown to bind to the SCL stem cell enhancer. This study represents the first large-scale comparative analysis involving marsupial genomic sequence and demonstrates that such comparisons provide a powerful approach to characterizing mammalian regulatory elements.

Introduction

As genome sequencing projects and the data they generate continue to grow, developmental and evolutionary biology are becoming increasingly integrated, and opportunities to compare DNA sequence between species are expanding. Different combinations of species for comparative sequence analysis are being used to answer various biological questions. Examples include the use of pufferfish sequence to identify human genes [1], the identification and functional characterization of Drosophila counterparts to human disease genes [2], and the elucidation of transcriptional networks from the examination of genomic sequences from two related sea urchin species [3]. The utility of each approach largely depends on the phylogenetic distances between the species involved.

It has been suggested that differences in gene regulation make a major contribution to biological diversity between organisms, in some circumstances playing an even greater role than coding region differences [4]. However, certain biological processes, e.g., hematopoiesis, have been markedly preserved throughout evolution and share similar regulatory mechanisms. An important means of gene regulation is at the level of transcription. This is largely encoded in the primary sequence, in the form of regulatory motifs comprising, for example, transcription factor binding sites, matrix attachment regions, and insulator regions. Our ability to identify these regions and predict their function from the primary sequence is extremely limited [5]. One possible solution to the problem is to analyze the noncoding regions of genes using comparative sequence analysis to identify putative regulatory elements of genes with conserved function [6], [7], which can then be subjected to confirmatory functional testing.

We have studied the regulation of the stem cell leukemia (SCL) family of genes. These encode basic helix–loop–helix (bHLH) transcription factors and were identified on the basis of translocations in T cell acute leukemia [8], [9], [10], [11], [12]. The best studied of these is SCL, which functions as a critical regulator of hematopoiesis and vascular development in a highly conserved manner in all species studied, from mammals to teleost fish [reviewed in 13]. We have cloned SCL loci from five vertebrate species, plus a new family member, SLP1, from the pufferfish, Fugu rubripes [14], [15]. Long-range sequence comparisons demonstrated multiple peaks of human/mouse homology, a subset of which corresponded precisely with known SCL enhancers [16]. By contrast, comparisons between mammalian and chicken sequences identified some, but not all, SCL enhancers [16]. Thus, human/mouse comparisons appear to produce “false positives,” whereas human/chicken comparisons produce “false negatives.” As the evolutionary distance across which comparisons are made is increased, the ability to identify known regulatory regions deteriorates [14].
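The "peaks of homology" in long-range comparisons such as those above are typically read off a sliding-window percent-identity profile of a pairwise alignment. The following is a minimal sketch of that idea, not the method used in the article (this excerpt does not name the software). The function names, window size, step, and 70% threshold are all assumptions chosen for illustration, and the toy sequences are invented.

```python
def windowed_identity(seq_a, seq_b, window=100, step=50):
    """Percent identity per sliding window over two pre-aligned sequences.

    Both strings must have equal length; '-' marks an alignment gap and
    never counts as a match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    profile = []
    for start in range(0, len(seq_a) - window + 1, step):
        pairs = zip(seq_a[start:start + window], seq_b[start:start + window])
        matches = sum(1 for a, b in pairs if a == b and a != "-")
        profile.append((start, 100.0 * matches / window))
    return profile


def homology_peaks(profile, threshold=70.0):
    """Start positions of windows whose identity meets the threshold (assumed cutoff)."""
    return [start for start, pct in profile if pct >= threshold]


# Toy, invented alignment (not data from the article).
human = "ACGTACGTACTTTTTTTTTT"
other = "ACGTACGTACGGGGGGGGGG"
profile = windowed_identity(human, other, window=10, step=10)
print(profile)
print(homology_peaks(profile))
```

With a closely related pair of species, many windows clear any reasonable threshold, which corresponds to the "false positives" problem noted above; with a distant pair, few windows do, and real elements can be missed.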

The evolutionary divergence of humans and mice is thought to have occurred around 80 million years (MYr) ago [17], compared with a phylogenetic distance of about 300 MYr between humans and chickens. We therefore reasoned that it might be informative to analyze sequences across an intermediate evolutionary distance. Marsupials diverged from the eutheria approximately 130–160 MYr ago [17], [18], and a member of this infraclass would thus be a good candidate for an intermediate species.

In the current article, we describe the screening of a bacterial artificial chromosome (BAC) library of the species Sminthopsis macroura, the stripe-faced dunnart. This is a small, Australian, Dasyurid marsupial. The Dasyurids are thought to have diverged from the Macropods (kangaroos and wallabies) about 50 MYr ago [18]. We further describe the identification, sequencing, and comparative analysis of a clone containing the genomic locus of the lymphoblastic leukemia-1 (LYL1) gene, a member of the SCL gene family. We demonstrate that, between human, mouse, and dunnart, gene and exon/intron structures are largely conserved and that protein sequences are remarkably similar. Noncoding homology is, as predicted, reduced in a human/dunnart sequence alignment compared to a human/mouse alignment. We propose that human/marsupial comparisons are indeed of use to identify regulatory sequences and illustrate this by a three-way local sequence alignment of the LYL1 promoter. This permits phylogenetic footprinting [19], [20], [21], [22], [23], which identifies a number of potential transcription factor binding sites, in a pattern intriguingly reminiscent of that seen in the SCL promoters and stem cell enhancer. Finally, we show that the mouse LYL1 promoter is highly active in myeloid progenitor cells and is bound in vivo by Fli1, Elf1, and Gata2. All these transcription factors have previously been shown to interact with the SCL stem cell enhancer.
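The core of phylogenetic footprinting can be sketched in a few lines: given a multiple alignment, report runs of columns that are identical in every species, since these are candidate binding sites. The sketch below is illustrative only. The function name `conserved_blocks`, the `min_len` cutoff, and the three toy sequences are assumptions and are not taken from the LYL1 alignment; the toy sequences were built so that the conserved runs contain a GATA core and an Ets-like GGAA core.

```python
def conserved_blocks(aligned, min_len=6):
    """Return (start, end, motif) for runs of columns identical in all sequences.

    `aligned` is a list of equal-length, pre-aligned strings; '-' is a gap
    and breaks a run. `end` is exclusive.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    blocks = []
    run_start = None
    # Iterate one position past the end so a run reaching the last column is closed.
    for i in range(length + 1):
        conserved = (
            i < length
            and aligned[0][i] != "-"
            and all(seq[i] == aligned[0][i] for seq in aligned)
        )
        if conserved and run_start is None:
            run_start = i
        elif not conserved and run_start is not None:
            if i - run_start >= min_len:
                blocks.append((run_start, i, aligned[0][run_start:i]))
            run_start = None
    return blocks


# Toy, invented three-way alignment (not sequence from the article).
human = "TTAGATAAGGCCGGAAGTCA"
mouse = "TCAGATAAGACCGGAAGTTA"
dunnart = "GAAGATAAGTCCGGAAGTGC"
blocks = conserved_blocks([human, mouse, dunnart], min_len=6)
print(blocks)
```

Adding the third, more distant species is what makes the filter selective: columns that agree between human and mouse only by shared recent ancestry tend to drop out once the dunnart sequence must also match.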

Results

We set out to assess the value of marsupial genomic sequence for the purposes of identifying candidate regulatory elements. Our goal was thus to identify marsupial members of the SCL gene family and perform comparative analysis with their human and mouse orthologues.

Discussion

Comparative sequence analysis is a powerful tool for identifying critical regulatory motifs within genomic DNA sequences, on the basis that evolutionary selective pressure leads to the preferential conservation of functional elements [6], [7]. An important unresolved issue, however, is the phylogenetic distance across which to compare sequences. If the species are too closely related, there is a risk of loss of specificity, and it can be difficult to prioritize confirmatory

Library preparation

We prepared DNA from the liver of a 20-week-old male stripe-faced dunnart (obtained from The Melbourne Royal Zoological Gardens under the La Trobe University Animal Ethics Committee Permit No. RP96/4/V6) and extracted it in agarose plugs as previously described [50]. This method avoids degradation of the DNA and maintains a large fragment size. We sent the DNA to the Ressourcenzentrum für Genomforschung (RZPD; Germany) where it was used to generate a BAC library by cloning into the NotI sites of

Acknowledgements

Work in the authors’ laboratories is supported by the Wellcome Trust, the Leukaemia Research Fund, and the British Heart Foundation. We thank Katrin Welzel at RZPD for her liaison regarding library construction, Bernhard Herrmann for the in situ hybridization pictures in Fig. 4B, Kirsten McLay for carrying out the shotgun sequencing, Elizabeth Gibson for creating the transposon libraries used to close the most difficult gaps in the sequences, and Sequencing Teams 40 and 47 at the Wellcome Trust

References (53)

  • P. Maccarone. The evolution of human chromosome 21: evidence from in situ hybridization in marsupials and a monotreme. Genomics (1992)
  • E.O. Bockamp. Lineage-restricted regulation of the murine SCL/TAL-1 promoter. Blood (1995)
  • E.O. Bockamp. Distinct mechanisms direct SCL/tal-1 expression in erythroid cells and CD34 positive primitive myeloid cells. J. Biol. Chem. (1997)
  • E.O. Bockamp. Transcriptional regulation of the stem cell leukemia gene by PU.1 and Elf-1. J. Biol. Chem. (1998)
  • A.M. Sinclair. Distinct 5′ SCL enhancers direct transcription to developing brain, spinal cord, and endothelium: neural expression is mediated by GATA factor binding sites. Dev. Biol. (1999)
  • K. Osoegawa. An improved approach for construction of bacterial artificial chromosome libraries. Genomics (1998)
  • H. Roest Crollius. Estimate of human gene number provided by genome-wide analysis using Tetraodon nigroviridis DNA sequence. Nat. Genet. (2000)
  • G.M. Rubin. Comparative genomics of the eukaryotes. Science (2000)
  • E.H. Davidson. A genomic regulatory network for development. Science (2002)
  • W. Enard. Intra- and interspecific variation in primate gene expression patterns. Science (2002)
  • L.A. Pennacchio et al. Genomic strategies to identify mammalian regulatory sequences. Nat. Rev. Genet. (2001)
  • C.G. Begley. Chromosomal translocation in a human leukemic stem-cell line disrupts the T-cell antigen receptor delta-chain diversity region and results in a previously unreported fusion transcript. Proc. Natl. Acad. Sci. USA (1989)
  • O. Bernard. Two distinct mechanisms for the SCL gene activation in the t(1;14) translocation of T-cell leukemias. Genes Chromosomes Cancer (1990)
  • Q. Chen. The tal gene undergoes chromosome translocation in T cell leukemia and potentially encodes a helix–loop–helix protein. EMBO J. (1990)
  • M.L. Cleary et al. Chromosomal translocation involving the beta T cell receptor gene in acute leukemia. J. Exp. Med. (1988)
  • Y. Xia. TAL2, a helix–loop–helix gene activated by the (7;9) (q34;q32) translocation in human T-cell leukemia. Proc. Natl. Acad. Sci. USA (1991)

    Sequence data from this article have been deposited with the DDBJ/EMBL/GenBank Data Libraries under Accession No. AL731834.

    1 Current address: Department of Medicine and Therapeutics, University of Glasgow, Glasgow G11 6NT, UK.