Molecular mechanisms for maintenance of G-rich short tandem repeats capable of adopting G4 DNA structures

https://doi.org/10.1016/j.mrfmmm.2006.01.014

Abstract

Mammalian genomes contain several types of repetitive sequences. Some of these sequences are implicated in various specific cellular events, including meiotic recombination, chromosomal breaks and transcriptional regulation, and also in several human disorders. In this review, we document the formation of DNA secondary structures by the G-rich repetitive sequences that have been found in several minisatellites, telomeres and in various triplet repeats, and report their effects on in vitro DNA synthesis. d(GGCAG) repeats in the mouse minisatellite Pc-1 were demonstrated to form an intra-molecular folded-back quadruplex structure (also called a G4′ structure) by NMR and CD spectrum analyses. d(TTAGGG) telomere repeats and d(CGG) triplet repeats were also shown to form G4′ and other unspecified higher order structures, respectively. In vitro DNA synthesis was substantially arrested within the repeats, and this could be responsible for the preferential mutability of the G-rich repetitive sequences. Electrophoretic mobility shift assays using NIH3T3 cell extracts revealed heterogeneous nuclear ribonucleoprotein (hnRNP) A1 and A3, which were tightly and specifically bound to d(GGCAG) and d(TTAGGG) repeats with Kd values in the order of nM. HnRNP A1 unfolded the G4′ structure formed in the d(GGCAG)n and d(TTAGGG)n repeat regions, and also resolved the higher order structure formed by d(CGG) triplet repeats. Furthermore, DNA synthesis arrest at the secondary structures of d(GGCAG) repeats, telomeres and d(CGG) triplet repeats was efficiently repressed by the addition of hnRNP A1. High expression of hnRNPs may contribute to the maintenance of G-rich repetitive sequences, including telomere repeats, and may also participate in ensuring the stability of the genome in cells with enhanced proliferation. 
Transcriptional regulation of genes, such as c-myc and insulin, by G4 sequences found in the promoter regions could be an intriguing field of research and help further elucidate the biological functions of the hnRNP family of proteins in human diseases.

Introduction

Mammalian genomes are known to contain various types of repetitive sequences, including microsatellites, triplet repeats, minisatellites, and telomeres [1]. LINE, SINE and LTR sequences are also frequently found in the genomes of vertebrates [2]. Although the size definitions for repeat families vary somewhat, microsatellite repeats are generally composed of short repetitive units of fewer than 6 nucleotides [1]. It is estimated that more than 100,000 microsatellite loci occur in the mammalian genome. Microsatellites are typically less than 100 base pairs (bp) in total length [3]. Although triplet repeats are also categorized as members of the microsatellite family, some triplet repeat loci can expand to hundreds or even thousands of repeats in certain human disorders (e.g., fragile X syndrome, myotonic dystrophy, Huntington disease) by mechanisms that remain unclear [4], [5]. In contrast, minisatellite repeats (also known as variable number of tandem repeats; VNTR) are composed of longer repetitive units of 5 or 6–100 nucleotides and are found in arrays expanded up to 10–20 kbp. As opposed to microsatellites, only a few thousand such loci are present in the genome [6]. Although the biological significance of minisatellite repeats remains largely unknown, some repeat regions are known to be hotspots for meiotic recombination [7], [8]. They are also occasionally found in fragile chromosomal sites and could serve as targets for genomic recombination or chromosomal breakage [9], [10].

Genetic alteration at a few specific microsatellite and minisatellite repeats results in several human disorders. Alterations in microsatellite repeats are frequently found in cancer cells, and are caused by both genetic and functional alterations of genes encoding mismatch repair proteins, including MLH1, MSH2, MSH6, PMS1, PMS2 and MLH3 [11], [12]. Mutations in mismatch repair genes, as occur in hereditary non-polyposis colorectal cancers, lead to microsatellite instability (MSI). MSI can, in turn, cause frame-shift mutations in long tracts of mononucleotides; examples of this have been observed in the transforming growth factor-β type II, BAX and insulin-like growth factor II receptor genes [11], [12]. Under mismatch repair-deficient conditions, microsatellite repeats are altered by the insertion or deletion of small numbers of mono- or di-nucleotide repeat units [11], [12].

Alterations at certain minisatellite loci are also implicated in genetic predisposition to some human disorders, such as insulin-dependent diabetes mellitus type 2 (IDDM-2) [13]. A rare allele of the Ha-ras-VNTR, which is located in the 3′ region of the Ha-ras gene, appears to be associated with various cancers, including breast, colon and urinary bladder cancers and acute leukemia [14].

In this review article, characteristic structural features of G-rich short tandem repeats are summarized, and molecular mechanisms involved in maintaining genomic stability at these G-rich repetitive sequences are discussed.
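As a small illustration of the repeat units discussed in this review, the following Python sketch builds tandem arrays of d(GGCAG), d(TTAGGG) and d(CGG) and lists the positions of guanine runs of three or more, i.e. candidate d(GGG) sites. The function name `g_runs` and the three-guanine threshold are assumptions made for illustration only; this is a plain string search, not a structure prediction, and the absence of d(GGG) runs in d(CGG) arrays says nothing about the higher order structures reported for those repeats.

```python
import re


def g_runs(unit: str, copies: int, min_len: int = 3):
    """Return (start, length) for each run of >= min_len G's in a tandem array.

    `unit` is the repeat unit (e.g. "GGCAG") and `copies` the number of
    tandem copies. Positions are 0-based offsets into the array.
    The min_len=3 threshold is an illustrative assumption.
    """
    seq = unit * copies
    pattern = r"G{%d,}" % min_len
    return [(m.start(), len(m.group())) for m in re.finditer(pattern, seq)]


if __name__ == "__main__":
    # Repeat units and copy numbers taken from oligonucleotides named in the text.
    for unit, copies in [("GGCAG", 12), ("TTAGGG", 4), ("CGG", 16)]:
        runs = g_runs(unit, copies)
        print(unit, copies, len(runs))
    # GGCAG 12 11   (GGG arises at each junction between adjacent units)
    # TTAGGG 4 4    (one GGG per unit)
    # CGG 16 0      (only GG pairs, no run of three)
```

In a bare d(GGCAG)12 array, the d(GGG) runs arise only at the junctions between adjacent units, which is why the count is one fewer than the copy number; flanking sequence in a real template could add or remove sites.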

Section snippets

Minisatellite Pc-1 and Pc-1-like repeats

The mouse Pc-1 minisatellite (also known as the expanded simple tandem repeat Ms6-hm) consists of a tandem array of G-rich repeats d(GGCAG)n, flanked by locus-specific sequence [15]. The length of the repeat arrays varies widely among mouse strains [16], [17]. Pc-1 was observed to be a recombination hotspot at meiosis, with a germ-line mutation rate as high as 10% per gamete [18], [19]. In normal somatic cells, however, the repeats are relatively stable and the mutation rate has been estimated

Size alteration of G-rich minisatellite repeats in the genome

Minisatellite repeats are generally more stable in somatic cells than in germ cells, as described above [24], [25], [26], but alterations at minisatellite regions can be induced both in cell cultures and in vivo [21], [27], [28], [29], [30], [31]. When cultured cells are exposed to a variety of chemical carcinogens, ultraviolet irradiation or ionizing radiation, DNA fingerprint analysis reveals alterations of the banding pattern of genomic regions containing tandem repeats. We hereafter refer to

Formation of G4′ structure by d(GGCAG)n in vitro

To gain further insight into the molecular mechanisms underlying induced instability at these sites, we used structural analyses to investigate whether higher order structures form in G-rich repetitive sequences. The formation of secondary structures is likely due to the G-rich nature of the repeats. For example, a triple-stranded DNA between d(GGA:TCC) repeats and d(GGA) repeat oligonucleotides forms a D-loop-like higher order structure [32]. The d(CTG:CAG) repeats from the myotonic

DNA synthesis arrest at the d(GGG) sites in vitro

Several studies have demonstrated the ability of d(GGG) sites to cause DNA synthesis arrest. For example, an in vitro DNA synthesis assay showed DNA synthesis arrest at the first d(GGG) site of a single-stranded phagemid (pYA-3) carrying 12 repeats of d(GGCAG), with additional weaker stops at the second, third and fourth d(GGG) sequences (Fig. 2). A primer extension reaction using a synthetic oligonucleotide containing d(GGCAG)15 gave similar results (further described below). Inhibition of in

Isolation of G-rich minisatellite binding proteins

To elucidate the molecular mechanisms underlying the maintenance of genomic stability at G-rich minisatellite sequences, we carried out an electrophoretic mobility shift assay (EMSA) to identify minisatellite binding proteins (MNBPs) using cell-free extracts from NIH3T3 cells treated with okadaic acid (Table 1). Okadaic acid (OA) is a known tumor promoter [46] and a specific inhibitor of the mammalian serine/threonine protein phosphatases [47]. OA is capable of inducing minisatellite

hnRNP A1 and UP1 bind to G-rich repetitive sequences and unfold G4′ structures

To clarify the consequence of hnRNP A1 binding to G-rich repetitive sequences, we investigated its effect on the stability of the G4′ structure. Recombinant hnRNP A1 and unwinding protein 1 (UP1), a proteolytic product of hnRNP A1 lacking its C-terminal portion [49], [50], [51], were expressed in E. coli as GST fusion proteins and purified, with the GST tag released, for use in G4′ binding assays. The DNA binding affinity and sequence specificity of UP1 are almost equivalent to those of

Abrogation of DNA synthesis arrest at the G4′ structure by UP1

When a template carrying d(GGCAG)12 was used for an in vitro DNA synthesis assay, a primer extension reaction with BcaBEST DNA polymerase was obstructed mainly at the first d(GGG) site (Fig. 4A). A primer extension reaction with a synthetic oligonucleotide containing a d(CAGGG)15 repeat also demonstrated that progression of DNA polymerase was obstructed mainly at the first d(GGG) site, followed by additional weaker stops at the second, third, fourth, fifth and sixth d(GGG) sites (Fig. 4B). When

Destruction of the telomere G4′ structure by binding of UP1

UP1 also binds to G5+TEL and TRM4 [d(TTAGGG)4], both of which contain four telomeric repeats [52], [60]. Under physiological-like conditions, d(TTAGGG)4 also gave a positive CD band at 290–295 nm, similar to d(GGCAG) repeats, indicating the formation of a G4′ structure [52]. In contrast, four-stranded parallel quadruplex DNA was not observed except at a much higher DNA concentration (data not shown). In vitro DNA synthesis using synthetic oligonucleotides and several DNA polymerases, including

UP1 unfolds the higher order DNA structures of d(CGG) triplet repeats

Another important functional aspect of UP1 is its role in unfolding the higher order DNA structures of d(CGG) triplet repeats [64]. The d(CGG)n tract forms hairpin, quadruplex, and homoduplex structures in vitro under physiological-like conditions [64], [65], [66]. In our study, the CD band analysis of d(CGG)16 in either the presence or absence of 150 mM KCl showed a large negative peak at 255 nm and a weak positive peak at 280 nm, suggesting the formation of a non-B-type higher order DNA

Expression of hnRNP A1 and hnRNP A3 in sporadic human colorectal cancers

Both hnRNP A1 and A3 are over-expressed in human colorectal cancers [71]. In the case of hnRNP A1, quantitative gene expression analysis revealed that 60% (18/30) of sporadic human colorectal cancers showed over-expression of hnRNP A1 in cancer tissues by at least two-fold compared to their normal counterparts. Interestingly, 78% of cases at clinicopathological stage II showed increased expression of two-fold or greater; this is two-fold higher than that seen in the more advanced stage IV [71].

C-rich strands of minisatellite binding proteins

C-rich binding proteins, LRP130 and Tudor-SN/SND1, were also isolated from OA-treated NIH3T3 cells using oligonucleotide d(CTGCC)8 as a probe, as described earlier [48]. It has been demonstrated that LRP130 and Tudor-SN/SND1 have some sequence specificity for DNA binding [72], [73]. Interestingly, both LRP130 and Tudor-SN/SND1 bind to C-rich RNA sequences in a sequence specific manner, and are mainly localized at perinuclear regions and in the cytoplasm [74]. This may suggest the involvement of

Implication of G4 DNA structures in promoter regions in gene transcription

Although the biological roles of G-rich repeat sequences capable of adopting G4 DNA structures are still largely elusive, one intriguing area of G4 DNA structure research is their possible involvement in the regulation of gene transcription. The G4′-quadruplex structure of the promoter region of the c-myc oncogene has been found to function as a transcriptional repressor [75] and the expression of c-myc can be inhibited by ligand-mediated G4-quadruplex stabilization [76]. In the case of

Future perspectives

G-rich repetitive sequences are frequently observed in the genome and are a subset of triplet repeats and of minisatellite DNA. Because of the peculiar and specific DNA structures adopted by G-rich repetitive sequences, these genomic regions might induce DNA replication fork arrest, leading to size alterations of the repeats in vivo. Various cellular components, such as WRN, TBPs, UP1 and hnRNP A1, may work as guardians against G-rich repeat length instability. In addition, C-rich sequence

Acknowledgements

The authors thank Drs. Minako Nagao and Takashi Sugimura for their helpful comments and continuing support for the research. This work was supported by a Grant-in-Aid for Cancer Research from the Ministry of Health, Labour and Welfare of Japan, and Grants-in-Aid for Scientific Research (for M.K. and for H.F.) and the Protein 3000 Project (for M.K.) from the Ministry of Education, Culture, Sports, Science and Technology of Japan. K.H., E.T. and K.N. are the recipients of a Research Resident

References (79)

  • K.J. Woodford et al.

    A novel K(+)-dependent DNA synthesis arrest site in a commonly occurring sequence motif in eukaryotes

    J. Biol. Chem.

    (1994)
  • M. Katahira et al.

    Intramolecular quadruplex formation of the G-rich strand of the mouse hypervariable minisatellite Pc-1

    Biochem. Biophys. Res. Commun.

    (1999)
  • R.M. Howell et al.

    The chicken beta-globin gene promoter forms a novel “cinched” tetrahelical structure

    J. Biol. Chem.

    (1996)
  • A. Takai et al.

    Smooth muscle myosin phosphatase inhibition and force enhancement by black sponge toxin

    FEBS Lett.

    (1987)
  • H. Fukuda et al.

    Detection and isolation of minisatellite Pc-1 binding proteins

    Biochim. Biophys. Acta

    (2001)
  • G. Herrick et al.

    Purification and physical characterization of nucleic acid helix-unwinding proteins from calf thymus

    J. Biol. Chem.

    (1976)
  • B.M. Merrill et al.

    High pressure liquid chromatography purification of UP1 and UP2, two related single-stranded nucleic acid-binding proteins from calf thymus

    J. Biol. Chem.

    (1986)
  • Y. Enokizono et al.

    Structure of hnRNP D complexed with single-stranded telomere DNA and unfolding of the quadruplex by heterogeneous nuclear ribonucleoprotein D

    J. Biol. Chem.

    (2005)
  • L.A. Dempsey et al.

    G4 DNA binding by LR1 and its subunits, nucleolin and hnRNP D, A role for G–G pairing in immunoglobulin switch recombination

    J. Biol. Chem.

    (1999)
  • R. Erlitzki et al.

    Sequence-specific binding protein of single-stranded and unimolecular quadruplex telomeric DNA from rat hepatocytes

    J. Biol. Chem.

    (1997)
  • G. Sarig et al.

    Purification and characterization of qTBP42, a new single-stranded and quadruplex telomeric DNA-binding protein from rat hepatocytes

    J. Biol. Chem.

    (1997)
  • P. Weisman-Shomer et al.

    Tetrahelical forms of the fragile X syndrome expanded sequence d(CGG)(n) are destabilized by two heterogeneous nuclear ribonucleoprotein-related telomeric DNA-binding proteins

    J. Biol. Chem.

    (2000)
  • H. Sun et al.

    The Bloom's syndrome helicase unwinds G4 DNA

    J. Biol. Chem.

    (1998)
  • M. Fry et al.

    Human Werner syndrome DNA helicase unwinds tetrahelical structures of the fragile X syndrome repeat sequence d(CGG)n

    J. Biol. Chem.

    (1999)
  • D.C. Crawford et al.

    FMR1 and the fragile X syndrome: human genome epidemiology review

    Genet. Med.

    (2001)
  • A.S. Kamath-Loeb et al.

    Interactions between the Werner syndrome helicase and DNA polymerase delta specifically facilitate copying of tetraplex and hairpin structures of the d(CGG)n trinucleotide repeat sequence

    J. Biol. Chem.

    (2001)
  • N. Tsuchiya et al.

    LRP130, a single-stranded DNA/RNA-binding protein, localizes at the outer nuclear and endoplasmic reticulum membrane, and interacts with mRNA in vivo

    Biochem. Biophys. Res. Commun.

    (2004)
  • T. Lemarteleur et al.

    Stabilization of the c-myc gene promoter quadruplex by specific ligands’ inhibitors of telomerase

    Biochem. Biophys. Res. Commun.

    (2004)
  • J.C. Venter et al.

    The sequence of the human genome

    Science

    (2001)
  • M. Dewannieux et al.

    LINEs, SINEs and processed pseudogenes: parasitic strategies for genome modeling

    Cytogenet. Genome Res.

    (2005)
  • H. te Riele et al.

    Microsatellite instability in human cancer: a prognostic marker for chemotherapy

    Exp. Cell Res.

    (1999)
  • C.J. Cummings et al.

    Fourteen and counting: unraveling trinucleotide repeat diseases

    Hum. Mol. Genet.

    (2000)
  • N.A. Di Prospero et al.

    Therapeutics development for triplet repeat expansion diseases

    Nat. Rev. Genet.

    (2005)
  • A.J. Jeffreys et al.

    Mutation processes at human minisatellites

    Electrophoresis

    (1995)
  • A.J. Jeffreys et al.

    Meiotic recombination hot spots and human DNA diversity

    Philos. Trans. R. Soc. London B: Biol. Sci.

    (2004)
  • A. de la Chapelle

    Genetic predisposition to colorectal cancer

    Nat. Rev. Cancer

    (2004)
  • P. Peltomaki

    Deficient DNA mismatch repair: a common etiologic factor for colon cancer

    Hum. Mol. Genet.

    (2001)
  • T.G. Krontiris

    Minisatellites and human disease

    Science

    (1995)
  • T.G. Krontiris et al.

    An association between the risk of cancer and mutations in the HRAS1 minisatellite locus

    N. Engl. J. Med.

    (1993)
Cited by (21)

    • TMPyP4 porphyrin distorts RNA G-quadruplex structures of the disease-associated r(GGGGCC)n repeat of the C9orf72 gene and blocks interaction of RNA-binding proteins

      2014, Journal of Biological Chemistry
      Citation Excerpt :

      To further test the potential of TMPyP4 to interfere with protein interaction, we assessed its effect upon the binding of hnRNPA1, an r(GGGGCC)n-binding protein that may have implications for ALS-FTD (6, 10). Previously, it was shown that hnRNPA1, or its proteolytic fragment UP1, could bind to various G-rich repeats, including telomeres and the fragile X-associated (CGG)n repeat (44–46). To test the potential interaction between hnRNPA1 and r(GGGGCC)8, a band shift assay was performed in the presence of increasing concentrations of purified hnRNPA1 (Fig. 2B).

    • The evolving world of protein-G-quadruplex recognition: A medicinal chemist's perspective

      2011, Biochimie
      Citation Excerpt :

      Subsequent electrophoretic mobility shift assays revealed tight binding (nanomolar Kd) of the Pc-1 oligonucleotides to hnRNP A1 and A3. Addition of nucleoproteins abolished DNA synthesis arrest, confirming their ability to unfold G4 architectures and contribute to preserving genome stability [110,111]. The role of UP1, at least in telomerase regulation, appears to be quite complex.

    • HnRNP A3 binds to and protects mammalian telomeric repeats in vitro

      2007, Biochemical and Biophysical Research Communications
      Citation Excerpt :

      Loss of hnRNP A1 in mouse cells correlated with short telomeres and expression of hnRNP A1 or UP1 restored the normal length of telomeric repeats [4]. Taking these data together, hnRNP A1 is thought to be involved in telomere maintenance by stimulating telomerase activity through its ability to modulate a tertial structure of the telomere end [12,13,27]. HnRNP A2/B1, C1/C2, D, and E were demonstrated to associate with telomeres or/and telomerase [5–8] and suggested to play some roles in telomere metabolism.

    • Special issue on induced tandem repeat instability

      2006, Mutation Research - Fundamental and Molecular Mechanisms of Mutagenesis