Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis
Molecular mechanisms for maintenance of G-rich short tandem repeats capable of adopting G4 DNA structures
Introduction
Mammalian genomes contain various types of repetitive sequences, including microsatellites, triplet repeats, minisatellites, and telomeres [1]. LINE, SINE and LTR sequences are also frequently found in vertebrate genomes [2]. Although the size definitions for repeat families vary somewhat, microsatellite repeats are generally composed of short repetitive units of less than 6 nucleotides [1]. It is estimated that more than 100,000 microsatellite loci occur in the mammalian genome. Microsatellites are typically less than 100 base pairs (bp) in total length [3]. Although triplet repeats are also categorized as members of the microsatellite family, some triplet repeat loci can expand to hundreds or even thousands of repeats in certain human disorders (e.g., fragile X syndrome, myotonic dystrophy, Huntington disease) by mechanisms that remain unclear [4], [5]. In contrast, minisatellite repeats (also known as variable number of tandem repeats; VNTR) are composed of longer repetitive units of 5 or 6–100 nucleotides and are found in arrays that can expand up to 10–20 kbp. Unlike microsatellites, only a few thousand minisatellite loci are present in the genome [6]. Although the biological significance of minisatellite repeats remains largely unknown, some repeat regions are known to be hotspots for meiotic recombination [7], [8]. They are also occasionally found in fragile chromosomal sites and could serve as targets for genomic recombination or chromosomal breakage [9], [10].
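The size conventions above can be illustrated with a minimal Python sketch. The function name `candidate_families` is hypothetical, and the cut-offs simply restate the ranges quoted in the text (units of less than 6 nucleotides for microsatellites, 5 or 6–100 nucleotides for minisatellites). Because the definitions overlap at 5 nucleotides, the sketch returns every family that a given unit length could belong to rather than forcing a single answer.

```python
from typing import List


def candidate_families(unit_length: int) -> List[str]:
    """Return the repeat families whose unit-size range includes unit_length.

    The ranges restate the conventions quoted in the text and are not a
    formal standard: microsatellite units are shorter than 6 nt, and
    minisatellite units span 5 (or 6) to 100 nt.
    """
    if unit_length < 1:
        raise ValueError("unit_length must be a positive integer")
    families = []
    if unit_length < 6:
        families.append("microsatellite")
    if 5 <= unit_length <= 100:
        families.append("minisatellite")
    return families


if __name__ == "__main__":
    for n in (1, 3, 5, 33, 150):
        print(n, candidate_families(n))
```

A 5-nucleotide unit such as d(GGCAG) falls in the overlap, which is consistent with the text's remark that size definitions vary between repeat families.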
Genetic alteration at a few specific microsatellite and minisatellite repeats results in several human disorders. Alterations in microsatellite repeats are frequently found in cancer cells, and are caused by both genetic and functional alterations of genes encoding mismatch repair proteins, including MLH1, MSH2, MSH6, PMS1, PMS2 and MLH3 [11], [12]. Mutations in mismatch repair genes, as occur in hereditary non-polyposis colorectal cancers, lead to microsatellite instability (MSI). MSI can, in turn, cause frame-shift mutations in long tracts of mononucleotides; examples of this have been observed in the transforming growth factor-β type II, BAX and insulin-like growth factor II receptor genes [11], [12]. Under mismatch repair-deficient conditions, microsatellite repeats are altered by the insertion or deletion of small numbers of mono- or di-nucleotide repeat units [11], [12].
Alterations at certain minisatellite loci are also implicated in genetic predisposition to some human disorders, such as insulin-dependent diabetes mellitus type 2 (IDDM-2) [13]. A rare allele of the Ha-ras-VNTR, which is located in the 3′ region of the Ha-ras gene, appears to be associated with various cancers, including breast, colon and urinary bladder cancers and acute leukemia [14].
In this review article, characteristic structural features of G-rich short tandem repeats are summarized, and molecular mechanisms involved in maintaining genomic stability at these G-rich repetitive sequences are discussed.
Minisatellite Pc-1 and Pc-1-like repeats
The mouse Pc-1 minisatellite (also known as the expanded simple tandem repeat Ms6-hm) consists of a tandem array of G-rich repeats d(GGCAG)n, flanked by locus-specific sequences [15]. The length of the repeat array varies widely among mouse strains [16], [17]. Pc-1 was observed to be a recombination hotspot at meiosis, with a germ-line mutation rate as high as 10% per gamete [18], [19]. In normal somatic cells, however, the repeats are relatively stable and the mutation rate has been estimated
Size alteration of G-rich minisatellite repeats in the genome
Minisatellite repeats are generally stable in somatic cells compared to germ cells as described above [24], [25], [26], but alterations at minisatellite regions can be induced both in cell cultures and in vivo [21], [27], [28], [29], [30], [31]. When cultured cells are exposed to a variety of chemical carcinogens, ultraviolet irradiation or ionizing radiation, DNA fingerprint analysis reveals alterations of the banding pattern of genomic regions containing tandem repeats. We hereafter refer to
Formation of G4′ structure by d(GGCAG)n in vitro
We used structural analyses to investigate whether higher order structures occur in G-rich repetitive sequences to gain further insight into the molecular mechanisms operating in induced instability at these sites. The formation of secondary structures is likely due to the G-rich nature of the repeats. For example, a triple-stranded DNA between d(GGA:TCC) repeats and d(GGA) repeat oligonucleotides forms a D-loop-like higher order structure [32]. The d(CTG:CAG) repeats from the myotonic
DNA synthesis arrest at the d(GGG) sites in vitro
Several studies have demonstrated the ability of d(GGG) sites to cause DNA synthesis arrest. For example, the in vitro DNA synthesis assay showed DNA synthesis arrest at the first d(GGG) site of a single-stranded phagemid (pYA-3) carrying 12 repeats of d(GGCAG), with additional weaker stops at the second, third and fourth d(GGG) sequences (Fig. 2). A primer extension reaction using a synthetic oligonucleotide containing d(GGCAG)15 gave similar results (further described below). Inhibition of in
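As an illustration of where the d(GGG) sites mentioned above lie within such a repeat array, the following minimal Python sketch lists their positions in the written 5′→3′ sequence. The function name `find_ggg_sites` is hypothetical and the sketch is purely a string search; it does not model the polymerase, and the order in which sites are encountered during synthesis depends on primer orientation, which is not represented here.

```python
import re
from typing import List


def find_ggg_sites(sequence: str) -> List[int]:
    """Return 0-based start positions of non-overlapping GGG runs.

    In d(GGCAG)n the GGG run spans each repeat junction: the final G of
    one unit followed by the leading GG of the next.
    """
    return [m.start() for m in re.finditer("GGG", sequence.upper())]


if __name__ == "__main__":
    template = "GGCAG" * 12          # d(GGCAG)12, written 5'->3'
    sites = find_ggg_sites(template)
    print(len(sites), sites[:4])     # number of sites and the first four
```

Because each d(GGG) run in d(GGCAG)n is formed across a junction between adjacent units, an array of 12 units contains 11 such sites, whereas d(CAGGG)15 contains one per unit.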
Isolation of G-rich minisatellite binding proteins
To elucidate the molecular mechanisms that maintain genomic stability at G-rich minisatellite sequences, we carried out an electrophoretic mobility shift assay (EMSA) to identify minisatellite binding proteins (MNBPs) using cell-free extracts from NIH3T3 cells treated with okadaic acid (Table 1). Okadaic acid (OA) is a known tumor promoter [46] and a specific inhibitor of the mammalian serine/threonine protein phosphatases [47]. OA is capable of inducing minisatellite
hnRNP A1 and UP1 bind to G-rich repetitive sequences and unfold G4′ structures
To clarify the consequence of hnRNP A1 binding to G-rich repetitive sequences, we investigated its effect on the stability of the G4′ structure. Recombinant hnRNP A1 and unwinding protein 1 (UP1), a proteolytic product of hnRNP A1 lacking the C-terminal portion of hnRNP A1 [49], [50], [51], were expressed in E. coli as GST fusion proteins and purified, with the GST tag released, for use in G4′ binding assays. The DNA binding affinity and sequence specificity of UP1 are almost equivalent to those of
Abrogation of DNA synthesis arrest at the G4′ structure by UP1
When template carrying d(GGCAG)12 was used for an in vitro DNA synthesis assay, a primer extension reaction with BcaBEST DNA polymerase was obstructed mainly at the first d(GGG) site (Fig. 4A). A primer extension reaction with a synthetic oligonucleotide containing a d(CAGGG)15 repeat also demonstrated that progression of DNA polymerase was obstructed mainly at the first d(GGG) site followed by additional weaker stops at the second, third, fourth, fifth and sixth d(GGG) sites (Fig. 4B). When
Destruction of the telomere G4′ structure by binding of UP1
UP1 also binds to G5+TEL and TRM4 [d(TTAGGG)4], both of which contain four telomeric repeats [52], [60]. Under physiological-like conditions, d(TTAGGG)4 also gave a positive CD band at 290–295 nm, similar to d(GGCAG) repeats, indicating the formation of a G4′ structure [52]. In contrast, four-stranded parallel quadruplex DNA was not observed except at a much higher DNA concentration (data not shown). In vitro DNA synthesis using synthetic oligonucleotides and several DNA polymerases, including
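A quick sequence-level check of whether an oligonucleotide such as d(TTAGGG)4 carries enough G-tracts to fold intramolecularly can be sketched in Python. The pattern used here, four runs of at least three guanines separated by loops of 1 to 7 nucleotides, is a widely used heuristic for putative quadruplex motifs and is an assumption of this sketch, not a rule taken from the text. The name `has_g4_motif` is hypothetical. A match only indicates sequence potential; actual folding has to be assessed experimentally, for example by CD as described above.

```python
import re

# Heuristic motif: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+ (an assumption of this sketch)
G4_MOTIF = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")


def has_g4_motif(sequence: str) -> bool:
    """Return True if the sequence contains four G-tracts with short loops."""
    return G4_MOTIF.search(sequence.upper()) is not None


if __name__ == "__main__":
    print(has_g4_motif("TTAGGG" * 4))   # four telomeric repeats
    print(has_g4_motif("TTAGGG" * 3))   # only three G-tracts
    print(has_g4_motif("GGCAG" * 12))   # Pc-1-type repeat array
```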
UP1 unfolds the higher order DNA structures of d(CGG) triplet repeats
Another important functional aspect of UP1 is its role in unfolding the higher order DNA structures of d(CGG) triplet repeats [64]. The d(CGG)n tract forms hairpin, quadruplex, and homoduplex structures in vitro under physiological-like conditions [64], [65], [66]. In our study, the CD band analysis of d(CGG)16 in either the presence or absence of 150 mM KCl showed a large negative peak at 255 nm and a weak positive peak at 280 nm, suggesting the formation of a non-B-type higher order DNA
Expression of hnRNP A1 and hnRNP A3 in sporadic human colorectal cancers
Both hnRNP A1 and A3 are over-expressed in human colorectal cancers [71]. In the case of hnRNP A1, quantitative gene expression analysis revealed that 60% (18/30) of sporadic human colorectal cancers showed over-expression of hnRNP A1 in cancer tissues by at least two-fold compared to their normal counterparts. Interestingly, 78% of cases at clinicopathological stage II showed increased expression of two-fold or greater; this is two-fold higher than that seen in the more advanced stage IV [71].
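The arithmetic behind these figures can be made explicit with a short Python sketch. The helper names `percent` and `is_overexpressed` are hypothetical, and the expression values passed to `is_overexpressed` are made up purely to show the two-fold criterion; only the counts 18/30 and the 78% figure come from the text. Reading "two-fold higher" literally would put the stage IV frequency at roughly 39%, which is an inference from the sentence rather than a reported value.

```python
def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator expressed as a percentage."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


def is_overexpressed(tumor: float, normal: float, threshold: float = 2.0) -> bool:
    """Apply a fold-change cut-off (tumor / normal >= threshold)."""
    if normal <= 0:
        raise ValueError("normal expression must be positive")
    return tumor / normal >= threshold


if __name__ == "__main__":
    print(percent(18, 30))             # 18 of 30 sporadic cancers
    print(78 / 2.0)                    # stage IV frequency implied by "two-fold higher"
    print(is_overexpressed(5.0, 2.0))  # hypothetical expression values
```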
C-rich strands of minisatellite binding proteins
C-rich binding proteins, LRP130 and Tudor-SN/SND1, were also isolated from OA-treated NIH3T3 cells using oligonucleotide d(CTGCC)8 as a probe, as described earlier [48]. It has been demonstrated that LRP130 and Tudor-SN/SND1 have some sequence specificity for DNA binding [72], [73]. Interestingly, both LRP130 and Tudor-SN/SND1 bind to C-rich RNA sequences in a sequence specific manner, and are mainly localized at perinuclear regions and in the cytoplasm [74]. This may suggest the involvement of
Implication of G4 DNA structures in promoter regions in gene transcription
Although the biological roles of G-rich repeat sequences capable of adopting G4 DNA structures are still largely elusive, one of the intriguing fields of G4 DNA structure research is in its possible involvement in regulation of gene transcription. The G4′-quadruplex structure of the promoter region of the c-myc oncogene has been found to function as a transcriptional repressor [75] and the expression of c-myc can be inhibited by ligand-mediated G4-quadruplex stabilization [76]. In the case of
Future perspectives
G-rich repetitive sequences are frequently observed in the genome and are a subset of triplet repeats and of minisatellite DNA. Because of the peculiar and specific DNA structures adopted by G-rich repetitive sequences, these genomic regions might induce DNA replication fork arrest, leading to size alterations of the repeats in vivo. Various cellular components, such as WRN, TBPs, UP1 and hnRNP A1, may work as guardians against G-rich repeat length instability. In addition, C-rich sequence
Acknowledgements
The authors thank Drs. Minako Nagao and Takashi Sugimura for their helpful comments and continuing support for the research. This work was supported by a Grant-in-Aid for Cancer Research from the Ministry of Health, Labour and Welfare of Japan, and Grants-in-Aid for Scientific Research (for M.K. and for H.F.) and the Protein 3000 Project (for M.K.) from the Ministry of Education, Culture, Sports, Science and Technology of Japan. K.H., E.T. and K.N. are the recipients of a Research Resident
References (79)
- et al. Hypervariable minisatellite DNA is a hotspot for homologous recombination in human cells. Cell (1990)
- et al. Fragile sites and minisatellite repeat instability. Mol. Genet. Metab. (2000)
- et al. Human chromosomal fragile site FRA16B is an amplified AT-rich minisatellite repeat. Cell (1997)
- et al. Characterization of a highly unstable mouse minisatellite locus: evidence for somatic mutation during early development. Genomics (1989)
- et al. A GGCAGG motif in minisatellites affecting their germline instability. J. Biol. Chem. (1990)
- et al. The mouse Ms6-hm hypervariable microsatellite forms a hairpin and two unusual tetraplexes. J. Biol. Chem. (1998)
- et al. Effect of SCID mutation on the occurrence of mouse Pc-1 (Ms6-hm) germline mutations. Mutat. Res. (2002)
- Long-term genetic effects of radiation exposure. Mutat. Res. (2003)
- Advances in the application of germline tandem repeat instability for in situ monitoring. Mutat. Res. (2004)
- et al. Structure–function correlations of the insulin-linked polymorphic region. J. Mol. Biol. (1996)
- A novel K(+)-dependent DNA synthesis arrest site in a commonly occurring sequence motif in eukaryotes. J. Biol. Chem.
- Intramolecular quadruplex formation of the G-rich strand of the mouse hypervariable minisatellite Pc-1. Biochem. Biophys. Res. Commun.
- The chicken beta-globin gene promoter forms a novel “cinched” tetrahelical structure. J. Biol. Chem.
- Smooth muscle myosin phosphatase inhibition and force enhancement by black sponge toxin. FEBS Lett.
- Detection and isolation of minisatellite Pc-1 binding proteins. Biochim. Biophys. Acta
- Purification and physical characterization of nucleic acid helix-unwinding proteins from calf thymus. J. Biol. Chem.
- High pressure liquid chromatography purification of UP1 and UP2, two related single-stranded nucleic acid-binding proteins from calf thymus. J. Biol. Chem.
- Structure of hnRNP D complexed with single-stranded telomere DNA and unfolding of the quadruplex by heterogeneous nuclear ribonucleoprotein D. J. Biol. Chem.
- G4 DNA binding by LR1 and its subunits, nucleolin and hnRNP D, A role for G–G pairing in immunoglobulin switch recombination. J. Biol. Chem.
- Sequence-specific binding protein of single-stranded and unimolecular quadruplex telomeric DNA from rat hepatocytes. J. Biol. Chem.
- Purification and characterization of qTBP42, a new single-stranded and quadruplex telomeric DNA-binding protein from rat hepatocytes. J. Biol. Chem.
- Tetrahelical forms of the fragile X syndrome expanded sequence d(CGG)(n) are destabilized by two heterogeneous nuclear ribonucleoprotein-related telomeric DNA-binding proteins. J. Biol. Chem.
- The Bloom's syndrome helicase unwinds G4 DNA. J. Biol. Chem.
- Human Werner syndrome DNA helicase unwinds tetrahelical structures of the fragile X syndrome repeat sequence d(CGG)n. J. Biol. Chem.
- FMR1 and the fragile X syndrome: human genome epidemiology review. Genet. Med.
- Interactions between the Werner syndrome helicase and DNA polymerase delta specifically facilitate copying of tetraplex and hairpin structures of the d(CGG)n trinucleotide repeat sequence. J. Biol. Chem.
- LRP130, a single-stranded DNA/RNA-binding protein, localizes at the outer nuclear and endoplasmic reticulum membrane, and interacts with mRNA in vivo. Biochem. Biophys. Res. Commun.
- Stabilization of the c-myc gene promoter quadruplex by specific ligands’ inhibitors of telomerase. Biochem. Biophys. Res. Commun.
- The sequence of the human genome. Science
- LINEs, SINEs and processed pseudogenes: parasitic strategies for genome modeling. Cytogenet. Genome Res.
- Microsatellite instability in human cancer: a prognostic marker for chemotherapy. Exp. Cell Res.
- Fourteen and counting: unraveling trinucleotide repeat diseases. Hum. Mol. Genet.
- Therapeutics development for triplet repeat expansion diseases. Nat. Rev. Genet.
- Mutation processes at human minisatellites. Electrophoresis
- Meiotic recombination hot spots and human DNA diversity. Philos. Trans. R. Soc. London B: Biol. Sci.
- Genetic predisposition to colorectal cancer. Nat. Rev. Cancer
- Deficient DNA mismatch repair: a common etiologic factor for colon cancer. Hum. Mol. Genet.
- Minisatellites and human disease. Science
- An association between the risk of cancer and mutations in the HRAS1 minisatellite locus. N. Engl. J. Med.