Microsatellite is an important component of complete Hepatitis C virus genomes

https://doi.org/10.1016/j.meegid.2011.06.012

Abstract

Microsatellites are common and play diverse roles in eukaryotic and prokaryotic genomes. However, to our knowledge, microsatellite distribution in viruses remains largely unexplored, although it is crucial for understanding the instability of viral genomes. We therefore examined microsatellite distribution in 54 complete genomes of Hepatitis C virus (HCV) from six genotypes and found that microsatellites are an important component of HCV genomes. In all analyzed HCV genomes, genome size and GC content had only a weak influence on the number, relative abundance and relative density of microsatellites. In each HCV genome, mono-, di- and trinucleotide repeats predominated, whereas other repeat types rarely occurred. Microsatellites occurred significantly less often than in prokaryotes and eukaryotes, and all identified microsatellites were very short. The discovery of microsatellites in HCV genomes may prove useful for population genetics, evolutionary analysis and strain (isolate) identification.

Highlights

► We reveal the distribution rules of mononucleotide repeats in HCV genomes.
► We explore the correlation between HCV genome features and microsatellite distribution.
► Microsatellites are an important component of complete HCV genomes.
► Different HCV genomes exhibit a similar distribution pattern of microsatellites.

Introduction

Microsatellites, or simple sequence repeats (SSRs), consist of mono-, di-, tri-, tetra-, penta- and hexanucleotide repeats (Chen et al., 2011a, Chen et al., 2010) and are highly polymorphic in eukaryotic (Ellegren, 2004, Li et al., 2004, Rajendrakumar et al., 2007) and prokaryotic genomes (Gur-Arie et al., 2000). The abundance of these six types of microsatellites varies between species (Karaoglu et al., 2005). Microsatellites are found in diverse genomic regions, including 3′-UTRs, 5′-UTRs, exons and introns (Li et al., 2004, Rajendrakumar et al., 2007, Toth et al., 2000). In eukaryotic coding regions, triplet repeats are more common than non-triplet repeats, because length changes in non-triplet repeats may cause frameshift mutations (Ellegren, 2004, Li et al., 2004). The most common microsatellite motifs also differ between species. For example, in Aspergillus nidulans the most common microsatellite motifs are AT/TA repeats, whereas AG/GA repeats are most abundant in Fusarium graminearum (Karaoglu et al., 2005). Genome size and GC content have been shown to influence the occurrence of microsatellites in several species (Coenye and Vandamme, 2005, Dieringer and Schlotterer, 2003). Strand slippage and unequal recombination have been proposed to explain microsatellite instability (Toth et al., 2000). Intrinsic features of microsatellites (repeat number, length and motif size) have the strongest influence on microsatellite mutability, whereas regional genomic factors have only minor effects (Kelkar et al., 2008). The mutability of microsatellites increases with the number of repeats, most likely because the probability of slippage increases (Pearson et al., 2005). Imperfections in microsatellites are thought to influence replication slippage by limiting the expansion of microsatellite size (Mudunuri and Nagarajaram, 2007).

Because of their high instability, microsatellites are believed to play a functional role in genome evolution (Tautz et al., 1986). Microsatellites have been associated with various genetic diseases (Usdin, 2008), including Huntington’s disease and spinobulbar muscular atrophy (Li et al., 2004). Some microsatellites are related to bacterial pathogenesis and virulence, and can increase antigenic variation, allowing escape from the host immune response (Li et al., 2004, Mrazek et al., 2007).

However, despite their widespread distribution and functional significance, little is known about the distribution rules of microsatellites in viral genomes. Numerous polymorphic microsatellites have been detected in the genomes of human cytomegalovirus (HCMV), herpes simplex virus type 1 (HSV-1), and Ostreid herpesvirus 1 (OsHV-1) (Davis et al., 1999, Deback et al., 2009, Segarra et al., 2010). Microsatellites have also been observed in the genomes of epidemic human respiratory adenovirus, influenza virus, and Sin Nombre virus (Houng et al., 2009, Mudunuri et al., 2009). Our recent report comprehensively analyzed microsatellite distribution in viral pre-microRNAs and found that microsatellites are extensively present in these small non-coding RNA sequences (Chen et al., 2010). In our previous study, Human Immunodeficiency Virus Type 1 (HIV-1) was considered an excellent system for studying the evolution and roles of viral microsatellites, and the analysis indicated that microsatellites were very short and of low abundance (Chen et al., 2009). However, it remains to be confirmed whether these features of HIV-1 genomes also hold for other viruses. Moreover, little is known about mononucleotide repeats, or about the correlation between genome features and microsatellite distribution, in viral genomes. HCV has a positive-sense RNA genome composed of a single open reading frame and is classified mainly into six genotypes (genotypes 1, 2, 3, 4, 5 and 6) (Kuiken et al., 2005). The genomic diversity of HCV provides a very good opportunity to address these questions.

In the present study, we present a comprehensive analysis of the distribution of microsatellites over 6 nt in 54 complete HCV genomes belonging to six genotypes. For the first time, we analyzed the distribution of mononucleotide repeats and used linear regression analyses to explore the correlation between genome features and microsatellite distribution. We also compared our results with those from other organisms and discussed their similarities and differences.
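As an illustration of the kind of scan described above, the following Python sketch locates perfect microsatellites with motif sizes of 1 to 6 nt and computes two summary measures. It is a minimal sketch under stated assumptions, not the procedure used in the study: the function names are hypothetical, a tract is assumed to qualify when it has at least two motif copies and a total length of at least 6 nt, and relative abundance and relative density are assumed to mean the number of microsatellites per kb and the total microsatellite length (nt) per kb of sequence, respectively.

```python
from typing import List, NamedTuple


class Microsatellite(NamedTuple):
    start: int   # 0-based start index (inclusive)
    end: int     # 0-based end index (exclusive)
    motif: str   # repeat unit, e.g. "CAG"
    copies: int  # number of perfect tandem copies


def _is_reducible(motif: str) -> bool:
    """True if the motif is itself a tandem repeat of a shorter unit (e.g. "ATAT")."""
    n = len(motif)
    return any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))


def find_microsatellites(seq: str, min_len: int = 6, max_motif: int = 6) -> List[Microsatellite]:
    """Scan for perfect tandem repeats (assumed rule: >= 2 copies and >= min_len nt)."""
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        i = 0
        while i + k <= len(seq):
            motif = seq[i:i + k]
            # Skip ambiguous bases and motifs already covered by a shorter unit.
            if set(motif) - set("ACGTU") or _is_reducible(motif):
                i += 1
                continue
            j = i + k
            while seq[j:j + k] == motif:  # extend while the next copy matches
                j += k
            copies = (j - i) // k
            if copies >= 2 and j - i >= min_len:
                hits.append(Microsatellite(i, j, motif, copies))
                i = j - k + 1  # resume inside the last copy so adjacent tracts are not missed
            else:
                i += 1
    return sorted(hits)


def relative_abundance(hits: List[Microsatellite], genome_len: int) -> float:
    """Assumed definition: number of microsatellites per kb of sequence."""
    return len(hits) * 1000 / genome_len


def relative_density(hits: List[Microsatellite], genome_len: int) -> float:
    """Assumed definition: total microsatellite length (nt) per kb of sequence."""
    return sum(h.end - h.start for h in hits) * 1000 / genome_len


if __name__ == "__main__":
    toy = "AAAAAAACGTGTGTACAGCAGCAGT"  # toy sequence, not an HCV genome
    found = find_microsatellites(toy)
    for h in found:
        print(h)
    print(relative_abundance(found, len(toy)), relative_density(found, len(toy)))
```

Per-genome values computed this way could then be regressed on genome size or GC content. Tracts found with different motifs may overlap by a few bases, and a real analysis would need to decide how to treat such cases and imperfect repeats.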

HCV genome sequences

We downloaded 54 complete HCV genomes from GenBank (http://www.ncbi.nlm.nih.gov). The analyzed sequences fall into six genotypes. The number of available complete HCV genomes differs between genotypes: substantially more complete genomes are available for genotypes 1, 2, 3 and 6 than for genotypes 4 and 5. At the time of writing, only one complete genome was available for each of genotypes 4 and 5. Thus, the selected number of complete HCV genomes from various

Results and discussion

Different studies have used different parameters to search genomes for microsatellites. Power et al. showed that the number of microsatellites changed significantly when the threshold number of repeat units was increased or decreased (Power et al., 2009). It is therefore important to select an appropriate threshold repeat length. A previous study selected a threshold repeat length of 6 nt for HIV-1 genomes, whose sizes are very similar to those of HCV genomes (Chen et al., 2009). Likewise, in the

Acknowledgements

The authors thank the editors and reviewers for their very helpful comments and suggestions. The study was financially supported by the Production, Education and Research guiding project, Guangdong Province (2010B090400439), the Great program for GMO, Ministry of Agriculture of the People's Republic of China (2009ZX08015-003A), the National Natural Science Foundation of China (Nos. 50608029, 50978088, 50808073, 51039001), Hunan Provincial Innovation Foundation for Postgraduate, the National Basic Research

References (39)

  • C.L. Davis et al. Numerous length polymorphisms at short tandem repeats in human cytomegalovirus. J. Virol. (1999)
  • C. Deback et al. Utilization of microsatellite polymorphism for differentiating herpes simplex virus type 1 strains. J. Clin. Microbiol. (2009)
  • D. Dieringer et al. Two distinct modes of microsatellite mutation processes: evidence from the complete genomic sequences of nine species. Genome Res. (2003)
  • H. Ellegren. Microsatellites: simple sequences with complex evolution. Nat. Rev. Genet. (2004)
  • R. Gur-Arie et al. Simple sequence repeats in Escherichia coli: abundance, distribution, composition, and polymorphism. Genome Res. (2000)
  • J.M. Hancock. Genome size and the accumulation of simple sequence repeats: implications of new data from genome sequencing projects. Genetica (2002)
  • B. Harr et al. Long microsatellite alleles in Drosophila melanogaster have a downward mutation bias and short persistence times, which cause their genome-wide underrepresentation. Genetics (2000)
  • H.S. Houng et al. Adenovirus microsatellite reveals dynamics of transmission during a recent epidemic of human adenovirus serotype 14 infection. J. Clin. Microbiol. (2009)
  • H. Karaoglu et al. Survey of simple sequence repeats in completed fungal genomes. Mol. Biol. Evol. (2005)