Analysis of RNA-Seq Data Using TEtranscripts

Transcriptome Data Analysis

Part of the book series: Methods in Molecular Biology (MIMB, volume 1751)

Abstract

Transposable elements (TEs) are mobile genetic elements that can readily change their genomic position. When not properly silenced, TEs can contribute a substantial portion of the cell’s transcriptome, yet they are typically ignored in RNA-seq data analyses. One reason TE-derived reads are left out is the difficulty of properly aligning short sequencing reads to these highly repetitive regions. Here we describe a method for including TE-derived reads in RNA-seq differential expression analysis using an open-source software package called TEtranscripts. TEtranscripts is designed to assign both uniquely and ambiguously mapped reads to all possible gene and TE-derived transcripts in order to statistically infer the correct gene/TE abundances. We provide a detailed tutorial of TEtranscripts using a published, qPCR-validated dataset.
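
The idea of statistically distributing ambiguously mapped reads can be illustrated with a small expectation-maximization (EM) loop. The sketch below is only an illustration of that general technique under simplifying assumptions (no correction for feature length, every read weighted equally); it is not the TEtranscripts implementation, and the function name `em_assign` and the feature names `TE1` and `TE2` are hypothetical.

```python
from collections import defaultdict


def em_assign(reads, n_iter=50):
    """Estimate per-feature read counts from uniquely and ambiguously mapped reads.

    reads: list of lists; each inner list holds the distinct features
           (genes or TEs) that one read aligns to.
    Returns a dict mapping feature name -> estimated read count.
    Illustrative only: feature length and alignment scores are ignored.
    """
    if not reads:
        return {}
    features = sorted({f for candidates in reads for f in candidates})
    n_reads = len(reads)
    # Start from a uniform relative abundance for every feature.
    theta = {f: 1.0 / len(features) for f in features}
    for _ in range(n_iter):
        counts = defaultdict(float)
        # E-step: split each read among its candidate features in
        # proportion to the current abundance estimates.
        for candidates in reads:
            total = sum(theta[f] for f in candidates)
            for f in candidates:
                counts[f] += theta[f] / total
        # M-step: re-estimate relative abundances from the fractional counts.
        theta = {f: counts[f] / n_reads for f in features}
    return {f: theta[f] * n_reads for f in features}


# Hypothetical toy data: 6 reads unique to TE1, 2 unique to TE2,
# and 4 reads that align equally well to both.
toy_reads = [["TE1"]] * 6 + [["TE2"]] * 2 + [["TE1", "TE2"]] * 4
toy_counts = em_assign(toy_reads)
print({f: round(c, 3) for f, c in toy_counts.items()})
```

In this toy case the unique reads favor TE1 three to one, so the four ambiguous reads are shared in the same ratio and the estimates settle near 9 reads for TE1 and 3 for TE2, rather than the 8 and 4 that an even split would give.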

Barbara McClintock laid the foundation for TE research with her discoveries in maize of mobile genetic elements capable of inserting into novel locations in the genome and altering the expression of nearby genes [1]. Since then, our appreciation of the contribution of repetitive TE-derived sequences to eukaryotic genomes has vastly increased. With the publication of the first human genome draft by the Human Genome Project, it was determined that nearly half of the human genome is derived from TE sequences [2, 3], with varying levels of repetitive DNA present in most plant and animal species. More recent studies looking at distantly related TE-like sequences have estimated that up to two-thirds of the human genome might be repeat-derived [4], with the vast majority of these sequences attributed to retrotransposons that require transcription as part of the mobilization process, as discussed below.

References

  1. McClintock B (1956) Controlling elements and the gene. Cold Spring Harb Symp Quant Biol 21:197–216
  2. Lander ES, Linton LM, Birren B, Nusbaum C et al (2001) Initial sequencing and analysis of the human genome. Nature 409:860–921
  3. Smit AF (1999) Interspersed repeats and other mementos of transposable elements in mammalian genomes. Curr Opin Genet Dev 9:657–663
  4. de Koning AP, Gu W, Castoe TA, Batzer MA, Pollock DD (2011) Repetitive elements may comprise over two-thirds of the human genome. PLoS Genet 7(12):e1002384
  5. Garfinkel DJ, Boeke JD, Fink GR (1985) Ty element transposition: reverse transcriptase and virus-like particles. Cell 42:507–517
  6. Finnegan DJ (1989) Eukaryotic transposable elements and genome evolution. Trends Genet 5:103–107
  7. Wicker T, Sabot F, Hua-Van A, Bennetzen JL et al (2007) A unified classification system for eukaryotic transposable elements. Nat Rev Genet 8:973–982
  8. Jurka J et al (2005) Repbase update, a database of eukaryotic repetitive elements. Cytogenet Genome Res 110:462–467
  9. Smit A et al (1996–2010) RepeatMasker Open-3.0. http://www.repeatmasker.org
  10. Honma MA et al (1993) High-frequency germinal transposition of DsALS in Arabidopsis. Proc Natl Acad Sci U S A 90:6242–6246
  11. Mills RE et al (2007) Which transposable elements are active in the human genome? Trends Genet 23:183–191
  12. Bennett EA et al (2008) Active Alu retrotransposons in the human genome. Genome Res 18:1875–1883
  13. Kano H et al (2009) L1 retrotransposition occurs mainly in embryogenesis and creates somatic mosaicism. Genes Dev 23:1303–1312
  14. Beck CR et al (2010) LINE-1 retrotransposition activity in human genomes. Cell 141:1159–1170
  15. Hancks DC, Kazazian HH Jr (2012) Active human retrotransposons: variation and disease. Curr Opin Genet Dev 22:191–203
  16. Huang CR et al (2012) Active transposition in genomes. Annu Rev Genet 46:651–675
  17. Lamprecht B et al (2010) Derepression of an endogenous long terminal repeat activates the CSF1R proto-oncogene in human lymphoma. Nat Med 16:571–579
  18. Lee E et al (2012) Landscape of somatic retrotransposition in human cancers. Science 337:967–971
  19. Shukla R et al (2013) Endogenous retrotransposition activates oncogenic pathways in hepatocellular carcinoma. Cell 153:101–111
  20. Sciamanna I et al (2013) A tumor-promoting mechanism mediated by retrotransposon-encoded reverse transcriptase is active in human transformed cell lines. Oncotarget 4:2271–2287
  21. Criscione S et al (2014) Transcriptional landscape of repetitive elements in normal and cancer human cells. BMC Genomics 15:583
  22. Sciamanna I et al (2014) Regulatory roles of LINE-1-encoded reverse transcriptase in cancer onset and progression. Oncotarget 5:8039–8051
  23. Tubio JM et al (2014) Extensive transduction of nonrepetitive DNA mediated by L1 retrotransposition in cancer genomes. Science 345:1251343
  24. Li W et al (2013) Activation of transposable elements during aging and neuronal decline in Drosophila. Nat Neurosci 16:529–531
  25. Reilly MT et al (2013) The role of transposable elements in health and diseases of the central nervous system. J Neurosci 33:17577–17586
  26. Bundo M et al (2014) Increased L1 retrotransposition in the neuronal genome in schizophrenia. Neuron 81:306–313
  27. Peaston AE et al (2004) Retrotransposons regulate host genes in mouse oocytes and preimplantation embryos. Dev Cell 7:597–606
  28. Macia A et al (2011) Epigenetic control of retrotransposon expression in human embryonic stem cells. Mol Cell Biol 31:300–316
  29. Fadloun A et al (2013) Chromatin signatures and retrotransposon profiling in mouse embryos reveal regulation of LINE-1 by RNA. Nat Struct Mol Biol 20:332–338
  30. Muotri AR et al (2005) Somatic mosaicism in neuronal precursor cells mediated by L1 retrotransposition. Nature 435:903–910
  31. Coufal NG et al (2009) L1 retrotransposition in human neural progenitor cells. Nature 460:1127–1131
  32. Coufal NG et al (2011) Ataxia telangiectasia mutated (ATM) modulates long interspersed element-1 (L1) retrotransposition in human neural stem cells. Proc Natl Acad Sci U S A 108:20382–20387
  33. Faulkner GJ et al (2009) The regulated retrotransposon transcriptome of mammalian cells. Nat Genet 41:563–571
  34. Perrat PN et al (2013) Transposition-driven genomic heterogeneity in the Drosophila brain. Science 340:91–95
  35. Thomas CA et al (2012) LINE-1 retrotransposition in the nervous system. Annu Rev Cell Dev Biol 28:555–573
  36. De Cecco M et al (2013) Transposable elements become active and mobile in the genomes of aging mammalian somatic tissues. Aging 5:867–883
  37. Sedivy JM et al (2013) Death by transposition – the enemy within? Bioessays 35:1035–1043
  38. Rosenfeld JA et al (2009) Investigating repetitively matching short sequencing reads: the enigmatic nature of H3K9me3. Epigenetics 4:476–486
  39. Day DS et al (2010) Estimating enrichment of repetitive elements from high-throughput sequence data. Genome Biol 11:R69
  40. Wang J et al (2010) A Gibbs sampling strategy applied to the mapping of ambiguous short-sequence tags. Bioinformatics 26:2501–2508
  41. Chung D et al (2011) Discovering transcription factor binding sites in highly repetitive regions of genomes with multiread analysis of ChIP-Seq data. PLoS Comput Biol 7:e1002111
  42. Tucker BA et al (2011) Exome sequencing and analysis of induced pluripotent stem cells identify the cilia-related gene male germ cell-associated kinase (MAK) as a cause of retinitis pigmentosa. Proc Natl Acad Sci U S A 108:E569–E576
  43. Treangen TJ, Salzberg SL (2011) Repetitive DNA and next-generation sequencing: computational challenges and solutions. Nat Rev Genet 13:36–46
  44. Jin Y, Tam OH, Paniagua E, Hammell M (2015) TEtranscripts: a package for including transposable elements in differential expression analysis of RNA-seq datasets. Bioinformatics 31:3593–3599
  45. Li H et al (2009) The sequence alignment/map format and SAMtools. Bioinformatics 25:2078–2079
  46. Dobin A et al (2012) STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29:15–21
  47. Ohtani H et al (2013) DmGTSF1 is necessary for Piwi piRISC-mediated transcriptional transposon silencing in the Drosophila ovary. Genes Dev 27:1656–1661
  48. Karolchik D et al (2003) The UCSC genome browser database. Nucleic Acids Res 31:51–54
  49. Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11:R106
  50. Lin Y, Golovnina K, Chen ZX, Lee HN et al (2016) Comparison of normalization and differential expression analysis using RNA-seq data from 726 individual Drosophila melanogaster. BMC Genomics 17:28
  51. Polymenidou M et al (2011) Long pre-mRNA depletion and RNA missplicing contribute to neuronal vulnerability from loss of TDP-43. Nat Neurosci 14:459–468

Author information

Corresponding author

Correspondence to Molly Hammell.

Copyright information

© 2018 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Jin, Y., Hammell, M. (2018). Analysis of RNA-Seq Data Using TEtranscripts. In: Wang, Y., Sun, Ma. (eds) Transcriptome Data Analysis. Methods in Molecular Biology, vol 1751. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7710-9_11

  • DOI: https://doi.org/10.1007/978-1-4939-7710-9_11

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-7709-3

  • Online ISBN: 978-1-4939-7710-9

  • eBook Packages: Springer Protocols
