  • Protocol
Targeted sequencing for gene discovery and quantification using RNA CaptureSeq

Abstract

RNA sequencing (RNAseq) samples the majority of expressed genes infrequently, owing to the large size, complex splicing and wide dynamic range of eukaryotic transcriptomes. This results in sparse sequencing coverage that can hinder robust isoform assembly and quantification. RNA capture sequencing (CaptureSeq) addresses this challenge by using oligonucleotide probes to capture selected genes or regions of interest for targeted sequencing. Targeted RNAseq provides enhanced coverage for sensitive gene discovery, robust transcript assembly and accurate gene quantification. Here we describe a detailed protocol for all stages of RNA CaptureSeq, from initial probe design considerations and capture of targeted genes to final assembly and quantification of captured transcripts. Initial probe design and final analysis can take less than 1 d, whereas the central experimental capture stage requires 7 d.
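The enhanced coverage described above is often summarized as a fold enrichment. The sketch below is one common way to estimate it, assuming enrichment is defined as the on-target read fraction after capture divided by the on-target read fraction before capture. This definition, the function names and the read counts are illustrative assumptions and are not taken from the protocol itself.

```python
def on_target_fraction(on_target_reads: int, total_mapped_reads: int) -> float:
    """Fraction of mapped reads that overlap the targeted regions."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    if not 0 <= on_target_reads <= total_mapped_reads:
        raise ValueError("on_target_reads must lie between 0 and total_mapped_reads")
    return on_target_reads / total_mapped_reads


def fold_enrichment(pre_on_target: int, pre_total: int,
                    post_on_target: int, post_total: int) -> float:
    """Ratio of post-capture to pre-capture on-target fractions (assumed definition)."""
    pre = on_target_fraction(pre_on_target, pre_total)
    post = on_target_fraction(post_on_target, post_total)
    if pre == 0:
        raise ValueError("pre-capture on-target fraction is zero; enrichment is undefined")
    return post / pre


if __name__ == "__main__":
    # Hypothetical read counts chosen for illustration only, not measured data.
    enrichment = fold_enrichment(
        pre_on_target=20_000, pre_total=10_000_000,
        post_on_target=8_000_000, post_total=10_000_000,
    )
    print(f"Estimated fold enrichment: {enrichment:.1f}x")
```

In practice, the on-target counts would come from intersecting aligned reads with the probe coordinates in the pre- and postcapture libraries. Dividing fractions rather than raw counts keeps the estimate comparable when the two libraries are sequenced to different depths.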

Figure 1: Schematic overview of targeted RNAseq.
Figure 2: Schematic overview of targeted RNAseq in three stages: design (red), capture (purple) and analysis (blue).
Figure 3: Designing probes at splice junctions.
Figure 4: Estimated fold enrichment achieved relative to the number of genes targeted.
Figure 5: Example Agilent Bioanalyzer results for pre- and postcapture libraries.
Figure 6: Analysis of ERCC RNA spike-ins to assess the performance of CaptureSeq experiments.
Figure 7: Example of targeted RNAseq expanding the annotation of an lncRNA.

Accession codes

Gene Expression Omnibus

Acknowledgements

We thank the following funding sources: the Australian National Health and Medical Research Council (Australia Fellowship 631668; to J.S.M., T.R.M. and M.B.C.) and the Queensland State Government (National and International Research Alliance Program; to L.K.N.). We also thank the Institute for Molecular Bioscience core sequencing facility; we thank P. Danoy, J. Jeddeloh (Roche/NimbleGen) and T. Bruxner (Queensland Centre for Medical Genomics) for technical advice and assistance with capture sequencing; and we thank R. Bannen (Roche/NimbleGen) for helping with the design of capture arrays.

Author information

Contributions

T.R.M. and M.E.D. jointly conceived the CaptureSeq strategy. J.C. and M.B.C. designed, optimized and performed all stages of the protocol. T.R.M. and M.B.C. performed the analysis. M.E.B. and D.J.G. contributed to protocol development and optimization. T.R.M., J.C., M.B.C., M.E.B., L.K.N., R.J.T., M.E.D. and J.S.M. prepared the manuscript. L.K.N., R.J.T. and J.S.M. provided funding support.

Corresponding authors

Correspondence to Marcel E Dinger or John S Mattick.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Data

RNA CaptureSeq oligonucleotide sequences. (PDF 351 kb)

About this article

Cite this article

Mercer, T., Clark, M., Crawford, J. et al. Targeted sequencing for gene discovery and quantification using RNA CaptureSeq. Nat Protoc 9, 989–1009 (2014). https://doi.org/10.1038/nprot.2014.058

  • DOI: https://doi.org/10.1038/nprot.2014.058
