Nanopore native RNA sequencing of a human poly(A) transcriptome

Workman, Rachael E.; Tang, Alison D.; Tang, Paul S.; Jain, Miten; Tyson, John R.; Razaghi, Roham; Zuzarte, Philip C.; Gilpatrick, Timothy; Payne, Alexander; Quick, Joshua; Sadowski, Norah; Holmes, Nadine; de Jesus, Jaqueline Goes; Jones, Karen L.; Soulette, Cameron M.; Snutch, Terrance P.; Loman, Nicholas; Paten, Benedict; Loose, Matthew; Simpson, Jared T.; Olsen, Hugh E.; Brooks, Angela N.; Akeson, Mark; Timp, Winston

doi:10.1038/s41592-019-0617-2

Article
Published: 18 November 2019

Nature Methods volume 16, pages 1297–1305 (2019)

An Author Correction to this article was published on 09 December 2019

This article has been updated

Abstract

High-throughput complementary DNA sequencing technologies have advanced our understanding of transcriptome complexity and regulation. However, these methods lose information contained in biological RNA because the copied reads are often short and modifications are not retained. We address these limitations using a native poly(A) RNA sequencing strategy developed by Oxford Nanopore Technologies. Our study generated 9.9 million aligned sequence reads for the human cell line GM12878, using thirty MinION flow cells at six institutions. These native RNA reads had a median length of 771 bases, and a maximum aligned length of over 21,000 bases. Mitochondrial poly(A) reads provided an internal measure of read-length quality. We combined these long nanopore reads with higher-accuracy short reads and annotated GM12878 promoter regions to identify 33,984 plausible RNA isoforms. We describe strategies for assessing 3′ poly(A) tail length, base modifications and transcript haplotypes.

**Fig. 1: Nanopore native poly(A) RNA sequencing pipeline.**

**Fig. 2: Performance statistics for nanopore native RNA sequencing.**

**Fig. 3: Mitochondrially encoded poly(A) RNA transcripts.**

**Fig. 4: Isoform-level analysis of GM12878 native poly(A) RNA sequence reads.**

**Fig. 5: Testing and implementation of the poly(A) tail length estimator nanopolish-polya.**

**Fig. 6: Nanopore detection of m6A and inosine base modifications.**

Data availability

Sequence data, including raw signal files (FAST5), event-level data (FAST5), base-calls (FASTQ) and alignments (BAM), are available as an Amazon Web Services Open Data set, which can be downloaded via https://github.com/nanopore-wgs-consortium/NA12878. The scripts used for various analyses are also available from the same GitHub repository under nanopore-human-transcriptome/scripts.

Code availability

General scripts are available at https://github.com/nanopore-wgs-consortium/NA12878/tree/master/nanopore-human-transcriptome/scripts. The poly(A) caller (‘nanopolish-polya’) is available at https://github.com/jts/nanopolish, and the isoform analysis code for FLAIR is available at https://github.com/BrooksLabUCSC/flair.

Change history

09 December 2019
An amendment to this paper has been published and can be accessed via a link at the top of the paper.

Acknowledgements

The authors are grateful for support from the following individuals. L. Snell, B. Sipos and D. Turner (ONT) provided materials and advice relevant to the 3′ poly(A) standards used to test nanopolish-polya. D. Garalde (ONT) provided early advice on use of the MinION for RNA sequencing. N. Conrad gave insight into the correlation of intron retention and poly(A) tail length. M. Diekhans reviewed the isoform analysis. Z. M. Chrzanowska-Lightowlers, T. Suzuki and S. Okada commented on early drafts of the manuscript. A. Beggs, L. Tee and T. Nieto (University of Birmingham, UK) provided cell cultures used in the Birmingham sequencing runs. The project was supported by the following grants: NIH HG010053 (A.N.B., B.P. and M.A.), NIH 5T32HG008345 (A.D.T.), NIH HG010538 (W.T.), NIH U54HG007990 (B.P.), U01 HL137183-02 (B.P.), Oxford Nanopore Research Grant SC20130149 (M.A.), National Institutes of Health Research Surgical Reconstruction and Microbiology Research Centre (J.Q.), Medical Research Council CLIMB Fellowship (N.L.), Wellcome Trust 204843/Z/16/Z (M.L.), BBSRC BB/N017099/1 and BB/M020061/1 (M.L.), the Canada Research Chair in Biotechnology and Genomics-Neurobiology (T.P.S.), the Canadian Institutes of Health Research (no. 10677; T.P.S.), the Canadian Epigenetics, Environment and Health Research Consortium (T.P.S.), the Koerner Foundation (T.P.S.), Genome Canada (OGI-136, J.T.S.), the Ontario Institute for Cancer Research through funds provided by the Government of Ontario (J.T.S.) and Pew Charitable Trust (A.N.B.).

Author information

These authors contributed equally: R. E. Workman, A. D. Tang, P. S. Tang, M. Jain, J. R. Tyson, R. Razaghi.
These authors jointly supervised this work: H. E. Olsen, A. N. Brooks, M. Akeson, W. Timp.

Authors and Affiliations

Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
Rachael E. Workman, Roham Razaghi, Timothy Gilpatrick, Norah Sadowski & Winston Timp
Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, USA
Alison D. Tang, Miten Jain, Cameron M. Soulette, Benedict Paten, Hugh E. Olsen, Angela N. Brooks & Mark Akeson
UCSC Genomics Institute, University of California, Santa Cruz, USA
Alison D. Tang, Miten Jain, Cameron M. Soulette, Benedict Paten, Hugh E. Olsen, Angela N. Brooks & Mark Akeson
Ontario Institute for Cancer Research, Toronto, Ontario, Canada
Paul S. Tang, Philip C. Zuzarte & Jared T. Simpson
Michael Smith Laboratories and Djavad Mowafaghian Centre for Brain Health, University of British Columbia, Vancouver, British Columbia, Canada
John R. Tyson, Karen L. Jones & Terrance P. Snutch
DeepSeq, School of Life Sciences, University of Nottingham, Nottingham, UK
Alexander Payne, Nadine Holmes & Matthew Loose
University of Birmingham, Birmingham, UK
Joshua Quick, Jaqueline Goes de Jesus & Nicholas Loman
Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
Jared T. Simpson

Contributions

M.A., W.T., H.E.O., M.J. and J.R.T. conceived the study. M.A., A.N.B. and W.T. coordinated the collaboration. R.E.W., N.S., N.H., J.Q., P.C.Z., H.E.O., M.J., J.R.T. and T.G. acquired data. R.E.W., A.D.T., N.S., T.G., M.L., A.P., N.L., R.R., A.N.B., P.S.T., J.T.S., B.P., H.E.O., J.R.T., W.T., M.A. and M.J. analyzed and interpreted data. Specifically, R.E.W. performed a first pass analysis and data indexing; T.G. and R.R. performed the allele-specific analysis; R.E.W. and R.R. performed the m6A modification analysis; P.S.T. and J.T.S. designed and implemented the poly(A) tail length estimation software; A.D.T. and A.N.B. performed transcript isoform analysis; P.S.T., W.T., R.R. and N.S. performed the poly(A) tail analysis; M.J. and H.E.O. performed the A-to-I base modification analysis; J.T., M.J., N.L. and H.E.O. performed sequencer performance analysis; and M.A., M.J., H.E.O., M.L. and A.P. performed mitochondrial gene expression analysis. The following were principally responsible for text and figures by topic: RNA preparation, nanopore sequencing, and computational pipeline (M.J., H.E.O., J.R.T., M.A.); native poly(A) RNA sequencing statistics (M.J., H.E.O., J.R.T., M.A.); FLAIR-based isoform detection and analysis (A.D.T., C.M.S., A.N.B.); assignment of transcripts to parental alleles using nanopore reads (T.G., R.R., W.T.); mitochondrially-encoded transcripts (M.A., H.E.O., M.J., M.L., A.P.); k-mer coverage (H.E.O., M.J.); 3′ poly(A) analysis (P.S.T., J.T.S., W.T., R.R., T.G.); m6A analysis (R.E.W., W.T., R.R., N.S.); A-to-I conversion (M.J., H.E.O.). Manuscript revisions and edits (R.E.W., A.D.T., P.S.T., M.J., J.R.T., P.C.Z., T.G., R.R., N.S., T.P.S., N.L., B.P., M.L., J.T.P., H.E.O., A.N.B., M.A., W.T.). K.L.J. and J.G.d.J. replicated and distributed GM12878 cells.

Corresponding authors

Correspondence to Mark Akeson or Winston Timp.

Ethics declarations

Competing interests

M.A. holds options in Oxford Nanopore Technologies (ONT). M.A. is a paid consultant to ONT. R.E.W., W.T., T.G., J.R.T., J.Q., N.J.L., J.T.S., N.S., A.N.B., M.A., H.E.O., M.J. and M.L. received reimbursement for travel, accommodation and conference fees to speak at events organised by ONT. N.L. has received an honorarium to speak at an ONT company meeting. W.T. has two patents (8,748,091 and 8,394,584) licensed to ONT. M.A. is an inventor on 11 UC patents licensed to ONT (6,267,872, 6,465,193, 6,746,594, 6,936,433, 7,060,50, 8,500,982, 8,679,747, 9,481,908, 9,797,013, 10,059,988, and 10,081,835). J.T.S., M.L. and M.A. received research funding from ONT.

Additional information

Peer review information Nicole Rusk was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Integrated supplementary information

Supplementary Figure 1 k-mer analysis for nanopore native RNA and cDNA sequencing.

a, Observed versus expected k-mers (k = 5) for ~2.9 million native RNA reads. b, Observed versus expected k-mers (k = 5) for ~3.9 million cDNA reads.

Supplementary Figure 2 Nanopore RNA reads recapitulate known features of the human MT-transcriptome.

a, Nanopore poly(A) RNA read coverage is consistent with the tRNA punctuation model of mitochondrial-RNA processing. Dark gray pattern represents read coverage along the heavy strand of the human mitochondrial genome. Labelled colored bars represent protein-coding genes, including known UTRs, or ribosomal RNAs. Regular font text below the colored bars identifies each gene. Yellow bars and red arrows represent the position of tRNA genes along the MT H strand. These tRNA genes are denoted by italicized text. b–d, Evidence for 3′ UTRs in MT-CO1 (b), MT-CO2 (c) and MT-ND5 (d) based on nanopore native RNA sequencing coverage. Dark gray pattern represents base coverage along specific mitochondrial transcripts. In b–d, numbering is base position relative to the human mitochondrial chromosome (chrM) for the hg38 reference. Colored horizontal lines represent protein coding sequences as in Fig. 3a of the main text. Red lines represent known 3′ UTR¹. For MT-ND5, the 3′ UTR indicated by nanopore read coverage is 26 nt longer than documented¹. This extension is denoted by a dashed red line in d.

Supplementary Figure 3 Commonly observed bicistronic human MT-RNA transcripts documented by nanopore sequencing.

a, Nanopore read-coverage plot of the mitochondrial heavy strand (gray pattern). Dotted red lines mark the predicted limits of bicistronic transcripts that encode mitochondrial ATP synthase protein 8 plus ATP synthase protein 6 (MT-ATP8/MT-ATP6), and mitochondrial NADH-ubiquinone oxidoreductase chain 4L plus NADH-ubiquinone oxidoreductase chain 4 (MT-ND4L/MT-ND4). MT-ATP8/MT-ATP6 and MT-ND4L/MT-ND4 are 841 nt and 1,667 nt long, respectively. b, Detailed view of MT-ATP8/MT-ATP6 coverage. Blue lines represent nominal length of individual gene products MT-ATP8 and MT-ATP6 within contiguous transcripts. Neighboring genes for MT-tRNA lysine (MT-TK) and Cytochrome C Oxidase Subunit 3 (MT-CO3) are marked by yellow and beige lines, respectively. c, Detailed view of MT-ND4L/MT-ND4 coverage. Blue lines represent nominal length of individual gene products MT-ND4L and MT-ND4 within contiguous transcripts. Yellow lines are neighboring genes for MT-tRNA arginine (MT-TR) and MT-tRNA histidine (MT-TH).

Supplementary Figure 4 Nanopore read coverage proximal to polycistronic RNA19 (MT-RNR2+MT-TL1+MT-ND1).

a, Nanopore poly(A) read coverage of the human MT-RNA genome H strand. Expanded section includes reads that align only to MT-RNR2 or MT-ND1, as well as a smaller number of reads that align to MT-RNR2+MT-TL1+MT-ND1. b, Twenty-five examples of near full length (>2,600 nt) RNA19 transcript reads (total in study = 508 reads). c, Examples of nanopore poly(A) reads that cover full length RNA19 and attached unprocessed MT tRNA at the 5′ end (MT-TV) and at the 3′ end (MT-TI) of RNA19. There were a total of ten reads of this sort in the complete poly(A) dataset. Blue represents base matches to reference; red represents base mismatches to reference; orange represents insertions relative to reference.

Supplementary Figure 5 Example polycistronic transcripts that are observed by nanopore sequencing, that are otherwise difficult to detect.

a, MT-CO1 transcripts bearing OriL nucleotides at their 5′ ends. Gray bars in the panel at top represent base coverage. Most coverage is for either MT-CO1 or for strands corresponding to OriL which is approximately equivalent to the reverse complement of MT-TA, MT-TN, MT-TC, MT-TY which are encoded on the mitochondrial L strand (yellow bars). We could not find documentation for the exact nt limits of OriL. A limited number of transcript reads bridged the gap between OriL and MT-CO1 (red arrow). The multi-colored lines correspond to individual nanopore reads that aligned to the entire length of OriL+MT-CO1. b, Nanopore read of an 8.8 kb polycistronic mitochondrial L strand transcript. The read extends from the 3′ end of MT-TC (position 5,891, MT-genome) to the boundary between MT-ND6 and MT-TE (position 14,673, MT-genome). An unprocessed tRNA (MT-TS1) is internal to the strand.

Supplementary Figure 6 Recovery of truncated ionic current signal from continuous fast5 files yields more alignable sequence for some RNA strands.

a, Ionic current signal for translocation of a MT-CO1 transcript. It is representative of traces where the read was artificially truncated by a signal anomaly. The red line represents the MinKNOW segmented read (positions 474–1,532 of the MT-CO1 gene), and the magenta line represents the manually segmented and rescued read (positions 27–1,532 of the MT-CO1 gene). The signal in blue was present in the MinKNOW output read fast5 file. Signal in gray was not present in the MinKNOW output read fast5 file, but could be extracted from the continuous fast5 file using BulkVis. The time bar is two seconds. b, Recovery of data at the 3′ end of a read (shaded) using BulkVis. c, Recovery of data at the 5′ end of a read (shaded) using BulkVis. d,e, Effect of additional ionic current data on the mapping coordinates (start and end positions for an alignment) relative to the reference transcript for all MT-CO1 reads in bulk files from Lab 1. A detailed summary of data shown in d and e can be found in Supplementary Table 7. The analysis was performed using 5 experiments that delivered 5 bulk files representing approximately 2 h of continuous data each.

Supplementary Figure 7 MT-CO1 poly(A) transcript read length versus MinION run time.

The panel at left is from Lab 1 (12,565 reads) and is representative of results for Labs 2–5. The panel at right is from Lab 6 (17,859 reads). The intensity of blue shading represents the density of the data distribution. The discrete dots at the edge of the distributions represent regions where there is one or only a few data points.

Supplementary Figure 8 Correcting minimap2 genomic read alignments improves splice site accuracy.

Using FLAIR-correct, misaligned splice sites were corrected to splice sites supported by short-read sequencing. The x axis is the distance from the aligned splice site to the closest annotated splice site in GENCODE v27. The y axis is the number of aligned sites (log-scale) with raw alignment distance counts in blue and corrected counts in yellow.
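The x-axis quantity in this figure, the distance from an aligned splice site to the closest annotated one, can be illustrated with a minimal sketch. This is not FLAIR code: the function names and the toy coordinates are hypothetical, and the sketch assumes splice sites on one chromosome and strand are given as plain integer positions with a non-empty annotation.

```python
from bisect import bisect_left
from collections import Counter


def distance_to_closest(site, annotated_sorted):
    """Signed distance from an aligned splice site to the nearest annotated site.

    `annotated_sorted` must be a non-empty, ascending list of positions.
    """
    i = bisect_left(annotated_sorted, site)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = annotated_sorted[max(i - 1, 0):i + 1]
    nearest = min(candidates, key=lambda a: abs(a - site))
    return site - nearest


def distance_histogram(aligned_sites, annotated_sites):
    """Count aligned sites at each distance from the closest annotated site."""
    annotated_sorted = sorted(set(annotated_sites))
    return Counter(distance_to_closest(s, annotated_sorted) for s in aligned_sites)


# Toy coordinates for illustration only (not taken from the study).
annotated = [100, 250, 400]
aligned = [100, 102, 249, 250, 395, 400]
hist = distance_histogram(aligned, annotated)
```

Computing such a histogram before and after correction would give the two count series that the figure contrasts.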

Supplementary Figure 9 Criteria for the FLAIR-sensitive and FLAIR-stringent isoform sets.

a, Two candidate isoforms assembled using FLAIR. Each block represents either a complete or a partial exon (numbers 1–4). b, Reads that align to a candidate isoform. Light gray bars represent 25 nt coverage into first and last exons. c, FLAIR-sensitive isoform set that passed criteria shown at arrow. d, FLAIR-stringent isoform set that passed criteria shown at arrow. Isoform 1 failed FLAIR-stringent isoform test (X); isoform 2 passed FLAIR-stringent isoform test.

Supplementary Figure 10 Saturation curves for each of the defined FLAIR and GENCODE isoform sets.

The y axis is the number of isoforms detected in the FLAIR and GENCODE isoform sets; the x axis is the number of reads subsampled in 10% increments from a total of 8.17 million native RNA reads.
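A saturation curve of this kind can be sketched as follows. The sketch assumes each read has already been assigned to one isoform identifier and that the subsamples are nested prefixes of a single random shuffle; the caption does not state how the subsampling was drawn, so both the function name and that choice are assumptions.

```python
import random


def saturation_curve(read_isoforms, steps=10, seed=0):
    """Return (n_reads, n_isoforms) pairs for 10%, 20%, ... 100% subsamples.

    `read_isoforms` holds one isoform identifier per read.
    """
    rng = random.Random(seed)
    shuffled = list(read_isoforms)
    rng.shuffle(shuffled)
    curve = []
    for k in range(1, steps + 1):
        n = len(shuffled) * k // steps
        # Distinct isoforms seen among the first n shuffled reads.
        curve.append((n, len(set(shuffled[:n]))))
    return curve


# Toy assignments for illustration only (not taken from the study).
reads = ["iso_a"] * 50 + ["iso_b"] * 30 + ["iso_c"] * 15 + ["iso_d"] * 5
curve = saturation_curve(reads)
```

A curve that flattens toward the right suggests that additional reads would add few new isoforms at the chosen detection criteria.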

Supplementary Figure 11 Allele-specific expression of XIST.

IGV view of reads assigned to the paternal allele (top) or the maternal allele (middle). Colored bars in the coverage plots indicate SNPs present in the paternal (top) or maternal (middle) allele relative to the GRCh38 XIST reference. Purple boxes (insets) highlight two of the numerous SNPs used to assign allele specificity. Lower panel shows a gene model for XIST.

Supplementary Figure 12 IGV view of transcripts for allele-specific isoforms across the complete IFIH1 gene.

This is the same dataset as in Fig. 4d; however, here the 5′ end is shown. See Fig. 4d for details.

Supplementary Figure 13 Kruskal–Wallis test for poly(A) tail length variance between isoforms of the same gene.

Only genes with at least 500 reads, and isoforms with at least 25 reads were considered. The 50 lowest statistically significant P values out of 215 total are shown. The P values were between 4.41 × 10^–106 and 4.17 × 10^–25.

Supplementary Figure 14

Gene models for SNHG8 isoforms ENST00000602819 and ENST00000602520 identified in this study.

Supplementary Figure 15 Nanopore detection of inosine base modifications.

a, Genome browser view of a segment of the AHR gene in the GRCh38 reference. The top row shows nucleotide position and base sequence. Magenta squares (below the top row) represent putative inosine positions as characterized by RADAR¹. Blue lines denote read alignments for nanopore native RNA, and brown lines denote read alignments for nanopore cDNA. White letters are mismatches relative to the GRCh38 AHR reference. White spaces with connecting black lines represent deletions in the alignment. Base miscalls occur in native RNA data at or near putative A-to-I editing sites. G base variants occur at corresponding positions in cDNA data. b, Summary of alignment information using WebLogo² for native RNA and cDNA data. Top row is the same ten base motif of the AHR gene as in a, with asterisks denoting putative inosines. Letter size in the logo depicts relative frequency of occurrence.

Supplementary information

Supplementary Information

Supplementary Figures 1–15, Supplementary Note, Supplementary Tables 1, 4, 8–10 and 13.

Reporting Summary

Supplementary Table 2

Native RNA reads by gene. 9.7 million individual pass native RNA reads were aligned to genes in GENCODE v27 using minimap2 (splice aware setting). 20,289 separate genes were identified in these alignments.

Supplementary Table 3

Native RNA reads by isoform assignment. 9.7 million individual pass native RNA reads were aligned to isoforms in GENCODE v27 using minimap2 (splice aware setting). 64,241 separate isoforms were identified in these alignments.

Supplementary Table 5

k-mer coverage for nanopore native RNA reads aligned to GENCODE isoforms. The read sequences were filtered by length, and only reads that covered 90% or more of the respective reference sequence were chosen. Expected k-mer counts were calculated from the set of reference sequences, and observed k-mer counts were calculated from the set of read sequences.

Supplementary Table 6

k-mer coverage for nanopore cDNA reads aligned to GENCODE isoforms. The read sequences were filtered by length and only reads that covered 90% or more of the respective reference sequence were chosen. Expected k-mer counts were calculated from the set of reference sequences and observed k-mer counts were calculated from the set of read sequences.
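The observed and expected k-mer counts described for Supplementary Tables 5 and 6 can be sketched as below. The function names are hypothetical, and the sketch assumes that the caller supplies one reference sequence per selected read and that read coverage is measured as aligned length divided by reference length; the table descriptions do not spell out either detail.

```python
from collections import Counter


def covers_reference(aligned_length, reference_length, min_fraction=0.9):
    """True if a read covers at least `min_fraction` of its reference sequence."""
    return aligned_length >= min_fraction * reference_length


def kmer_counts(sequences, k=5):
    """Count every overlapping k-mer across a collection of sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def observed_vs_expected(read_seqs, reference_seqs, k=5):
    """Map each reference k-mer to an (observed, expected) count pair."""
    observed = kmer_counts(read_seqs, k)
    expected = kmer_counts(reference_seqs, k)
    return {kmer: (observed.get(kmer, 0), n) for kmer, n in expected.items()}


# Toy sequences for illustration only (not taken from the study).
reference = ["ACGTACGTAC", "ACGTACGTAC"]  # one reference copy per selected read
reads = ["ACGTACGTAC", "ACGTACCTAC"]      # second read carries one miscall
table = observed_vs_expected(reads, reference)
```

k-mers whose observed count falls well below the expected count point to sequence contexts that are under-represented in the base-called reads.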

About this article

Cite this article

Workman, R.E., Tang, A.D., Tang, P.S. et al. Nanopore native RNA sequencing of a human poly(A) transcriptome. Nat Methods 16, 1297–1305 (2019). https://doi.org/10.1038/s41592-019-0617-2

Received: 28 December 2018
Accepted: 19 September 2019
Published: 18 November 2019
Issue Date: December 2019
DOI: https://doi.org/10.1038/s41592-019-0617-2
