Abstract
Paired DNA and RNA profiling is increasingly employed in genomics research to uncover molecular mechanisms of disease and to explore personal genotype and phenotype correlations. Here, we introduce Simul-seq, a technique for the production of high-quality whole-genome and transcriptome sequencing libraries from small quantities of cells or tissues. We apply the method to laser-capture-microdissected esophageal adenocarcinoma tissue, revealing a highly aneuploid tumor genome with extensive blocks of increased homozygosity and corresponding increases in allele-specific expression. Among this widespread allele-specific expression, we identify germline polymorphisms that are associated with response to cancer therapies. We further leverage this integrative data to uncover expressed mutations in several known cancer genes as well as a recurrent mutation in the motor domain of KIF3B that significantly affects kinesin–microtubule interactions. Simul-seq provides a new streamlined approach for generating comprehensive genome and transcriptome profiles from limited quantities of clinically relevant samples.
References
Shah, S.P. et al. The clonal and mutational evolution spectrum of primary triple-negative breast cancers. Nature 486, 395–399 (2012).
Grubert, F. et al. Genetic control of chromatin states in humans involves local and distal chromosomal interactions. Cell 162, 1051–1065 (2015).
Stranger, B.E. et al. Population genomics of human gene expression. Nat. Genet. 39, 1217–1224 (2007).
Ongen, H. et al. Putative cis-regulatory drivers in colorectal cancer. Nature 512, 87–90 (2014).
Li, J.B. et al. Genome-wide identification of human RNA editing sites by parallel DNA capturing and sequencing. Science 324, 1210–1213 (2009).
Tuch, B.B. et al. Tumor transcriptome sequencing reveals allelic expression imbalances associated with copy number alterations. PLoS One 5, e9317 (2010).
Su, A.I. et al. A gene atlas of the mouse and human protein-encoding transcriptomes. Proc. Natl. Acad. Sci. USA 101, 6062–6067 (2004).
Lappalainen, T. et al. Transcriptome and genome sequencing uncovers functional variation in humans. Nature 501, 506–511 (2013).
Macaulay, I.C. et al. G&T-seq: parallel sequencing of single-cell genomes and transcriptomes. Nat. Methods 12, 519–522 (2015).
Dey, S.S., Kester, L., Spanjaard, B., Bienko, M. & van Oudenaarden, A. Integrated genome and transcriptome sequencing of the same cell. Nat. Biotechnol. 33, 285–289 (2015).
Lam, H.Y.K. et al. Performance comparison of whole-genome sequencing platforms. Nat. Biotechnol. 30, 78–82 (2011).
Adey, A. et al. Rapid, low-input, low-bias construction of shotgun fragment libraries by high-density in vitro transposition. Genome Biol. 11, R119 (2010).
Baker, S.C. et al. The External RNA Controls Consortium: a progress report. Nat. Methods 2, 731–734 (2005).
Lawrence, M.S. et al. Discovery and saturation analysis of cancer genes across 21 tumour types. Nature 505, 495–501 (2014).
Weinstein, J.N. et al. Cancer Genome Atlas Research Network. Comprehensive molecular characterization of urothelial bladder carcinoma. Nature 507, 315–322 (2014).
Zhao, M., Kim, P., Mitra, R., Zhao, J. & Zhao, Z. TSGene 2.0: an updated literature-based knowledgebase for tumor suppressor genes. Nucleic Acids Res. 44, D1023–D1031 (2016).
Adzhubei, I.A. et al. A method and server for predicting damaging missense mutations. Nat. Methods 7, 248–249 (2010).
Kumar, P., Henikoff, S. & Ng, P.C. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat. Protoc. 4, 1073–1081 (2009).
Yadav, M. et al. Predicting immunogenic tumour mutations by combining mass spectrometry and exome sequencing. Nature 515, 572–576 (2014).
Robbins, P.F. et al. Mining exomic sequencing data to identify mutated antigens recognized by adoptively transferred tumor-reactive T cells. Nat. Med. 19, 747–752 (2013).
Schumacher, T.N. & Schreiber, R.D. Neoantigens in cancer immunotherapy. Science 348, 69–74 (2015).
Futreal, P.A. et al. A census of human cancer genes. Nat. Rev. Cancer 4, 177–183 (2004).
Joerger, A.C., Ang, H.C. & Fersht, A.R. Structural basis for understanding oncogenic p53 mutations and designing rescue drugs. Proc. Natl. Acad. Sci. USA 103, 15056–15061 (2006).
Bullock, A.N., Henckel, J. & Fersht, A.R. Quantitative analysis of residual folding and DNA binding in mutant p53 core domain: definition of mutant states for rescue in cancer therapy. Oncogene 19, 1245–1256 (2000).
Gautschi, O. et al. Cyclin D1 (CCND1) A870G gene polymorphism modulates smoking-induced lung cancer risk and response to platinum-based chemotherapy in non-small cell lung cancer (NSCLC) patients. Lung Cancer 51, 303–311 (2006).
Absenger, G. et al. The cyclin D1 (CCND1) rs9344 G>A polymorphism predicts clinical outcome in colon cancer patients treated with adjuvant 5-FU-based chemotherapy. Pharmacogenomics J. 14, 130–134 (2014).
Gonçalves, A. et al. A polymorphism of EGFR extracellular domain is associated with progression free-survival in metastatic colorectal cancer patients receiving cetuximab-based treatment. BMC Cancer 8, 169 (2008).
Hsieh, Y.Y., Tzeng, C.H., Chen, M.H., Chen, P.M. & Wang, W.S. Epidermal growth factor receptor R521K polymorphism shows favorable outcomes in KRAS wild-type colorectal cancer patients treated with cetuximab-based chemotherapy. Cancer Sci. 103, 791–796 (2012).
Yu, Y. & Feng, Y.-M. The role of kinesin family proteins in tumorigenesis and progression: potential biomarkers and molecular targets for cancer therapy. Cancer 116, 5150–5160 (2010).
Jimbo, T. et al. Identification of a link between the tumour suppressor APC and the kinesin superfamily. Nat. Cell Biol. 4, 323–327 (2002).
Woehlke, G. et al. Microtubule interaction site of the kinesin motor. Cell 90, 207–216 (1997).
Dey, S.S., Kester, L., Spanjaard, B., Bienko, M. & van Oudenaarden, A. Integrated genome and transcriptome sequencing of the same cell. Nat. Biotechnol. 33, 285–289 (2015).
Macaulay, I.C. et al. G&T-seq: parallel sequencing of single-cell genomes and transcriptomes. Nat. Methods 12, 519–522 (2015).
Adiconis, X. et al. Comparative analysis of RNA sequencing methods for degraded or low-input samples. Nat. Methods 10, 623–629 (2013).
Zhao, W. et al. Comparison of RNA-Seq by poly (A) capture, ribosomal RNA depletion, and DNA microarray for expression profiling. BMC Genomics 15, 419 (2014).
Nones, K. et al. Genomic catastrophes frequently arise in esophageal adenocarcinoma and drive tumorigenesis. Nat. Commun. 5, 5224 (2014).
Agrawal, N. et al. Comparative genomic analysis of esophageal adenocarcinoma and squamous cell carcinoma. Cancer Discov. 2, 899–905 (2012).
Dulak, A.M. et al. Exome and whole-genome sequencing of esophageal adenocarcinoma identifies recurrent driver events and mutational complexity. Nat. Genet. 45, 478–486 (2013).
Haraguchi, K., Hayashi, T., Jimbo, T., Yamamoto, T. & Akiyama, T. Role of the kinesin-2 family protein, KIF3, during mitosis. J. Biol. Chem. 281, 4094–4099 (2006).
Liu, X. et al. Small molecule induced reactivation of mutant p53 in cancer cells. Nucleic Acids Res. 41, 6034–6044 (2013).
Stachler, M.D. et al. Paired exome analysis of Barrett's esophagus and adenocarcinoma. Nat. Genet. 47, 1047–1055 (2015).
Moriai, T., Kobrin, M.S., Hope, C., Speck, L. & Korc, M. A variant epidermal growth factor receptor exhibits altered type alpha transforming growth factor binding and transmembrane signaling. Proc. Natl. Acad. Sci. USA 91, 10217–10221 (1994).
Zhang, W. et al. Cyclin D1 and epidermal growth factor polymorphisms associated with survival in patients with advanced colorectal cancer treated with Cetuximab. Pharmacogenet. Genomics 16, 475–483 (2006).
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17, 10–12 (2011).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
DePristo, M.A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).
Langmead, B. & Salzberg, S.L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Trapnell, C. et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat. Protoc. 7, 562–578 (2012).
Trapnell, C. et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 28, 511–515 (2010).
Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).
Wang, L., Wang, S. & Li, W. RSeQC: quality control of RNA-seq experiments. Bioinformatics 28, 2184–2185 (2012).
Quinlan, A.R. & Hall, I.M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Liao, Y., Smyth, G.K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).
Roth, A. et al. JointSNVMix: a probabilistic model for accurate detection of somatic mutations in normal/tumour paired next-generation sequencing data. Bioinformatics 28, 907–913 (2012).
Cibulskis, K. et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat. Biotechnol. 31, 213–219 (2013).
Larson, D.E. et al. SomaticSniper: identification of somatic point mutations in whole genome sequencing data. Bioinformatics 28, 311–317 (2012).
Koboldt, D.C. et al. VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res. 22, 568–576 (2012).
Wang, J. et al. CREST maps somatic structural variation in cancer genomes with base-pair resolution. Nat. Methods 8, 652–654 (2011).
Zhang, J. et al. INTEGRATE: gene fusion discovery using whole genome and transcriptome data. Genome Res. 26, 108–118 (2016).
Krzywinski, M. et al. Circos: an information aesthetic for comparative genomics. Genome Res. 19, 1639–1645 (2009).
Cingolani, P. et al. Using Drosophila melanogaster as a model for genotoxic chemical mutational studies with a new program, SnpSift. Front. Genet. 3, 35 (2012).
Romanel, A., Lago, S., Prandi, D., Sboner, A. & Demichelis, F. ASEQ: fast allele-specific studies from next-generation sequencing data. BMC Med. Genomics 8, 9 (2015).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Stock, M.F. & Hackney, D.D. Expression of kinesin in Escherichia coli. Methods Mol. Biol. 164, 43–48 (2001).
Acknowledgements
We thank C. Araya, C. Cenik, P. Dumesic, D. Phanstiel and D. Webster for many helpful discussions and input regarding the manuscript and analyses. We acknowledge J. Churko from the laboratory of J. Wu at Stanford University for providing the fibroblasts as well as the work of both the sequencing core at the Stanford Center for Genomics and Personalized Medicine and the Genetics Bioinformatics Service Center, with special thanks to G. Euskirchen, L. Ramirez, C. Eastman, N. Watson and N. Hammond. Finally, we would like to thank H. Chen from Bina Technologies.
Author information
Contributions
J.A.R., D.V.S. and M.P.S. conceived the project, designed experiments and wrote the manuscript. J.A.R. and D.V.S. performed analyses and experiments. R.K.P. provided pathology expertise and formalin-fixed paraffin-embedded esophageal adenocarcinoma specimens. Work in the Snyder lab is supported by NIH grants to M.P.S. (1P50HG00773501 and 8U54DK10255602). J.A.R. was supported by the Damon Runyon Cancer Research Foundation, and D.V.S. was supported by an NIH T32 fellowship (HG000044) and a Genentech Graduate Fellowship.
Ethics declarations
Competing interests
M.P.S. is a cofounder of Personalis and a member of the scientific advisory boards of Personalis and Genapsys.
Integrated supplementary information
Supplementary Figure 1 Simul-seq library preparation and quality control.
(a) Histogram of incubation times for parallel Tru-seq DNA and RNA library preparation as well as Simul-seq. (b) High-sensitivity DNA bioanalyzer trace for a yeast/human mixed Simul-seq library. Note, this trace is representative of an average Simul-seq library. (c-d) Representative droplet digital PCR (ddPCR) raw fluorescence amplitude data (left) and assay design (right) for quantification of DNA (c) and RNA (d) constituents of Simul-seq libraries.
Supplementary Figure 2 Comparison of variant calls between Simul-seq genome and DNA-seq replicates.
(a-b) Venn diagrams comparing SNV (a) and indel (b) calls between the Simul-seq genome and two DNA-seq control genomes derived from different tissues of the same individual (ref. 13).
Supplementary Figure 3 Distribution of coding and noncoding genes in Simul-seq transcriptome.
Bar graph of all Ensembl biotype annotations for genes with FPKM values greater than or equal to 5.
Supplementary Figure 4 Simul-seq and RNA-seq replicates are well correlated.
Scatter plots of Log10(FPKM+1) gene measurements for Simul-seq and RNA-seq replicates. Spearman’s ρ correlation values for each comparison are shown.
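The replicate comparison described in this legend can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names and the FPKM values below are hypothetical, and only the Python standard library is assumed. The sketch computes Spearman's ρ as the Pearson correlation of average ranks and shows that, because the statistic is rank-based, the Log10(FPKM+1) transform used for plotting leaves ρ unchanged.

```python
import math

def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def log_fpkm(values):
    """Log10(FPKM + 1), the transform named in the figure legend."""
    return [math.log10(v + 1) for v in values]

# Hypothetical per-gene FPKM values for two replicates (illustration only).
rep1 = [0.0, 1.2, 5.0, 48.3, 310.0, 12.7]
rep2 = [0.1, 0.9, 11.0, 51.0, 295.4, 10.2]

rho_raw = spearman_rho(rep1, rep2)
rho_log = spearman_rho(log_fpkm(rep1), log_fpkm(rep2))
print(f"rho (raw FPKM) = {rho_raw:.4f}")
print(f"rho (log10(FPKM+1)) = {rho_log:.4f}")
```

The log transform therefore matters for how the scatter plot looks, not for the reported correlation value.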
Supplementary Figure 5 DNA and RNA sequencing data for 50,000 (50K) fibroblast replicates.
(a) Coverage distributions for Simul-seq libraries of the same individual. (b) Venn diagrams comparing SNV calls between the Simul-seq replicates. (c) Scatter plots of Log10(FPKM+1) gene measurements for 50K Simul-seq replicates (Spearman’s ρ=0.97). (d) Correlation between External RNA Controls Consortium (ERCC) spike-in control Log10 RNA concentrations versus the average Log10(RPKM+1) for Simul-seq (blue; Spearman’s ρ=0.97) and 50K Simul-seq (orange; Spearman’s ρ=0.96) fibroblast replicates (n=2/group). Note, zero values have been shifted to 1, and all ERCC transcripts are shown.
Supplementary Figure 6 Simul-seq replicate and tumor RNA quality control analysis.
(a) Distribution of normalized transcript coverage for RNA-seq and Simul-seq replicates performed on fibroblasts as well as Simul-seq data obtained for esophageal adenocarcinoma tissue isolated using laser capture microscopy (Simul-seq EAC). (b) Strand specificity of Simul-seq and RNA-seq samples. (c) The fraction of reads mapping to various genomic annotations for Simul-seq and RNA-seq samples. Note, an increased intronic read fraction combined with a similar intergenic read fraction in the Simul-seq EAC sample likely indicates increased intron retention and/or a higher proportion of unspliced RNA in this specimen.
Supplementary Figure 7 Targeted resequencing of KIF3B locus in esophageal adenocarcinoma patient samples.
(a) Histogram of the unique and unmapped Bowtie aligned reads obtained for 76 FFPE samples (50 tumors and 26 normals). The original sample (02-28923-C9) that was subjected to the Simul-seq protocol was included as a positive control. A single tumor-normal pair (00-18224-A2) displayed a substantially higher number of variant calls yet a lower number of uniquely mapped reads, suggesting that these samples harbored increased rates of PCR errors induced by low quality genomic DNA. Therefore, these samples were not included in somatic mutation analysis. (b) Validation of variant calls using pyrophosphate sequencing.
Supplementary Figure 8 Purification of recombinant wild-type and R293W mutant motor domains.
(a) Schematic of KIF3B protein, with motor domain and ATP binding region highlighted in blue and red, respectively. For biochemical assays, a region spanning the motor domain of KIF3B (amino acids 1-365) was cloned and recombinantly expressed with an N-terminal 6x-histidine tag (bottom). (b) Coomassie-stained gel of recombinant proteins pre- and post-induction with isopropyl β-D-1-thiogalactopyranoside (IPTG) as well as after Ni2+ affinity purification.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–8 and Supplementary Note. (PDF 1853 kb)
Supplementary Table 1
Simul-seq and control library read counts and mapping rates. (XLSX 12 kb)
Supplementary Table 2
Somatic SVs for Simul-seq EAC tumor genome (XLSX 35 kb)
Supplementary Table 3
Somatic, expressed gene fusions in Simul-seq EAC tumor genome (XLSX 9 kb)
Supplementary Table 4
VCF of somatic SNVs for Simul-seq EAC tumor genome (XLSX 2379 kb)
Supplementary Table 5
VCF of somatic indels for Simul-seq EAC tumor genome (XLSX 474 kb)
Supplementary Table 6
EAC tumor ASE analysis at heterozygous SNV positions in the normal genome (XLSX 7583 kb)
Supplementary Table 7
ASE of annotated tumor suppressor genes harboring damaging germline variants (XLSX 12 kb)
Supplementary Table 8
Simul-seq RNA and RNA-seq ERCC spike-in transcript quantification (XLSX 16 kb)
Supplementary Table 9
Genomic regions of KIF3B locus targeted for resequencing (XLSX 8 kb)
Supplementary Table 10
Primer sets used in KIF3B targeted resequencing (XLSX 11 kb)
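As an illustration of the kind of per-site test that could underlie an allele-specific expression table such as Supplementary Table 6, the sketch below applies an exact two-sided binomial test to RNA read counts at a single heterozygous SNV. This is an assumption made for illustration, not the procedure reported in the paper (the reference list cites ASEQ for allele-specific analysis), and the function name and read counts are hypothetical. Under the null hypothesis of balanced expression, each read is equally likely to carry either allele.

```python
from math import comb

def ase_binomial_p(ref_reads, alt_reads, p=0.5):
    """Exact two-sided binomial p-value for allelic imbalance at one site.

    Sums the probabilities of all outcomes that are no more likely than
    the observed reference-read count under Binomial(n, p).
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no coverage, no evidence of imbalance
    probs = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = probs[ref_reads]
    # Small relative tolerance guards against floating-point ties.
    total = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, total)

# Hypothetical read counts at two heterozygous SNVs (illustration only).
p_skewed = ase_binomial_p(18, 2)     # strong skew toward the reference allele
p_balanced = ase_binomial_p(10, 10)  # balanced expression
print(f"18 ref / 2 alt:   p = {p_skewed:.6f}")
print(f"10 ref / 10 alt:  p = {p_balanced:.6f}")
```

In a tumor with extensive copy-number change, a real analysis would also need to account for the DNA allele ratio at each site and correct for multiple testing; neither is attempted here.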
Cite this article
Reuter, J., Spacek, D., Pai, R. et al. Simul-seq: combined DNA and RNA sequencing for whole-genome and transcriptome profiling. Nat Methods 13, 953–958 (2016). https://doi.org/10.1038/nmeth.4028