Assessment of microRNA differential expression and detection in multiplexed small RNA sequencing data

  1. Marc E. Lenburg (1,2,4)

  1. Bioinformatics Graduate Program, Boston University, Boston, Massachusetts 02215, USA
  2. Section of Computational Biomedicine, Boston University Medical Center, Boston, Massachusetts 02118, USA
  3. Simmons Center for Interstitial Lung Disease and Department of Medicine, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania 15213, USA
  4. Department of Pathology and Laboratory Medicine, Boston University School of Medicine, Boston, Massachusetts 02118, USA
  5. Center for Genes, Environment and Health and Department of Medicine, National Jewish Health, Denver, Colorado 80206, USA
  6. Dana-Farber Cancer Institute and Harvard School of Public Health, Boston, Massachusetts 02215, USA
  7. Department of Medicine, University of Colorado School of Medicine, Aurora, Colorado 80045, USA

  Corresponding author: mlenburg{at}bu.edu

Abstract

Small RNA sequencing can provide an unprecedented level of detail about the microRNA transcriptome. The relatively high cost and low throughput of sequencing-based technologies can potentially be offset by the use of multiplexing. However, multiplexing involves a trade-off between an increased number of sequenced samples and a reduced number of reads per sample (i.e., lower depth of coverage). To assess the effect of different sequencing depths owing to multiplexing on microRNA differential expression and detection, we sequenced the small RNA of lung tissue samples collected in a clinical setting by multiplexing one, three, six, nine, or 12 samples per lane using the Illumina HiSeq 2000. As expected, the number of reads obtained per sample decreased as the number of samples in a multiplex increased. Furthermore, after normalization, replicate samples included in distinct multiplexes were highly correlated (R > 0.97). When detecting differential microRNA expression between groups of samples, microRNAs with average expression >1 read per million (RPM) had reproducible fold change estimates (signal to noise) independent of the degree of multiplexing. The number of microRNAs detected was strongly correlated with the log2 number of reads aligning to microRNA loci (R = 0.96). However, most additional microRNAs detected in samples with greater sequencing depth were in the range of expression that had lower fold change reproducibility. These findings elucidate the trade-off between increasing the number of samples in a multiplex and decreasing sequencing depth, and will aid in the design of large-scale clinical studies exploring microRNA expression and its role in disease.
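
The reads-per-million (RPM) threshold mentioned above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes RPM is computed against the total reads aligned to microRNA loci in each sample (the abstract does not state the denominator), and the function names, microRNA labels, and counts are hypothetical.

```python
from typing import Dict, List


def reads_per_million(counts: Dict[str, int]) -> Dict[str, float]:
    """Scale raw per-microRNA read counts to reads per million (RPM).

    Assumption: the denominator is the sum of reads over all microRNAs
    in the sample; the abstract does not specify the exact normalization.
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("sample has no aligned reads")
    return {mirna: count * 1_000_000 / total for mirna, count in counts.items()}


def above_mean_rpm(samples: List[Dict[str, int]], min_mean_rpm: float = 1.0) -> List[str]:
    """Return microRNAs whose average RPM across samples exceeds the cutoff.

    A microRNA absent from a sample is treated as 0 RPM in that sample.
    """
    rpm_per_sample = [reads_per_million(sample) for sample in samples]
    names = sorted(set().union(*rpm_per_sample))
    kept = []
    for name in names:
        mean_rpm = sum(rpm.get(name, 0.0) for rpm in rpm_per_sample) / len(rpm_per_sample)
        if mean_rpm > min_mean_rpm:
            kept.append(name)
    return kept


# Hypothetical counts: a deeply sequenced sample and a shallower one.
deep = {"miR-A": 900_000, "miR-B": 99_998, "miR-C": 2}
shallow = {"miR-A": 90_000, "miR-B": 10_000}

kept = above_mean_rpm([deep, shallow])
```

In this toy input, miR-C is detected only in the deeper sample and averages exactly 1 RPM, so it falls outside the >1 RPM range, mirroring the observation that the extra microRNAs found at greater depth tend to be lowly expressed.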

Footnotes

  • Received April 28, 2014.
  • Accepted November 7, 2014.

This article is distributed exclusively by the RNA Society for the first 12 months after the full-issue publication date (see http://rnajournal.cshlp.org/site/misc/terms.xhtml). After 12 months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
