Conserved non-AUG uORFs revealed by a novel regression analysis of ribosome profiling data

  1. Joel McManus1
  1. Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
  2. Computational Biology Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
  3. Illumina, Incorporated, Madison, Wisconsin 53719, USA
  4. These authors contributed equally to this work.

  • 5 Present address: FluNXT, Sanofi Pasteur, Cambridge, MA 02138, USA

  • Corresponding author: mcmanus{at}andrew.cmu.edu
    Abstract

    Upstream open reading frames (uORFs), located in transcript leaders (5′ UTRs), are potent cis-acting regulators of translation and mRNA turnover. Recent genome-wide ribosome profiling studies suggest that thousands of uORFs initiate with non-AUG start codons. Although intriguing, these non-AUG uORF predictions have been made without statistical control or validation; thus, the importance of these elements remains to be demonstrated. To address this, we took a comparative genomics approach to study AUG and non-AUG uORFs. We mapped transcript leaders in multiple Saccharomyces yeast species and applied a novel machine learning algorithm (uORF-seqr) to ribosome profiling data to identify statistically significant uORFs. We found that AUG and non-AUG uORFs are both frequently found in Saccharomyces yeasts. Although most non-AUG uORFs are found in only one species, hundreds have either conserved sequence or position within Saccharomyces. uORFs initiating with UUG are particularly common and are shared between species at rates similar to those of AUG uORFs. However, non-AUG uORFs are translated less efficiently than AUG uORFs and are less subject to removal via alternative transcription initiation under normal growth conditions. These results suggest that a subset of non-AUG uORFs may play important roles in regulating gene expression.
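
    To make the terminology concrete, the following minimal sketch enumerates candidate uORFs in a transcript leader: it scans for a start codon (AUG, or a near-cognate codon such as UUG that differs from AUG at one position) followed by an in-frame stop codon within the leader. This is not uORF-seqr and involves no ribosome profiling data or statistics; the function names, the example sequence, and the requirement that the stop codon lie inside the leader are all assumptions made for illustration.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def near_cognate_codons(start="AUG"):
    """Return every codon that differs from `start` at exactly one position."""
    codons = set()
    for i in range(3):
        for base in "ACGU":
            if base != start[i]:
                codons.add(start[:i] + base + start[i + 1:])
    return codons


def find_candidate_uorfs(leader, starts):
    """List (start_index, start_codon, end_index) for candidate uORFs.

    `leader` is the transcript leader (5' to 3'); DNA input is converted
    to RNA. A candidate is kept only if an in-frame stop codon occurs
    inside the leader (an illustrative simplification). `end_index` is
    exclusive and includes the stop codon.
    """
    leader = leader.upper().replace("T", "U")
    hits = []
    for i in range(len(leader) - 2):
        codon = leader[i:i + 3]
        if codon not in starts:
            continue
        # Walk downstream codon by codon, in frame with the start codon.
        for j in range(i + 3, len(leader) - 2, 3):
            if leader[j:j + 3] in STOP_CODONS:
                hits.append((i, codon, j + 3))
                break
    return hits


# Hypothetical leader sequence, for illustration only.
example_leader = "GGAUGAAAUAAGGUUGCCCUGACC"
aug_only = find_candidate_uorfs(example_leader, {"AUG"})
aug_and_uug = find_candidate_uorfs(example_leader, {"AUG", "UUG"})
```

    Passing `{"AUG"} | near_cognate_codons()` as `starts` would widen the scan to all nine single-mismatch codons. Candidates found this way are sequence matches only; deciding which are actually translated is the role the abstract assigns to the regression analysis of ribosome profiling data.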

    Footnotes

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.221507.117.

    • Freely available online through the Genome Research Open Access option.

    • Received February 7, 2017.
    • Accepted December 11, 2017.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
