Trends in Immunology
Review
Short open reading frame genes in innate immunity: from discovery to characterization
Section snippets
Gene annotation and sORFs
Beginning with the sequencing of the yeast Saccharomyces cerevisiae genome in the 1990s, the scientific community has shown considerable interest in comprehensive identification and annotation of protein-coding genes in eukaryotes. Early efforts focused on ATG-initiated ORFs capable of encoding a polypeptide of at least 100 amino acids [1,2]. ORFs that did not meet this length cut-off, short open reading frames (sORFs) (see Glossary), required additional evidence to merit a protein-coding
Sequence analysis
Discovering new SEPs begins with sequence analysis. The standard scanning model of translation initiation involves a preinitiation complex binding the RNA 5′-cap and traversing the transcript until reaching a Kozak sequence centered at the start codon AUG. There, the remaining translational machinery engages, and elongation of the polypeptide occurs. Once a stop codon is encountered, translation is terminated, and the ribosome dissociates from the transcript [15]. This model implies a simple
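As a rough illustration of the scanning model described above, the following Python sketch walks a transcript 5′ to 3′, starts translation at the first AUG, and stops at the first in-frame stop codon. It is a simplified sketch, not a method from the review: the function names `scan_first_orf` and `is_sorf` are hypothetical, Kozak context and non-AUG initiation are ignored, and the 100-codon threshold is assumed from the historical length cut-off mentioned earlier.

```python
# Minimal sketch of the scanning model: the first AUG encountered from the
# 5' end is used as the start codon, and translation ends at the first
# in-frame stop codon. Kozak context is deliberately ignored (assumption).

STOP_CODONS = {"UAA", "UAG", "UGA"}

# Assumed threshold: ORFs encoding fewer than 100 amino acids are treated
# as sORFs, following the historical annotation cut-off.
SORF_MAX_CODONS = 100


def scan_first_orf(transcript):
    """Return (start, end, n_codons) for the ORF at the first AUG.

    start and end are 0-based, end-exclusive coordinates that include the
    stop codon; n_codons counts sense codons only (stop excluded).
    Returns None if there is no AUG or no in-frame stop codon.
    """
    rna = transcript.upper().replace("T", "U")  # accept DNA or RNA input
    start = rna.find("AUG")
    if start == -1:
        return None
    # Step through the transcript one codon at a time from the start codon.
    for pos in range(start, len(rna) - 2, 3):
        if rna[pos:pos + 3] in STOP_CODONS:
            n_codons = (pos - start) // 3
            return start, pos + 3, n_codons
    return None


def is_sorf(n_codons):
    """True if the ORF falls below the assumed 100-codon cut-off."""
    return n_codons < SORF_MAX_CODONS


if __name__ == "__main__":
    orf = scan_first_orf("GGCACCATGGCTTAAGG")
    if orf is not None:
        start, end, n_codons = orf
        print(start, end, n_codons, is_sorf(n_codons))
```

A scan of this kind reports at most one ORF per transcript, so any alternative start site or downstream reading frame goes undetected.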
Candidate validation, approaches, and drawbacks
Translated sORF predictions based on ribosomal association likely misestimate the coding potential of transcripts and say nothing about the function of predicted SEPs (including whether they are functional at all). To confirm novel peptide production, candidate sORFs are typically validated via peptide tagging and microscopy or immunoprecipitation (IP). SEPs frequently influence phenotype by complexing with larger protein partners [16,17,70,71,72]; therefore, determining these partners
High-throughput validation
Individual, high-resolution characterization of SEPs will be an important part of correctly annotating the genome and characterizing SEP–protein interaction networks. However, the large number of putatively coding sORFs detected from Ribo-Seq and sequence analysis argues for the application of high-throughput methods to validate translation en masse. Broadly, there are two approaches: peptidomics via MS (Box 1) and genome editing with CRISPR-Cas (Box 2). In both cases the challenge is for the
An emerging class: bifunctional genes
Here we describe SEPs that were recently discovered and characterized in innate immune (and innate immune-derived) cell lines. We also note instances where the RNA itself is known to contribute to a phenotype independently of the SEP; this is the case in four of the five examples shown in Figure 2. Although the sample size is too small to make strong inferences, the high representation of these ‘coding-noncoding’ or ‘bifunctional’ [85,86] genes suggests that future studies of SEPs would do well
Concluding remarks
Nuanced biomolecular interrogation has allowed researchers to differentiate between SEP and RNA activity, and there are many databases containing thousands of sORFs and lncRNAs that are yet to be investigated (Table 2). Immunologists might find the study of this expanded proteome particularly fruitful. It is well established that the transcriptome is drastically changed under conditions of inflammatory stimuli. Furthermore, it is reported that multiple components of translation initiation
Acknowledgments
S.C. is supported by R01AI148413 from the National Institute of Allergy and Infectious Diseases and R35GM137801 from the National Institute of General Medical Sciences. E.M. is supported by T32HG012344 and in part by R35GM137801.
Declaration of interests
S.C. is a paid consultant to NextRNA Therapeutics. No interests are declared by E.M.
Glossary
- Dark proteome: understudied and under-characterized proteins and peptides, including those that arise from UTRs and noncoding RNAs.
- FASTQs: the standard short-read sequencing format for bioinformatic sequencing analysis.
- Homology-directed repair (HDR): repair of DNA breaks using a homologous template, allowing the insertion of genetic material.
- Lipopolysaccharide (LPS): a PAMP component of Gram-negative bacterial cell walls. In its purified form, it is commonly used as an inflammation-inducing ligand in
References (137)
The dark proteome: translation from noncanonical open reading frames
Trends Cell Biol.
(2022)
The biogenesis, functions, and challenges of circular RNAs
Mol. Cell
(2018)
The how and why of lncRNA function: an innate immune perspective
Biochim. Biophys. Acta Gene Regul. Mech.
(2020)
Ribosome profiling provides evidence that large noncoding RNAs do not encode proteins
Cell
(2013)
A non-AUG translational initiation in c-myc exon 1 generates an N-terminally distinct protein whose synthesis is disrupted in Burkitt’s lymphomas
Cell
(1988)
Translation initiation at non-AUG triplets in mammalian cells
J. Biol. Chem.
(1989)
A regression-based analysis of ribosome-profiling data reveals a conserved complexity to mammalian translation
Mol. Cell
(2015)
A peptide encoded by a putative lncRNA HOXB-AS3 suppresses colon cancer growth
Mol. Cell
(2017)
Transcriptome-wide measurement of translation by ribosome profiling
Methods
(2017)
Time-resolved proteomics extends ribosome profiling-based measurements of protein synthesis dynamics
Cell Syst.
(2017)
Dynamic regulation of a ribosome rescue pathway in erythroid cells and platelets
Cell Rep.
Assessment of GFP tag position on protein localization and growth fitness in yeast
J. Mol. Biol.
Addition of an EGFP-tag to the N-terminal of influenza virus M1 protein impairs its ability to accumulate in ND10
J. Virol. Methods
Pick a tag and explore the functions of your pet protein
Trends Biotechnol.
Incredible RNA: dual functions of coding and noncoding
Mol. Cells
cncRNAs: bi-functional RNAs with protein coding and non-coding functions
Semin. Cell Dev. Biol.
Mitochondrial protein interaction mapping identifies regulators of respiratory chain function
Mol. Cell
Endotoxin (LPS) stimulates 4E-BP1/PHAS-I phosphorylation in macrophages
J. Surg. Res.
eIF4E phosphorylation by MST1 reduces translation of a subset of mRNAs, but increases lncRNA translation
Biochim. Biophys. Acta Gene Regul. Mech.
A question of size: the eukaryotic proteome and the problems in defining it
Nucleic Acids Res.
Life with 6000 genes
Science
Classification and function of small open reading frames
Nat. Rev. Mol. Cell Biol.
Genome regulation by long noncoding RNAs
Annu. Rev. Biochem.
Long noncoding RNA AW112010 promotes the differentiation of inflammatory T cells by suppressing IL-10 expression through histone demethylation
J. Immunol.
lincRNA-Cox2 functions to regulate inflammation in alveolar macrophages during acute lung injury
J. Immunol.
The long non-coding RNA HOXB-AS3 regulates ribosomal RNA transcription in NPM1-mutated acute myeloid leukemia
Nat. Commun.
Many lncRNAs, 5’UTRs, and pseudogenes are translated and some are likely to express functional proteins
eLife
Revisiting sORFs: overcoming challenges to identify and characterize functional microproteins
FEBS J.
Short open reading frames (sORFs) and microproteins: an update on their identification and validation measures
J. Biomed. Sci.
The scanning mechanism of eukaryotic translation initiation
Annu. Rev. Biochem.
A micropeptide encoded by lncRNA MIR155HG suppresses autoimmune inflammation via modulating antigen presentation
Sci. Adv.
mTORC1 and muscle regeneration are regulated by the LINC00961-encoded SPAR polypeptide
Nature
Noncanonical translation initiation in eukaryotes
Cold Spring Harb. Perspect. Biol.
Leaky ribosomal scanning in mammalian genomes: significance of histone H4 alternative translation in vivo
Nucleic Acids Res.
Changes in global translation elongation or initiation rates shape the proteome via the Kozak sequence
Sci. Rep.
Deep transcriptome annotation enables the discovery and functional characterization of cryptic small proteins
eLife
Discovery of high-confidence human protein-coding genes and exons by whole-genome PhyloCSF helps elucidate 118 GWAS loci
Genome Res.
Whole-genome alignment and comparative annotation
Annu. Rev. Anim. Biosci.
PhyloCSF: a comparative genomics method to distinguish protein coding and non-coding regions
Bioinformatics
Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes
Genome Res.
Measuring the accuracy of genome-size multiple alignments
Genome Biol.
Detection of nonneutral substitution rates on mammalian phylogenies
Genome Res.
The human genome browser at UCSC
Genome Res.
The InterPro protein families and domains database: 20 years on
Nucleic Acids Res.
Efficient analysis of mammalian polysomes in cells and tissues using Ribo Mega-SEC
eLife
Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling
Science
Ribosome profiling reveals the what, when, where and how of protein synthesis
Nat. Rev. Mol. Cell Biol.
The ribosome profiling strategy for monitoring translation in vivo by deep sequencing of ribosome-protected mRNA fragments
Nat. Protoc.
Global mapping of translation initiation sites in mammalian cells at single-nucleotide resolution
Proc. Natl. Acad. Sci. U. S. A.
Quantitative profiling of initiating ribosomes in vivo
Nat. Methods