Review
The evolution of spliceosomal introns
Introduction
One of the greatest enigmas of eukaryotic genome evolution is the widespread existence of introns. Most genes in multicellular eukaryotes contain at least one intron, and many contain a dozen or more. In some lineages, mammals in particular, the size of individual introns substantially exceeds that of their surrounding exons. Intragenic noncoding DNA poses three significant problems for the organism: introns must be accurately spliced out of precursor mRNAs (pre-mRNAs) if proper proteins are to be produced; the magnitude of the genome-wide investment in nucleotides contained within introns must impose a metabolic cost at the DNA and RNA levels; and the burden of critical intron recognition sites increases the mutation rate to null alleles. How, then, did the eukaryotic genome arrive at the point at which ‘genes in pieces’ has become the norm?
Our review focuses on the predominant class of eukaryotic introns: the spliceosomal introns of nuclear-encoded protein genes. Such introns are transcribed along with their surrounding exons, and their removal from pre-mRNAs is performed within the nucleus by the spliceosomal complex. Involving five small nuclear RNAs (snRNAs) and ∼100 proteins, the spliceosome carries out a cohesive series of partially overlapping interactions that culminate in intron excision and exon joining.
Until recently, studies regarding the evolution of eukaryotic introns were largely focused on a single issue, their potential involvement in the ancient origin of genes by exon shuffling [1–6]. However, although many details regarding the origin of introns remain to be resolved, their approximate time of origin is no longer in doubt. None of the 100 or so sequenced prokaryotic genomes harbors the signature of current or past introns in protein-coding genes, whereas every basal eukaryotic lineage that has been examined contains at least some spliceosomal introns and/or orthologues of spliceosomal components found in higher eukaryotes [7,8,9•,10•]. Some protist genes contain as many introns as their orthologues in multicellular species, and there is considerable evidence for intron loss and gain in independent lineages [9•], consistent with the view that the distribution of introns reflects ongoing birth/death processes [11•].
Some have argued that this phylogenetic pattern might be an illusion, with selection for streamlined genomes having led to the loss of introns from all lineages of prokaryotes after the primordial set of genes had been produced by exon shuffling [1,3]. However, simple population-genetic principles suggest that the enormous population sizes of prokaryotes (probably well beyond 10^10 in most cases), combined with the weak selective disadvantage of intron-containing alleles, would have presented a significant barrier to the proliferation of introns [11•]. The physical positions of introns have also been cited as indirect evidence for their involvement in the creation of early genes [12–14], but the observed patterns are also consistent with the differential selective elimination of introns in sites that are most prone to conversion to null alleles [11•].
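The population-size argument above can be made concrete with a small calculation. The following is a minimal sketch, not the authors' own analysis: it assumes the standard diffusion approximation for the fixation probability of a new mutation in a haploid population, and the selective cost used in the usage note (10^-6) is a purely hypothetical value chosen for illustration. The function names are likewise illustrative.

```python
import math


def _log_expm1(x: float) -> float:
    """Return log(exp(x) - 1) for x > 0 without overflowing when x is large."""
    if x > 30.0:
        # exp(x) - 1 = exp(x) * (1 - exp(-x)), so the log splits into two safe terms.
        return x + math.log1p(-math.exp(-x))
    return math.log(math.expm1(x))


def log10_relative_fixation(pop_size: float, cost: float) -> float:
    """log10 of the fixation probability of a new deleterious allele,
    relative to the neutral expectation 1/N.

    Assumes the haploid diffusion approximation
        u = (exp(2c) - 1) / (exp(2Nc) - 1),
    where c > 0 is the selective disadvantage and N is the population size
    (taken here to equal the effective size, which is an assumption).
    """
    if pop_size <= 1 or cost <= 0:
        raise ValueError("pop_size must exceed 1 and cost must be positive")
    log_u = _log_expm1(2.0 * cost) - _log_expm1(2.0 * pop_size * cost)
    return (log_u + math.log(pop_size)) / math.log(10.0)


if __name__ == "__main__":
    hypothetical_cost = 1e-6  # illustrative value only, not taken from the text
    for n in (1e4, 1e6, 1e8, 1e10):
        print(f"N = {n:.0e}: log10(u / neutral) = "
              f"{log10_relative_fixation(n, hypothetical_cost):.3g}")
```

Under these assumptions, an allele with this cost behaves almost neutrally when N is 10^4, because the product of N and the cost is far below 1. At N = 10^10 the same allele has a fixation probability thousands of orders of magnitude below the neutral expectation, which is the sense in which large populations act as a barrier to even weakly deleterious inserts.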
Thus, the most reasonable explanation of the data is that spliceosomal introns were never present in prokaryotes but rose to prominence soon after the origin of the stem eukaryote. This issue having been largely resolved, attention is now shifting to the adaptive significance and differential proliferation of today's introns. Achieving a general understanding of these issues will require a deeper appreciation of the basic natural history of splicing and transcript processing and of the dynamics and stability of introns from a population-genetics perspective.
A group II intron ancestry?
The leading hypothesis for the origin of spliceosome-dependent introns is that they and the spliceosome itself were derived from one or more self-splicing group II introns [15–18]. Such introns have been found in the chromosomes and plasmids of numerous eubacteria [19••] as well as in one member of the archaea (S Zimmerly, personal communication). (Technically speaking, however, some of the eubacterial group II elements are not introns at all, in that they lie between rather than
Twin spliceosomes?
Intrigue with respect to the origin of the spliceosome increased with the discovery that before the origin of multicellularity, eukaryotes evolved not just one, but two, spliceosomal complexes [39,40]. The major spliceosome deals with the classic set of introns that usually start with GT and end with AG, whereas the minor spliceosome processes another set, many of which start with AT and end with AC. Members of the latter class of introns have come to be known as U12-dependent introns, in
The evolutionary demography of introns
A burst of intron proliferation may have followed the invention of the spliceosome, which would have mitigated the negative consequences of recognizable intragenic inserts. However, it is also clear that all species are subject to ongoing stochastic gains and losses of spliceosomal introns [48,49,50•,51•]. Although we can provide only two quantitative estimates for the rate of intron turnover, these are remarkably consistent. In a comparison of the genomes of the congeneric nematodes C.
Genomic assimilation and accommodation
Once introns had spread throughout the eukaryotic genome, the new norm of genes-in-pieces apparently provided a novel physical structure for adaptive exploitation in the grand tradition of descent with modification. In today's eukaryotes, almost all of the major events in the production of mature mRNAs are highly coupled with exon definition and/or intron splicing [73••]. For example, interactions between various splicing factors and elongation factors promote transcription elongation [74,75••].
Nonsense-mediated decay and intron proliferation
One of the most significant services provided by introns is their indirect role in the cell's surveillance for inappropriate transcripts, in particular those harboring premature termination codons (PTCs). PTC-containing mRNAs arise from the direct transcription of inherited mutant alleles as well as from errors in transcription or splicing of otherwise functional alleles. Although such transcripts are expected to lead to potentially deleterious truncated proteins, eukaryotes protect themselves
Conclusions
Much of the early intrigue with the evolution of introns was focused on their potential role in the origin of the primordial set of protein-coding genes and on their present adaptive significance (or lack thereof). Advances in phylogenetic analysis, made possible by high-throughput genomic sequencing, now suggest that as many as two refined spliceosomes, and the introns serviced by them, were well established in the stem eukaryote. On the other hand, the near 100 fully sequenced
Acknowledgements
We are very grateful to N Kane, L Maquat, S Mount, L Rieseberg, A Stoltzfus, and S Zimmerly for helpful comments during the preparation of this manuscript. The work was supported by National Institutes of Health funding to M Lynch and by an NSF IGERT fellowship in Evolution, Development, and Genomics to A Richardson.
References and recommended reading
Papers of particular interest, published within the annual period of review, have been highlighted as:
• of special interest
•• of outstanding interest
References (115)
How were introns inserted into nuclear genes?
Trends Genet
(1989)
The recent origins of introns
Curr Opin Genet Dev
(1991)
The recent origins of spliceosomal introns revisited
Curr Opin Genet Dev
(1998)
Trichomonas vaginalis possesses a gene encoding the essential spliceosomal component, PRP8
Mol Biochem Parasitol
(1999)
On the origin of RNA splicing and introns
Cell
(1985)
The generality of self-splicing RNA: relationship to nuclear mRNA splicing
Cell
(1986)
Self-splicing group II and nuclear pre-mRNA introns: how similar are they?
Trends Biochem Sci
(1990)
Intron phylogeny: a new hypothesis
Trends Genet
(1991)
The ins and outs of group II introns
Trends Genet
(2001)
A catalytically active group II intron domain 5 can function in the U12-dependent spliceosome
Mol Cell
(2002)
mRNA splicing and autocatalytic introns: distant cousins or the products of chemical determinism
Cell
Transposition and exon shuffling by group II intron RNA molecules in pieces
J Mol Biol
Conserved sequences in a class of rare eukaryotic nuclear introns with non-consensus splice sites
J Mol Biol
Evolutionary fates and origins of U12-type introns
Mol Cell
Molecular evolution: recent cases of spliceosomal intron gain?
Curr Biol
The role of introns in evolution
FEBS Lett
Localization and stability of introns spliced from the Pem homeobox gene
J Biol Chem
SINEs and LINEs: the art of biting the hand that feeds you
Curr Opin Cell Biol
A variable intron distribution in globin genes of Chironomus: evidence for recent intron gain
Gene
A conserved mRNA export machinery coupled to pre-mRNA splicing
Cell
Terminal exon definition occurs cotranscriptionally and promotes termination of RNA polymerase II
Mol Cell
Exonic splicing enhancers: mechanism of action, diversity and role in human genetic diseases
Trends Biochem Sci
The regulation of splice-site selection, and its role in human disease
Am J Hum Genet
Splicing regulation as a potential genetic modifier
Trends Genet
MAGOH interacts with a novel RNA-binding protein
Genomics
Cross-intron bridging interactions in the yeast commitment complex are conserved in mammals
Cell
A perfect message: RNA surveillance and nonsense-mediated decay
Cell
Nonsense-mediated mRNA decay in Saccharomyces cerevisiae
Gene
mRNA quality control: marking the message for life or death
Curr Biol
Stop making nonSense: the C. elegans smg genes
Trends Genet
Genes in pieces: were they ever together?
Nature
Selfish DNA and the origin of introns
Nature
The exon theory of genes
Cold Spring Harb Symp Quant Biol
Amitochondriate amoebae and the evolution of DNA-dependent RNA polymerase II
Proc Natl Acad Sci USA
The chaperonin genes of jakobid and jakobid-like flagellates: implications for eukaryotic evolution
Mol Biol Evol
A spliceosomal intron in Giardia lamblia
Proc Natl Acad Sci USA
Intron evolution as a population-genetic process
Proc Natl Acad Sci USA
Intron phase correlations and the evolution of the intron/exon structure of genes
Proc Natl Acad Sci USA
Toward a resolution of the introns early/late debate: only phase zero introns are correlated with the structure of ancient proteins
Proc Natl Acad Sci USA
Footprints of primordial introns on the eukaryotic genome
Trends Genet
Compilation and analysis of group II intron insertions in bacterial genomes: evidence for retroelement behavior
Nucleic Acids Res
Splicing of precursors to mRNAs by the spliceosomes
Structure and activities of group II introns
Annu Rev Biochem
Intrinsic metal binding by a spliceosomal RNA
Nat Struct Biol
Metal ion catalysis during group II intron self-splicing: parallels with the spliceosome
Genes Dev
Metal-ion coordination by U6 small nuclear RNA contributes to catalysis in the spliceosome
Nature
Trans-activation of group II intron splicing by nuclear U5 snRNA
Nature
Trans-splicing group II introns in plant mitochondria: the complete set of cis-arranged homologs in ferns, fern allies, and a hornwort
RNA
Group I and group II ribozymes as RNPs: clues to the past and guides to the future
Retrotransposition of a bacterial group II intron
Nature