Genome Evolution: We are not so special

New sequence data from choanoflagellates improves our understanding of the genetic changes that occurred along the branch of the evolutionary tree that gave rise to animals.
Zachary R Lewis, Casey W Dunn (corresponding author)
Yale University, United States

The most recent common ancestor of animals lived more than 600 million years ago, so we cannot sequence its genome. Nevertheless, we can identify a minimal set of gene families that were present in this long-dead ancestor by comparing genomic data across animals and their closest relatives. In addition to being interesting in its own right, this helps us identify which genes were gained and lost before the origin of animals and, likewise, which genes were gained and lost as animals diversified.

The challenge, though, is that there are strong sampling biases that can compromise these analyses. Genome sequencing has focused on species that are medically relevant, experimentally tractable, and easy to sequence (del Campo et al., 2014). Left unaddressed, these biases can frustrate efforts to reconstruct the genomes of our ancient ancestors. Take, for example, the simple case of three groups of organisms called O, C and M, and a gene that originated along the branch that gave rise to C and M (Figure 1A). If more sequencing effort has been invested in group M than in group C, the gene is more likely to be found in group M than in group C. And if the gene is found in M but not in C, even though it is present in both, then it will appear that the gene is specific to group M and younger than it actually is.
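The three-group example can be sketched as a small simulation. This is a toy illustration rather than the method of any study discussed here: the per-species detection probability (`p_detect`), the species counts passed to `simulate`, and the function names are all assumptions made for the sake of the example. The gene is truly present in every C and M species, and the inferred origin is simply the most recent common ancestor of the groups in which the gene was detected.

```python
import random

def infer_origin(groups_with_gene):
    """Place the gain on the most recent common ancestor of the groups
    in which the gene was detected, for the toy tree ((C, M), O)."""
    groups = set(groups_with_gene)
    if not groups:
        return None  # gene never detected
    if "O" in groups:
        return "root" if len(groups) > 1 else "O stem"
    if groups == {"C", "M"}:
        return "C+M stem"
    return f"{next(iter(groups))} stem"

def detected(n_species, p_detect, rng):
    """A gene counts as found in a group if at least one sampled species yields it."""
    return any(rng.random() < p_detect for _ in range(n_species))

def simulate(n_c, n_m, p_detect=0.3, trials=10_000, seed=0):
    """Fraction of trials in which each origin is inferred.

    The gene truly arose on the C+M stem, so any other answer is an
    artifact of incomplete detection. p_detect is a hypothetical value.
    """
    rng = random.Random(seed)
    counts = {}
    for _ in range(trials):
        found = [g for g, n in (("C", n_c), ("M", n_m)) if detected(n, p_detect, rng)]
        origin = infer_origin(found)
        counts[origin] = counts.get(origin, 0) + 1
    return {origin: count / trials for origin, count in counts.items()}

if __name__ == "__main__":
    print("biased sampling: ", simulate(n_c=2, n_m=40))
    print("uniform sampling:", simulate(n_c=21, n_m=21))
```

Under these assumptions, sparse sampling of group C frequently pushes the inferred gain onto the M stem, whereas even sampling almost always recovers the true C+M origin.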

Figure 1. Genes lost and gained.

(A) Example of biased sampling (left): although a gene was gained (first green line) before group C and group M diverged, biased sampling means that it is only detected in group M, which leads to the incorrect inference (second green line) that the gene arose after the groups diverged. With uniform sampling (right), the gene gain is correctly inferred (third green line). Groups C, M and O could be Choanoflagellata, Metazoa and Outgroups. (B) Cladogram showing the evolutionary relationships of the clades in question, with the Choanoflagellata stem shown in red and the Metazoa stem shown in blue. Choanozoa refers to the clade Choanoflagellata + Metazoa (Brunet and King, 2017). (C) The number of gene groups gained (y-axis) plotted against the number of gene groups lost (x-axis) along various branches leading to the nodes shown in panel B, based on the data in four studies (Fairclough et al., 2013; Paps and Holland, 2018; Richter et al., 2018; Suga et al., 2013). The gray dashed line indicates equal gene group gain and loss. Note that the four studies use different methodologies to define groupings of genes. Data and analyses are available at https://github.com/dunnlab/gene_inventory_2018 (Lewis and Dunn, 2018; copy archived at https://github.com/elifesciences-publications/gene_inventory_2018).

Now, in eLife, Daniel Richter, Parinaz Fozouni, Michael Eisen and Nicole King report their work to reduce sequencing bias by sampling many more genes in the sister group to animals, the choanoflagellates (Richter et al., 2018). They generated transcriptomic data for 19 species of choanoflagellates and analyzed them in combination with previously published metazoan (animal), choanoflagellate and other eukaryote genomes. In addition to presenting new data, Richter et al. – who are based at UC Berkeley, UCSF, the Gladstone Institutes and Station Biologique de Roscoff – applied new probabilistic methods to minimize the chance that a gene family would be predicted to be present in a taxonomic group based on the spurious assignment of unrelated genes to the same family.

In related work at the universities of Essex and Oxford, Jordi Paps and Peter Holland have reported an interesting analysis of gene gain and loss in early animal evolution (Paps and Holland, 2018). The studies agree on some key points. Both recovered a relatively large number of gene family gains along the ‘animal stem’ (the branch of the evolutionary tree that uniquely gives rise to animals; shown in blue in Figure 1B). However, while Paps and Holland estimate that the number of gains was much higher than the number of losses, which they interpreted as evidence for an accelerated expansion of gene families along the Metazoa stem, Richter et al. estimate approximately equal numbers of gains and losses (Figure 1C). This means that Richter et al. find evidence for accelerated churn of gene families along the Metazoa stem, not a burst of expansion. This incongruence likely reflects the difference in sampling: Paps and Holland analyzed two choanoflagellate species, whereas Richter et al. analyzed 21.

Another difference is that Paps and Holland did not estimate gene gain and loss along the Choanoflagellata stem, whereas Richter et al. did. This revealed more gene family gain and less gene family loss along the Choanoflagellata stem than along the Metazoa stem (Figure 1C). So, Richter et al. do find a burst of gene family expansion, but in Choanoflagellata rather than Metazoa. It will be critical to further test the findings of both studies with improved sampling of other closely related groups, which could change how the gains and losses are apportioned to these two stems.
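The distinction between expansion and churn comes down to where a branch sits relative to the line of equal gain and loss in Figure 1C. The sketch below is one possible way to express that, not a calculation taken from either study: the function names, the `tolerance` threshold, and the counts in the usage lines are hypothetical placeholders and do not correspond to any published estimate.

```python
def summarize_branch(gains, losses):
    """Net change and total turnover of gene families along one branch."""
    return {"net": gains - losses, "turnover": gains + losses}

def describe(gains, losses, tolerance=0.25):
    """Label a branch by how far it lies from the gains == losses line.

    tolerance is an arbitrary, hypothetical cutoff on the net change
    expressed as a fraction of total turnover.
    """
    turnover = gains + losses
    if turnover == 0:
        return "no change"
    net_fraction = (gains - losses) / turnover
    if net_fraction > tolerance:
        return "net expansion"
    if net_fraction < -tolerance:
        return "net contraction"
    return "churn (gains roughly balance losses)"

if __name__ == "__main__":
    # Placeholder counts chosen only to exercise the function.
    print(describe(gains=1500, losses=300))
    print(describe(gains=1000, losses=1000))
```

Two branches with the same number of gains can therefore tell different stories: many gains with few losses suggests expansion, while many gains matched by many losses suggests turnover without net growth.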

The results presented by Richter et al. agree in important ways with other recent work (King et al., 2008; Suga et al., 2013). These analyses reveal that the genetic changes on the Metazoa stem included the evolution of new intercellular signaling pathways (Fairclough et al., 2013) and the integration of new ligands and receptors into intracellular pathways that were already present (such as the Hippo pathway; Sebé-Pedrós et al., 2012). Other changes included the expansion of a core set of transcription factors (de Mendoza et al., 2013), and increased cis-regulatory complexity (Sebé-Pedrós et al., 2016).

Comparative gene content analyses refine our understanding of what makes metazoans unique, and in the process we are learning about the underappreciated biology of our close non-metazoan relatives (Sebé-Pedrós et al., 2017). For instance, Richter et al. identified homologs of Toll-like receptors in most choanoflagellates. These genes had been thought to be an animal-specific innovation for innate immunity. Future research could investigate whether these genes have immune-like roles in non-animals.

It is impossible to know how special animals really are without also knowing something about our closest relatives. The more we learn about these relatives, the less special we seem to be.

Article and author information

Author details

  1. Zachary R Lewis

    Zachary R Lewis is in the Department of Ecology and Evolutionary Biology, Yale University, New Haven, United States

    Competing interests
    No competing interests declared
    ORCID: 0000-0003-0160-4722
  2. Casey W Dunn

    Casey W Dunn is in the Department of Ecology and Evolutionary Biology, Yale University, New Haven, United States

    For correspondence
    casey.dunn@yale.edu
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-0628-5150

Publication history

  1. Version of Record published: July 3, 2018 (version 1)

Copyright

© 2018, Lewis et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.


  1. Zachary R Lewis
  2. Casey W Dunn
(2018)
Genome Evolution: We are not so special
eLife 7:e38726.
https://doi.org/10.7554/eLife.38726