
The tree of blobs of a species network: identifiability under the coalescent

Journal of Mathematical Biology

Abstract

Inference of species networks from genomic data under the Network Multispecies Coalescent Model is currently severely limited by heavy computational demands. It also remains unclear how complicated networks can be for consistent inference to be possible. As a step toward inferring a general species network, this work considers its tree of blobs, in which non-cut edges are contracted to nodes, so only tree-like relationships between the taxa are shown. An identifiability theorem, that most features of the unrooted tree of blobs can be determined from the distribution of gene quartet topologies, is established. This depends upon an analysis of gene quartet concordance factors under the model, together with a new combinatorial inference rule. The arguments for this theoretical result suggest a practical algorithm for tree of blobs inference, to be fully developed in a subsequent work.
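The contraction that defines the tree of blobs can be illustrated on a small example. The following is a minimal sketch of the graph-theoretic definition only, not the inference procedure the abstract announces: it treats the network as an undirected graph, finds its cut edges with a depth-first search, and contracts each component joined by non-cut edges to a single node. The function name `tree_of_blobs`, the edge-list input format, and the toy network are assumptions made for illustration; degree-2 nodes are not suppressed.

```python
from collections import defaultdict

def tree_of_blobs(edges):
    """Contract every non-cut edge of an undirected graph.

    Returns (blob_of, tree_edges): a blob label for each node, and the
    cut edges rewritten as pairs of blob labels.
    """
    adj = defaultdict(list)
    for i, (u, v) in enumerate(edges):
        adj[u].append((v, i))
        adj[v].append((u, i))

    # Step 1: find cut edges (bridges) with a DFS low-link computation.
    disc, low, bridges = {}, {}, set()

    def dfs(u, parent_edge):
        disc[u] = low[u] = len(disc)
        for v, i in adj[u]:
            if i == parent_edge:
                continue
            if v in disc:
                low[u] = min(low[u], disc[v])
            else:
                dfs(v, i)
                low[u] = min(low[u], low[v])
                if low[v] > disc[u]:
                    bridges.add(i)  # no cycle passes through edge i

    for node in list(adj):
        if node not in disc:
            dfs(node, None)

    # Step 2: a blob is a connected component once cut edges are deleted.
    blob_of, n_blobs = {}, 0
    for start in adj:
        if start in blob_of:
            continue
        blob_of[start] = n_blobs
        stack = [start]
        while stack:
            u = stack.pop()
            for v, i in adj[u]:
                if i not in bridges and v not in blob_of:
                    blob_of[v] = n_blobs
                    stack.append(v)
        n_blobs += 1

    # Step 3: the cut edges, relabelled by blob, form the tree of blobs.
    tree_edges = sorted(
        tuple(sorted((blob_of[u], blob_of[v])))
        for i, (u, v) in enumerate(edges)
        if i in bridges
    )
    return blob_of, tree_edges

# Hypothetical toy network: a 3-cycle u-v-w with taxa a, b attached,
# joined by the cut edge w-y to a cherry on taxa c, d.
example_edges = [
    ("a", "u"), ("b", "v"), ("u", "v"), ("v", "w"),
    ("w", "u"), ("w", "y"), ("y", "c"), ("y", "d"),
]
blob_of, tree_edges = tree_of_blobs(example_edges)
```

In this toy network the cycle on `u`, `v`, `w` collapses to one node, so only the tree-like relationships among the taxa `a`, `b`, `c`, `d` remain.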





Acknowledgements

This work was supported by the National Science Foundation, Grant 2051760, awarded to JR and EA. EA and JM were also supported by NIGMS Institutional Development Award (IDeA), Grant 2P20GM103395. HB was supported by the Moore-Simons Project on the Origin of the Eukaryotic Cell, Simons Foundation Grant 735923LPI (DOI: https://doi.org/10.46714/735923LPI) awarded to Andrew J. Roger and Edward Susko.

Author information


Corresponding author

Correspondence to John A. Rhodes.


Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Allman, E.S., Baños, H., Mitchell, J.D. et al. The tree of blobs of a species network: identifiability under the coalescent. J. Math. Biol. 86, 10 (2023). https://doi.org/10.1007/s00285-022-01838-9


  • DOI: https://doi.org/10.1007/s00285-022-01838-9
