Abstract
Inferring a species tree from multi-locus gene trees whose topologies are incongruent due to incomplete lineage sorting (ILS) is currently performed by consensus (supertree) methods, parsimony analysis (minimizing deep coalescence), or statistical methods. Statistical approaches, however, are computationally expensive, and the accuracy of the approximation heuristics used in consensus and parsimony analysis varies considerably. We propose COSPEDSpec, a novel two-stage species tree estimation method that combines the consensus and parsimony approaches. The first stage uses our earlier couplet supertree technique, COSPEDTree [2][3]; the second stage applies a greedy heuristic to refine a non-binary (unresolved) supertree into a binary species tree. In each iteration, the heuristic reduces the number of extra lineages between the current species tree and the input gene trees, thus modeling ILS as the cause of gene tree / species tree incongruence. The time and space complexity of COSPEDSpec is lower than or equal to that of the reference methods. For large-scale datasets with hundreds of taxa and thousands of gene trees, COSPEDSpec produces species trees with lower branch dissimilarities at much lower computational cost.
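The extra-lineage criterion mentioned in the abstract can be illustrated with a small sketch. It is not the COSPEDSpec implementation, whose details the abstract does not give. It assumes rooted trees written as nested Python tuples of taxon names, one allele per species, and the usual deep-coalescence count: for each species-tree cluster, the number of maximal gene-tree clades contained in it, minus one. The function names (`clusters`, `lineages_leaving`, `extra_lineages`, `refine_root`) are hypothetical, and `refine_root` resolves only a polytomy at the root by trying every pair of children.

```python
from itertools import combinations


def clusters(tree):
    """Return the leaf set (as a frozenset) of every node in a nested-tuple tree."""
    found = []

    def visit(node):
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = frozenset().union(*[visit(child) for child in node])
        found.append(leaves)
        return leaves

    visit(tree)
    return found


def lineages_leaving(gene_tree, cluster):
    """Count the maximal gene-tree clades whose leaves all lie inside `cluster`."""

    def visit(node):
        # Returns (leaf set of node, number of maximal clades inside cluster).
        if isinstance(node, str):
            return frozenset([node]), (1 if node in cluster else 0)
        results = [visit(child) for child in node]
        leaves = frozenset().union(*[r[0] for r in results])
        if leaves <= cluster:
            return leaves, 1  # the whole clade coalesces inside the cluster
        return leaves, sum(r[1] for r in results)

    return visit(gene_tree)[1]


def extra_lineages(species_tree, gene_tree):
    """Sum, over species-tree clusters, the gene lineages beyond the first."""
    return sum(
        max(lineages_leaving(gene_tree, c) - 1, 0) for c in clusters(species_tree)
    )


def refine_root(species_tree, gene_trees):
    """Greedily resolve a root polytomy (illustrative assumption, not the paper's rule).

    At each step, join the pair of children that gives the lowest total
    number of extra lineages over all gene trees.
    """
    children = list(species_tree)
    while len(children) > 2:
        best = None
        for i, j in combinations(range(len(children)), 2):
            rest = [c for k, c in enumerate(children) if k not in (i, j)]
            candidate = tuple(rest + [(children[i], children[j])])
            score = sum(extra_lineages(candidate, g) for g in gene_trees)
            if best is None or score < best[0]:
                best = (score, candidate)
        children = list(best[1])
    return tuple(children)


if __name__ == "__main__":
    genes = [(("A", "B"), "C"), (("A", "B"), "C"), (("A", "C"), "B")]
    print(refine_root(("A", "B", "C"), genes))
```

In this toy input, two of the three gene trees group A with B, so joining A and B first leaves the fewest extra lineages. Scoring every pair against every gene tree is quadratic in the number of children per step, which is acceptable for a sketch but says nothing about the complexity of the published method.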
References
Bayzid, M.S., Warnow, T.: Estimating optimal species trees from incomplete gene trees under deep coalescence. Journal of Computational Biology 19(6), 591–605 (2012)
Bhattacharyya, S., Mukherjee, J.: Cospedtree: Couplet supertree by equivalence partitioning of taxa set and dag formation. IEEE/ACM Trans. Comp. Biol. Bioinfo. 1, 1 (2014), doi:10.1109/TCBB.2014.2366778 (preprints)
Bhattacharyya, S., Mukhopadhyay, J.: Couplet supertree by equivalence partitioning of taxa set and dag formation. In: 5th ACM Conference on Bioinformatics, Computational Biology and Health Informatics (ACM-BCB), pp. 259–268 (2014)
Chaudhary, R., Bansal, M.S., Wehe, A., Fernández-Baca, D., Eulenstein, O.: igtp: a software package for large-scale gene tree parsimony analysis. BMC Bioinformatics 11(574), 1–7 (2010)
Chaudhary, R., Burleigh, J.G., Fernández-Baca, D.: Inferring species trees from incongruent multi-copy gene trees using the robinson-foulds distance. Algorithms for Molecular Biology 8(1(28)), 1–12 (2013)
DeGiorgio, M., Degnan, J.H.: Fast and consistent estimation of species trees using supermatrix rooted triples. Mol. Biol. Evol. 27(3), 552–569 (2010)
Degnan, J.H., Rosenberg, N.A.: Gene tree discordance, phylogenetic inference and the multispecies coalescent. Trends in Ecology and Evolution 24(6), 332–340 (2009)
Durand, D., Halldorsson, B.V., Vernot, B.: A hybrid micro-macroevolutionary approach to gene tree reconstruction. Journal of Computational Biology 13(2), 320–335 (2006)
Edwards, S.V., Liu, L., Pearl, D.K.: High-resolution species trees without concatenation. PNAS 104(14), 5936–5941 (2007)
Heled, J., Drummond, A.J.: Bayesian inference of species trees from multilocus data. Mol. Biol. Evol. 27(3), 570–580 (2010)
Helmkamp, L.J., Jewett, E.M., Rosenberg, N.A.: Improvements to a class of distance matrix methods for inferring species trees from gene trees. Journal of Computational Biology 19(6), 632–649 (2012)
Jewett, E.M., Rosenberg, N.A.: iglass: An improvement to the glass method for estimating species trees from gene trees. Journal of Computational Biology 19(3), 293–315 (2012)
Kubatko, L.S., Carstens, B.C., Knowles, L.: Stem: species tree estimation using maximum likelihood for gene trees under coalescence. Bioinformatics 25(7), 971–973 (2009)
Kuo, C.H., Wares, J.P., Kissinger, J.C.: The apicomplexan whole-genome phylogeny: An analysis of incongruence among gene trees. Mol. Biol. Evol. 25(12), 2689–2698 (2008)
Larget, B.R., Kotha, S.K., Dewey, C.N., Ané, C.: Bucky: Gene tree / species tree reconciliation with bayesian concordance analysis. Bioinformatics 26(22), 2910–2911 (2010)
Liu, L., Yu, L., Edwards, S.V.: A maximum pseudo-likelihood approach for estimating species trees under the coalescent model. BMC Evolutionary Biology 10(302), 1–18 (2010)
Liu, L., Yu, L., Pearl, D.K.: Maximum tree: a consistent estimator of the species tree. J. Math. Biol. 60(1), 95–106 (2010)
Liu, L., Yu, L., Pearl, D.K., Edwards, S.V.: Estimating species phylogenies using coalescence times among sequences. Syst. Biol. 58(5), 468–477 (2009)
Maddison, W.P., Knowles, L.L.: Inferring phylogeny despite incomplete lineage sorting. Syst. Biol. 55(1), 21–30 (2006)
Mirarab, S., Reaz, R., Bayzid, M.S., Zimmermann, T., Swenson, M.S., Warnow, T.: Astral: genome-scale coalescent-based species tree estimation. Bioinformatics 30(17), i541–i548 (2014)
Mossel, E., Roch, S.: Incomplete lineage sorting: Consistent phylogeny estimation from multiple loci. IEEE/ACM Trans. Comp. Biol. Bioinfo. 7(1), 166–171 (2010)
Nakhleh, L.: Computational approaches to species phylogeny inference and gene tree reconciliation. Trends in Ecology and Evolution 28(12), 719–728 (2013)
Robinson, D.F., Foulds, L.R.: Comparison of phylogenetic trees. Mathematical Biosciences 53(1-2), 131–147 (1981)
Rokas, A., Williams, B., King, N., Carroll, S.: Genome-scale approaches to resolving incongruence in molecular phylogenies. Nature 425, 798–804 (2003)
Salichos, L., Rokas, A.: Inferring ancient divergences requires genes with strong phylogenetic signals. Nature 497(7449), 327–333 (2013)
Song, S., Liu, L., Edwards, S.V., Wu, S.: Resolving conflict in eutherian mammal phylogeny using phylogenomics and the multispecies coalescent model. Proc. Natl. Acad. Sci. USA 109(37), 14942–14947 (2012)
Springer, M.S., Burk-Herrick, A., Meredith, R., Eizirik, E., Teeling, E., O’Brien, S.J., Murphy, W.J.: The adequacy of morphology for reconstructing the early history of placental mammals. Syst. Biol. 56(4), 673–684 (2007)
Stamatakis, A.: Raxml-vi-hpc: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22(21), 2688–2690 (2006)
Sukumaran, J., Holder, M.T.: Dendropy: a python library for phylogenetic computing. Bioinformatics 26(12), 1569–1571 (2010)
Than, C., Nakhleh, L.: Species tree inference by minimizing deep coalescences. PLOS Computational Biology 5(9), 1–12 (2009)
Yang, J., Warnow, T.: Fast and accurate methods for phylogenomic analyses. BMC Bioinformatics 12(9), 1–12 (2011)
Yu, Y., Warnow, T., Nakhleh, L.: Algorithms for mdc-based multi-locus phylogeny inference: Beyond rooted binary gene trees on single alleles. Journal of Computational Biology 18(11), 1543–1559 (2011)
Zimmermann, T., Mirarab, S., Warnow, T.: Bbca: Improving the scalability of *beast using random binning. BMC Genomics 15(suppl. 6)(S11), 1–9 (2014)
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Bhattacharyya, S., Mukhopadhyay, J. (2015). Couplet Supertree Based Species Tree Estimation. In: Harrison, R., Li, Y., Măndoiu, I. (eds) Bioinformatics Research and Applications. ISBRA 2015. Lecture Notes in Computer Science, vol 9096. Springer, Cham. https://doi.org/10.1007/978-3-319-19048-8_5
DOI: https://doi.org/10.1007/978-3-319-19048-8_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19047-1
Online ISBN: 978-3-319-19048-8
eBook Packages: Computer Science (R0)