Abstract
Weak branch support in a gene tree suggests that the signal in the sequence data is insufficient to resolve a particular branching order. One approach to reducing this uncertainty takes the topology of the species tree into account. Under a maximum parsimony model, the best resolution of the weak branches is the binary tree that minimizes the cost of duplications, transfers, and losses. However, this problem is NP-hard, and exact algorithms are limited to small, weakly supported areas.
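As a minimal sketch of the parsimony criterion described above, the snippet below scores candidate binary resolutions by a weighted sum of duplications, transfers, and losses and keeps every resolution that ties for the minimum. The event weights, the `EventCounts` type, and the function names are illustrative assumptions, not values or APIs taken from the paper or from Notung. The event counts are supplied by hand here; obtaining them requires reconciling each resolution with the species tree, which is the hard part this sketch leaves out.

```python
from dataclasses import dataclass

# Illustrative event weights (assumptions, not values from the paper).
COST_DUP, COST_TRANSFER, COST_LOSS = 1.5, 3.0, 1.0


@dataclass(frozen=True)
class EventCounts:
    """Hypothetical container for the events inferred for one binary resolution."""
    duplications: int
    transfers: int
    losses: int


def dtl_cost(events: EventCounts) -> float:
    """Weighted sum of duplications, transfers, and losses."""
    return (COST_DUP * events.duplications
            + COST_TRANSFER * events.transfers
            + COST_LOSS * events.losses)


def best_resolutions(candidates: dict[str, EventCounts]) -> list[str]:
    """Return the names of all candidates that attain the minimum cost."""
    best = min(dtl_cost(e) for e in candidates.values())
    return sorted(name for name, e in candidates.items() if dtl_cost(e) == best)


# Hand-written event counts for three hypothetical resolutions.
candidates = {
    "A": EventCounts(duplications=1, transfers=0, losses=2),
    "B": EventCounts(duplications=0, transfers=1, losses=0),
    "C": EventCounts(duplications=2, transfers=0, losses=0),
}
print(best_resolutions(candidates))
```

Returning every tied candidate rather than a single winner mirrors the idea of reporting a set of equally good resolutions.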
We present an exact algorithm and several heuristic methods to resolve weak or non-binary gene trees given an undated species tree. These methods generate a set of optimal, binary resolutions that are temporally feasible, as well as event histories corresponding to each binary resolution. We compared the accuracy and runtime of these methods on simulated and biological datasets. The best of these heuristics provide a close approximation to the event cost of the exact method and are much faster in practice. Surprisingly, a heuristic based on duplications and losses provides a good initialization for tree search methods, even when transfers are present. Comparing event costs with Robinson-Foulds (RF) distance, we observed that the two measures capture very different information and are poorly correlated.
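For readers unfamiliar with the RF distance mentioned above, here is a minimal sketch of one common variant: the rooted, clade-based distance, which counts the clades present in exactly one of two trees. The nested-tuple tree encoding and the function names are assumptions made for illustration, and the variant shown is not necessarily the one used in the paper.

```python
def clades(tree):
    """Collect the leaf set below every internal node of a nested-tuple tree."""
    found = set()

    def leaves(node):
        # A non-tuple node is a leaf label.
        if not isinstance(node, tuple):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    leaves(tree)
    return found


def rooted_rf(tree_a, tree_b):
    """Number of clades found in exactly one of the two trees."""
    return len(clades(tree_a) ^ clades(tree_b))


t1 = (("A", "B"), ("C", "D"))
t2 = (("A", "C"), ("B", "D"))
print(rooted_rf(t1, t2))
```

This measure depends only on topology, whereas event cost also depends on how the gene tree maps onto the species tree.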
All methods are implemented in a new release of Notung, a Java-based, cross-platform software package for reconciling and resolving gene trees. Notung is available at: http://www.cs.cmu.edu/~durand/Notung.
Acknowledgments
We thank Annette McLeod for help with figures.
Funding
This material is based upon work supported by the National Science Foundation under Grant No. DBI-1262593. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Lai, H., Stolzer, M., Durand, D. (2017). Fast Heuristics for Resolving Weakly Supported Branches Using Duplication, Transfers, and Losses. In: Meidanis, J., Nakhleh, L. (eds) Comparative Genomics. RECOMB-CG 2017. Lecture Notes in Computer Science, vol 10562. Springer, Cham. https://doi.org/10.1007/978-3-319-67979-2_16
DOI: https://doi.org/10.1007/978-3-319-67979-2_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67978-5
Online ISBN: 978-3-319-67979-2
eBook Packages: Computer Science (R0)