Fast Heuristics for Resolving Weakly Supported Branches Using Duplication, Transfers, and Losses

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 10562)

Abstract

Weak branch supports in a gene tree suggest that the signal in sequence data is insufficient to resolve a particular branching order. One approach to reduce uncertainty takes the topology of the species tree into account. Under a maximum parsimony model, the best resolution of the weak branches is the binary tree that minimizes the cost of duplications, transfers, and losses. However, this problem is NP-hard, and the exact algorithm is limited to small, weakly supported areas.
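As a rough illustration of the parsimony objective described above, the cost of a reconciliation can be written as a weighted count of events, and the sought resolution is the binary tree whose cheapest reconciliation is minimal. The notation here is an assumption made for illustration and is not taken from the paper: $G$ is the gene tree with weak branches collapsed, $S$ is the species tree, $\mathcal{B}(G)$ is the set of binary resolutions of $G$, $\mathcal{R}(T, S)$ is the set of temporally feasible reconciliations of $T$ with $S$, $n_D$, $n_T$, $n_L$ count duplications, transfers, and losses, and $c_D$, $c_T$, $c_L$ are user-chosen event weights.

```latex
\[
  \operatorname{cost}(R) = c_D\, n_D(R) + c_T\, n_T(R) + c_L\, n_L(R),
  \qquad
  T^{*} = \operatorname*{arg\,min}_{T \in \mathcal{B}(G)} \;
          \min_{R \in \mathcal{R}(T, S)} \operatorname{cost}(R)
\]
```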

We present an exact algorithm and several heuristic methods to resolve weak or non-binary gene trees given an undated species tree. These methods generate a set of optimal, binary resolutions that are temporally feasible, as well as event histories corresponding to each binary resolution. We compared the accuracy and runtime of these methods on simulated and biological datasets. The best of these heuristics provide a close approximation to the event cost of the exact method and are much faster in practice. Surprisingly, a heuristic based on duplications and losses provides a good initialization for tree searching methods, even when transfers are present. Comparing event costs with RF distance, we observed that the two measures of distance capture very different information and are poorly correlated.
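For readers unfamiliar with the RF (Robinson-Foulds) distance mentioned above, the following is a minimal sketch of one common rooted variant, which counts the clusters (leaf sets below internal nodes) found in exactly one of two trees. The nested-tuple tree encoding and the function names `clusters` and `rf_distance` are assumptions made for this illustration; the abstract does not say which variant or implementation the authors used.

```python
from typing import FrozenSet, Set, Tuple, Union

# Hypothetical encoding: a leaf is a string, an internal node is a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def clusters(tree: Tree) -> Set[FrozenSet[str]]:
    """Return the non-trivial clusters (leaf sets of internal nodes) of a rooted tree."""
    found: Set[FrozenSet[str]] = set()

    def leaves(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    all_leaves = leaves(tree)
    # The root cluster is shared by every tree on the same leaf set, so drop it.
    found.discard(all_leaves)
    return found


def rf_distance(a: Tree, b: Tree) -> int:
    """Rooted RF distance: number of clusters present in exactly one tree."""
    return len(clusters(a) ^ clusters(b))


if __name__ == "__main__":
    t1 = ((("a", "b"), "c"), "d")
    t2 = ((("a", "c"), "b"), "d")
    print(rf_distance(t1, t2))
```

Because this measure looks only at topology, two resolutions can be close in RF distance yet differ in event cost, which is consistent with the poor correlation reported above.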

All methods are implemented in a new release of Notung, a Java-based, cross-platform software package for reconciling and resolving gene trees. Notung is available at: http://www.cs.cmu.edu/~durand/Notung.

Acknowledgments

We thank Annette McLeod for help with figures.

Funding

This material is based upon work supported by the National Science Foundation under Grant No. DBI-1262593. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Author information

Corresponding author

Correspondence to Dannie Durand.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Lai, H., Stolzer, M., Durand, D. (2017). Fast Heuristics for Resolving Weakly Supported Branches Using Duplication, Transfers, and Losses. In: Meidanis, J., Nakhleh, L. (eds) Comparative Genomics. RECOMB-CG 2017. Lecture Notes in Computer Science, vol 10562. Springer, Cham. https://doi.org/10.1007/978-3-319-67979-2_16

  • DOI: https://doi.org/10.1007/978-3-319-67979-2_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-67978-5

  • Online ISBN: 978-3-319-67979-2

  • eBook Packages: Computer Science, Computer Science (R0)
