Abstract
A rogue taxon in a collection of phylogenetic trees is one whose position varies drastically from tree to tree. The presence of such taxa can greatly reduce the resolution of the consensus tree (e.g., the majority-rule or strict consensus) for a collection. The reduced consensus approach aims to identify and eliminate rogue taxa to produce more informative consensus trees. Given a collection of phylogenetic trees over the same leaf set, the goal is to find a set of taxa whose removal maximizes the number of internal edges in the consensus tree of the collection. We show that this problem is NP-hard for strict and majority-rule consensus. We give a polynomial-time algorithm for reduced strict consensus when the maximum degree of the strict consensus of the original trees is bounded. We describe exact integer linear programming formulations for computing reduced strict, majority-rule, and loose consensus trees. In experimental tests, our exact solutions improved over heuristic methods on several problem instances.
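To make the optimization problem concrete, the following is a minimal sketch of reduced strict consensus by exhaustive search. It is not the paper's algorithm or its integer linear programming formulation. It assumes rooted trees, each encoded as a set of clusters (the leaf sets below internal nodes), and it counts the nontrivial clusters shared by all restricted trees as the internal edges of the strict consensus. The function names (`restrict`, `strict_consensus`, `best_reduced_strict`) and the toy trees are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def restrict(clusters, keep):
    """Restrict a rooted tree, given as a set of clusters, to the taxa in `keep`."""
    out = set()
    for c in clusters:
        r = frozenset(c) & keep
        # Keep only nontrivial clusters: at least two taxa, not the whole leaf set.
        if 2 <= len(r) < len(keep):
            out.add(r)
    return out

def strict_consensus(trees, keep):
    """Nontrivial clusters present in every tree after restriction to `keep`."""
    return set.intersection(*(restrict(t, keep) for t in trees))

def best_reduced_strict(trees, taxa):
    """Try every removal set (exponential time) and return the smallest one
    that maximizes the number of clusters in the strict consensus."""
    taxa = frozenset(taxa)
    best_removed, best_cons = frozenset(), strict_consensus(trees, taxa)
    # Leave at least three taxa; fewer cannot support a nontrivial cluster.
    for k in range(1, len(taxa) - 2):
        for removed in combinations(sorted(taxa), k):
            cons = strict_consensus(trees, taxa - frozenset(removed))
            if len(cons) > len(best_cons):
                best_removed, best_cons = frozenset(removed), cons
    return best_removed, best_cons

# Toy input: the same caterpillar tree on a, b, c, d, with taxon r placed
# at the root in one tree and next to a in the other.
t1 = {frozenset("ab"), frozenset("abc"), frozenset("abcd")}
t2 = {frozenset("ar"), frozenset("abr"), frozenset("abcr")}
removed, cons = best_reduced_strict([t1, t2], "abcdr")
print(sorted(removed), sorted(sorted(c) for c in cons))
```

On this toy input the strict consensus over all five taxa has no internal edges, while removing `r` recovers the clusters {a, b} and {a, b, c}. The search visits every subset of taxa, so it is usable only on very small leaf sets, in line with the hardness result stated above.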
Supported in part by National Science Foundation grants DEB-0830012 and CCF-106029.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Deepak, A., Dong, J., Fernández-Baca, D. (2012). Identifying Rogue Taxa through Reduced Consensus: NP-Hardness and Exact Algorithms. In: Bleris, L., Măndoiu, I., Schwartz, R., Wang, J. (eds) Bioinformatics Research and Applications. ISBRA 2012. Lecture Notes in Computer Science, vol 7292. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30191-9_9
DOI: https://doi.org/10.1007/978-3-642-30191-9_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30190-2
Online ISBN: 978-3-642-30191-9
eBook Packages: Computer Science (R0)