Abstract
The fundamental Hennigian principle, grouping solely on synapomorphy, is seldom used in modern phylogenetics. In this paper, we apply this principle in reanalyzing five datasets comprising 197 complete plastid genomes (plastomes). We focused on plastomes because plastome-based DNA sequence data have gained dramatic popularity in molecular systematics during the last decade. We show that pattern-cladistic analyses based on complete plastid genome sequences can successfully resolve affinities between plant taxa while simplifying both the genomic and analytical frameworks of phylogenetic studies. We developed “Matrix to Newick” (M2N), a program that represents the standard molecular alignment of plastid genomes directly in the form of trees or relationships. Thus, massive plastome-based DNA sequence data can be successfully represented in a relational form rather than as a standard molecular alignment. Applying methods of median supertree construction (the Average Consensus method is used as an example in this study) or Maximum Parsimony analysis to relational representations of plastome sequence data may help systematists avoid the complicated assumption-based frameworks of Maximum Likelihood or Bayesian phylogenetics that are most commonly used today in analyses of massive plastid sequence data. We also found that significant amounts of pure genomic information that typically accompany the majority of current plastid phylogenomic studies can be effectively dropped by systematists if they focus on pattern-cladistic or relational analyses of plastome-based molecular data. The proposed pattern-cladistic approach is a powerful and straightforward heuristic alternative to modern plastome-based phylogenetics.
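To make the idea of a relational representation concrete, the sketch below turns each column of a small alignment into a rooted Newick tree. It is an illustration only and is not the M2N program: the function names `column_to_newick` and `alignment_to_newick`, the polarization of states against a single outgroup sequence, and the treatment of gaps and ambiguity codes as uninformative are all assumptions made for this example.

```python
from collections import defaultdict

VALID_STATES = set("ACGT")


def column_to_newick(taxa, column, outgroup):
    """Represent one alignment column as a rooted Newick tree.

    Assumption: the outgroup's state is treated as plesiomorphic. Taxa that
    share a different state are grouped as a putative synapomorphy; all
    other taxa attach directly to the root.
    """
    root_state = column[taxa.index(outgroup)].upper()
    if root_state not in VALID_STATES:
        return None  # column cannot be polarized in this sketch

    groups = defaultdict(list)
    ungrouped = []
    for taxon, state in zip(taxa, column):
        state = state.upper()
        if state == root_state or state not in VALID_STATES:
            ungrouped.append(taxon)  # same as outgroup, gap, or ambiguous
        else:
            groups[state].append(taxon)

    clades = []
    for state in sorted(groups):
        members = groups[state]
        if len(members) >= 2:
            clades.append("(" + ",".join(members) + ")")
        else:
            ungrouped.extend(members)  # a state seen once groups nothing

    if not clades:
        return None  # uninformative column
    return "(" + ",".join(ungrouped + clades) + ");"


def alignment_to_newick(alignment, outgroup):
    """Return one Newick tree per informative column of the alignment."""
    taxa = list(alignment)
    length = len(alignment[outgroup])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all sequences must have the same length")
    trees = []
    for i in range(length):
        column = [alignment[t][i] for t in taxa]
        tree = column_to_newick(taxa, column, outgroup)
        if tree is not None:
            trees.append(tree)
    return trees


if __name__ == "__main__":
    example = {
        "out": "AAAA",
        "t1": "ACGA",
        "t2": "ACGT",
        "t3": "AAGT",
        "t4": "AAAT",
    }
    for tree in alignment_to_newick(example, "out"):
        print(tree)
```

The resulting list of trees is the kind of input that a supertree or consensus procedure could then summarize; that step is not shown here.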
References
Assis LCS (2015) Homology assessment in parsimony and model-based analyses: two sides of the same coin. Cladistics 31:315–320
Barrett CF, Baker WJ, Comer JR, Conran JG, Lahmeyer SC, Leebens-Mack JH, Li J, Lim GS, Mayfield-Jones DR, Perez L (2016) Plastid genomes reveal support for deep phylogenetic relationships and extensive rate variation among palms and other commelinid monocots. New Phytol 209:855–870
Baum BR (1989) PHYLIP: phylogeny inference package Version 3.2. Q Rev Biol 64:539–541
Baum BR (1992) Combining trees as a way of combining data sets for phylogenetic inference and the desirability of combining gene trees. Taxon 41:3–10
Bininda-Emonds ORP, Sanderson MJ (2001) Assessment of the accuracy of matrix representation with parsimony analysis supertree construction. Syst Biol 50:565–579
Brower AVZ (2000) Evolution is not a necessary assumption of cladistics. Cladistics 16:143–154
Brower AVZ (2019) Background knowledge: the assumptions of pattern cladistics. Cladistics 35:717–731
Castresana J (2000) Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol 17:540–552
Chen YP, Zhao F, Paton AJ, Sunojkumar P, Gao L-M, Xiang C-L (2022) Plastome sequences fail to resolve shallow level relationships within the rapidly radiated genus Isodon (Lamiaceae). Front Plant Sci 13:985488
Creevey CJ, McInerney JO (2009) Trees from trees: construction of phylogenetic supertrees using Clann. In: Posada D (ed) Bioinformatics for DNA sequence analysis. Methods in molecular biology, vol 537. Humana Press, Totowa, pp 139–161
Darriba D, Taboada GL, Doallo R, Posada D (2012) jModelTest 2: more models, new heuristics and parallel computing. Nat Methods 9:772
De Soete G, DeSarbo WS, Carroll JD (1985) Optimal variable weighting for hierarchical clustering—an alternating least-squares algorithm. J Classif 2:173–192
Degnan JH, DeGiorgio M, Bryant D, Rosenberg NA (2009) Properties of consensus methods for inferring species trees from gene trees. Syst Biol 58:35–54
DeSarbo WS, Carroll JD, Clark LA, Green PE (1984) Synthesized clustering—a method for amalgamating alternative clustering bases with differential weighting of variables. Psychometrika 49:57–78
Edgar RC (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32:1792–1797
Edgar RC (2010) Quality measures for protein alignment benchmarks. Nucleic Acids Res 38:2145–2153
Farris JS (1983) The logical basis of phylogenetic analysis. In: Platnick NI, Funk V (eds) Advances in cladistics, 2. Proceedings of the 2nd meeting of the Willi Hennig society; Ann Arbor, Mich., USA, Oct. 1–4; 1981. Columbia University Press, New York, pp. 7–36
Felsenstein J (1993) PHYLIP (Phylogeny Inference Package) Version 3.5c. Distributed by the author. Department of Genetics, University of Washington, Seattle, USA. https://csbf.stanford.edu/phylip/
Felsenstein J (2004) Inferring phylogenies. Sinauer Associates Inc, Sunderland
Fitch WM (1966) An improved method of testing for evolutionary homology. J Mol Biol 16:9–16
Gatesy J, Sloan DB, Warren JM, Baker RH, Simmons MP, Springer MS (2019) Partitioned coalescence support reveals biases in species-tree methods and detects gene trees that determine phylogenomic conflicts. Mol Phylogenet Evol 139:106539
Goloboff PA, Farris JS, Nixon KC (2008) TNT, a free program for phylogenetic analysis. Cladistics 24:774–786
Goncalves DJ, Simpson BB, Ortiz EM, Shimizu GH, Jansen RK (2019) Incongruence between gene trees and species trees and phylogenetic signal variation in plastid genes. Mol Phylogenet Evol 138:219–232
Gouy M, Guindon S, Gascuel O (2010) SeaView version 4: a multiplatform graphical user interface for sequence alignment and phylogenetic tree building. Mol Biol Evol 27:221–224
Graybeal A (1998) Is it better to add taxa or characters to a difficult phylogenetic problem? Syst Biol 47:9–17
Hartigan JA (1975) Clustering Algorithms. John Wiley and Sons, New York
Hennig W (1966) Phylogenetic systematics. University of Illinois Press, Urbana
Huang Y, Fan L, Huang J, Zhou G, Chen X, Chen J (2022) Plastome phylogenomics of Aucuba (Garryaceae). Front Genet 13:753719
Huelsenbeck JP, Ronquist F (2001) MrBayes: Bayesian inference of phylogenetic trees. Bioinformatics 17:754–755
Hull DL (1967) The metaphysics of evolution. Br J Hist Sci 3:309–337
Jombart T, Kendall M, Almagro-Garcia J, Colijn C (2017) treespace: statistical exploration of landscapes of phylogenetic trees. Mol Ecol Resour 17:1385–1392
Jukes TH, Cantor CR (1969) Evolution of protein molecules. In: Munro HN (ed) Mammalian protein metabolism. Academic Press, New York, pp 21–132
Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS (2017) ModelFinder: fast model selection for accurate phylogenetic estimates. Nat Methods 14:587–589
Katoh K, Standley DM (2013) MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol 30:772–780
Kitching IJ, Forey PL, Humphries CJ, Williams DM (1998) Cladistics: the theory and practice of parsimony analysis, 2nd ed. Systematics Association publications (Book 11). Oxford University Press, Oxford, UK
Lapointe FJ, Cucumel G (1997) The average consensus procedure: a combination of weighted trees containing identical or overlapping sets of taxa. Syst Biol 46:306–312
Lapointe FJ, Levasseur C (2004) Everything you always wanted to know about the average consensus, and more. In: Bininda-Emonds ORP (ed) Phylogenetic supertrees: combining information to reveal the Tree of Life. Computational biology series, vol 4. Kluwer Academic Publishers, Dordrecht, pp 87–105
Lewis PO (2001) A likelihood approach to estimating phylogeny from discrete morphological character data. Syst Biol 50:913–925
Maddison WP, Maddison DR (2021) Mesquite: a modular system for evolutionary analysis. Version 3.70. http://www.mesquiteproject.org
Mavrodiev EV (2016) Dealing with propositions not with the characters: the ability of three-taxon statement analysis to recognize groups based solely on ‘reversals’ under the maximum-likelihood criteria. Aust Syst Bot 29:119–125
Mavrodiev EV, Madorsky A (2012) TAXODIUM Version 1.0: a simple way to generate uniform and fractionally weighted three-item matrices from various kinds of biological data. PLoS ONE 7:e48813
Mavrodiev EV, Dell C, Schroder L (2017) A laid-back trip through the Hennigian forests. PeerJ 5:e3578
Mavrodiev EV, Williams DM, Ebach MC (2019) On the typology of relations. Evol Biol 46:71–89
Michener CD, Sokal RR (1957) A quantitative approach to a problem in classification. Evolution 11:130–162
Miller MA, Pfeiffer W, Schwartz T (2010) Creating the CIPRES Science Gateway for inference of large phylogenetic trees. In: Pierce M (ed) Proceedings of the gateway computing environments workshop (GCE), 14 Nov 2010, New Orleans, pp 1–8
Minh BQ, Schmidt HA, Chernomor O, Schrempf D, Woodhams MD, Von Haeseler A, Lanfear R (2020) IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol Biol Evol 37:1530–1534
Morrison DA, Morgan MJ, Kelchner SA (2015) Molecular homology and multiple-sequence alignment: an analysis of concepts and practice. Aust Syst Bot 28:46–62
Mossel E, Roch S (2010) Incomplete lineage sorting: consistent phylogeny estimation from multiple loci. IEEE/ACM Trans Comput Biol Bioinform 7:166–171
Namgung J, Do HDK, Kim C, Choi HJ, Kim JH (2021) Complete chloroplast genomes shed light on phylogenetic relationships, divergence time, and biogeography of Allioideae (Amaryllidaceae). Sci Rep 11:1–13
Needleman SB, Wunsch CD (1970) A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol 48:443–453
Nelson G (1979) Cladistic analysis and synthesis - principles and definitions, with a historical note on Adanson’s Familles des Plantes (1763–1764). Syst Zool 28:1–21
Nelson G (2011) Resemblance as Evidence of Ancestry. Zootaxa 2946:137–141
Nelson G, Platnick NI (1991) Three-taxon statements—a more precise use of parsimony? Cladistics 7:351–366
O’Rourke F (2004) Aristotle and the Metaphysics of Evolution. Rev Metaphys 58:3–59
Patterson C (1982) Homology in classical and molecular biology. Mol Biol Evol 5:603–625
Platnick NI (1979) Philosophy and the transformation of cladistics. Syst Zool 28:537–546
Platnick NI (1993) Character optimization and weighting - differences between the standard and three-taxon approaches to phylogenetic inference. Cladistics 9:267–272
Ragan MA (1992a) Phylogenetic inference based on matrix representation of trees. Mol Phylogenet Evol 1:53–58
Ragan MA (1992b) Matrix representation in reconstructing phylogenetic relationships among the eukaryotes. BioSystems 28:47–55
Rambaut A, Drummond AJ (2009) Tracer v. 1.6. http://beast.bio.ed.ac.uk/
Rambaut A, Drummond AJ (2018) FigTree v. 1.4. Molecular evolution, phylogenetics and epidemiology. University of Edinburgh, Edinburgh. http://tree.bio.ed.ac.uk/software/figtree/
Rannala B, Yang ZH (1996) Probability distribution of molecular evolutionary trees: a new method of phylogenetic inference. J Mol Evol 43:304–311
Ronquist F, Teslenko M, Van Der Mark P, Ayres DL, Darling A, Höhna S, Larget B, Liu L, Suchard MA, Huelsenbeck JP (2012) MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst Biol 61:539–542
Samigullin T, Logacheva M, Terentieva E, Degtjareva G, Pimenov M, Valiejo-Roman C (2022) Plastid phylogenomic analysis of Tordylieae tribe (Apiaceae, Apioideae). Plants 11:709
Sankoff D, Morel C, Cedergren RJ (1973) Evolution of 5S RNA and non-randomness of base replacement. Nature New Biol 245:232–234
Sokal RR (1986) Phenetic taxonomy—theory and methods. Annu Rev Ecol Evol Syst 17:423–442
Soltis DE, Soltis PS (2004) Amborella not a “basal angiosperm”? not so fast. Am J Bot 91:997–1001
Stamatakis A (2006) RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690
Stamatakis A (2014) RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30:1312–1313
Swofford DL (2002) PAUP*. Phylogenetic analysis using parsimony (*and other methods). Sinauer Associates Inc, Sunderland
Talavera G, Castresana J (2007) Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Syst Biol 56:564–577
Thompson JD, Plewniak F, Poch O (1999) BAliBASE: a benchmark alignment database for the evaluation of multiple alignment programs. Bioinformatics 15:87–88
Townsend JP (2007) Profiling phylogenetic informativeness. Syst Biol 56:222–231
Townsend JP, Lopez-Giraldez F (2010) Optimal selection of gene and ingroup taxon sampling for resolving phylogenetic relationships. Syst Biol 59:446–457
Tremblay F (2013) Nicolai Hartmann and the metaphysical foundation of phylogenetic systematics. Biol Theory 7:56–68
Tremblay F (2020) Nikolai Lossky’s evolutionary metaphysics of reincarnation. Sophia 59:733–753
Tuffley C, Steel M (1997) Links between maximum likelihood and maximum parsimony under a simple model of site substitution. Bull Math Biol 59:581–607
Wagner ND, Volf M, Hörandl E (2021) Highly diverse shrub willows (Salix L.) share highly similar plastomes. Front Plant Sci 12:662715
Watson HC, Kendrew JC (1961) Comparison between the amino-acid sequences of sperm whale myoglobin and of human hemoglobin. Nature 190:670–672
Wei R, Zhang X-C (2020) Phylogeny of Diplazium (Athyriaceae) revisited: resolving the backbone relationships based on plastid genomes and phylogenetic tree space analysis. Mol Phylogenet Evol 143:106699
Williams DM (1994) Combining trees and combining data. Taxon 43:449–453
Williams DM (1996) Characters and cladograms. Taxon 45:275–283
Williams DM (2004) Supertrees, components and three-item data. In: Bininda-Emonds ORP (ed) Phylogenetic supertrees: Combining information to reveal the Tree of Life. Springer-Kluwer Academic Publisher, Dordrecht, The Netherlands, pp 389–408
Williams DM, Ebach MC (2006) The data matrix. Geodiversitas 28:409–420
Williams DM, Ebach MC (2008) Foundations of systematics and biogeography. Springer, New York
Williams DM, Siebert DJ (2000) Characters, homology and three-item analysis. In: Scotland RW, Pennington TR (eds) Homology and systematics: coding characters for phylogenetic analysis Systematics Association Special Volume (Book 58). Taylor and Francis, London, pp 183–208
Williams DM, Ebach MC, Wheeler QD (2010) Beyond belief: The steady resurrection of phenetics. In: Williams DM, Knapp S (eds) Beyond cladistics: The branching of a paradigm. University of California Press, Berkeley, California, USA, pp 169–197
Williams DM, Ebach MC (2020) Cladistics: A guide to biological classification, 3rd ed, Systematics Association Special Volume (Book 88). Cambridge University Press, Cambridge, UK
Xia X (2013) DAMBE5: a comprehensive software package for data analysis in molecular biology and evolution. Mol Biol Evol 30:1720–1728
Zhang C, Rabiee M, Sayyari E, Mirarab S (2018) ASTRAL-III: polynomial time species tree reconstruction from partially resolved gene trees. BMC Bioinform 19:15–30
Zhang X, Sun Y, Landis JB, Lv Z, Shen J, Zhang H, Lin N, Li L, Sun J, Deng T (2020) Plastome phylogenomic study of Gentianeae (Gentianaceae): widespread gene tree discordance and its association with evolutionary rate heterogeneity of plastid genes. BMC Plant Biol 20:1–15
Acknowledgements
The authors thank Prof. Richard Buggs (School of Biological and Chemical Sciences, Queen Mary University of London, UK) for his helpful notes and comments on an early version of the article. We thank Dr. David M. Williams (Natural History Museum, London, UK) and Prof. Malte C. Ebach (University of New South Wales and the Australian Museum, Sydney, AU) for helpful discussions. We also thank Dr. Williams for bringing the study of Nelson (1979) to our attention. We are grateful to two anonymous reviewers (especially Reviewer 2) for their insightful comments, which helped to improve the article's content.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
About this article
Cite this article
Mavrodiev, E.V., Madorsky, A. On Pattern-Cladistic Analyses Based on Complete Plastid Genome Sequences. Acta Biotheor 71, 22 (2023). https://doi.org/10.1007/s10441-023-09475-5