On Pattern-Cladistic Analyses Based on Complete Plastid Genome Sequences

  • Regular Article
Acta Biotheoretica

Abstract

The fundamental Hennigian principle, grouping solely on synapomorphy, is seldom used in modern phylogenetics. In this paper, we apply this principle in reanalyzing five datasets comprising 197 complete plastid genomes (plastomes). We focus on plastomes because plastome-based DNA sequence data have gained dramatic popularity in molecular systematics during the last decade. We show that pattern-cladistic analyses based on complete plastid genome sequences can successfully resolve affinities between plant taxa while simplifying both the genomic and analytical frameworks of phylogenetic studies. We developed “Matrix to Newick” (M2N), a program that represents the standard molecular alignment of plastid genomes directly in the form of trees or relationships. Massive plastome-based DNA sequence data can thus be represented in a relational form rather than as a standard molecular alignment. Applying median supertree construction methods (the Average Consensus method is used as an example in this study) or Maximum Parsimony analysis to relational representations of plastome sequence data may help systematists avoid the complicated assumption-based frameworks of Maximum Likelihood or Bayesian phylogenetics, which are most commonly used today in analyses of massive plastid sequence data. We also found that a significant amount of the purely genomic information that typically accompanies most current plastid phylogenomic studies can be dropped by systematists who focus on pattern-cladistic or relational analyses of plastome-based molecular data. The proposed pattern-cladistic approach is a powerful and straightforward heuristic alternative to modern plastome-based phylogenetics.
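The idea of representing an alignment in relational form can be illustrated with a short Python sketch. This is a hypothetical illustration only and should not be read as the authors' M2N program: the function names, the choice of a single outgroup to polarize each column, and the handling of gaps and ambiguous bases (left ungrouped) are all assumptions made for this example. Each alignment column in which two or more ingroup taxa, but not all of them, share a state that differs from the outgroup is written as one Newick tree.

```python
from typing import Dict, List, Optional

VALID = set("ACGT")  # assumption: any other symbol (gap, N, ?) is uninformative


def column_to_newick(column: Dict[str, str], outgroup: str) -> Optional[str]:
    """Turn one alignment column into a Newick tree, or None if uninformative.

    Taxa sharing a state that differs from the outgroup state are grouped;
    all remaining taxa are left unresolved at the base of the tree.
    """
    root_state = column[outgroup].upper()
    if root_state not in VALID:
        return None
    ingroup = [t for t in column if t != outgroup]
    groups: Dict[str, List[str]] = {}
    for taxon in ingroup:
        state = column[taxon].upper()
        if state in VALID and state != root_state:
            groups.setdefault(state, []).append(taxon)
    # Keep only groups that say something about ingroup relationships.
    clades = [sorted(m) for m in groups.values() if 2 <= len(m) < len(ingroup)]
    if not clades:
        return None
    grouped = {t for clade in clades for t in clade}
    loose = sorted(t for t in ingroup if t not in grouped)
    parts = [outgroup] + loose + ["(" + ",".join(c) + ")" for c in sorted(clades)]
    return "(" + ",".join(parts) + ");"


def alignment_to_newick(alignment: Dict[str, str], outgroup: str) -> List[str]:
    """Convert every informative column of an alignment into a Newick tree."""
    trees = []
    for i in range(len(alignment[outgroup])):
        column = {taxon: seq[i] for taxon, seq in alignment.items()}
        tree = column_to_newick(column, outgroup)
        if tree is not None:
            trees.append(tree)
    return trees


if __name__ == "__main__":
    # Toy alignment with made-up taxon names and sequences.
    toy = {"Out": "AAGA", "T1": "ACGT", "T2": "ACTT", "T3": "AATA", "T4": "AATT"}
    for tree in alignment_to_newick(toy, "Out"):
        print(tree)
    # (Out,T3,T4,(T1,T2));
    # (Out,T1,(T2,T3,T4));
    # (Out,T3,(T1,T2,T4));
```

A collection of trees of this kind could then be passed to a supertree or parsimony program; which program and which settings to use is outside the scope of this sketch.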

References

  • Assis LCS (2015) Homology assessment in parsimony and model–based analyses: two sides of the same coin. Cladistics 31:315–320

  • Barrett CF, Baker WJ, Comer JR, Conran JG, Lahmeyer SC, Leebens-Mack JH, Li J, Lim GS, Mayfield-Jones DR, Perez L (2016) Plastid genomes reveal support for deep phylogenetic relationships and extensive rate variation among palms and other commelinid monocots. New Phytol 209:855–870

  • Baum BR (1989) PHYLIP: phylogeny inference package Version 3.2. Q Rev Biol 64:539–541

  • Baum BR (1992) Combining trees as a way of combining data sets for phylogenetic inference and the desirability of combining gene trees. Taxon 41:3–10

  • Bininda-Emonds ORP, Sanderson MJ (2001) Assessment of the accuracy of matrix representation with parsimony analysis supertree construction. Syst Biol 50:565–579

  • Brower AVZ (2000) Evolution is not a necessary assumption of cladistics. Cladistics 16:143–154

  • Brower AVZ (2019) Background knowledge: the assumptions of pattern cladistics. Cladistics 35:717–731

  • Castresana J (2000) Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol 17:540–552

  • Chen YP, Zhao F, Paton AJ, Sunojkumar P, Gao L-M, Xiang C-L (2022) Plastome sequences fail to resolve shallow level relationships within the rapidly radiated genus Isodon (Lamiaceae). Front Plant Sci 13:985488

  • Creevey CJ, McInerney JO (2009) Trees from trees: construction of phylogenetic supertrees using Clann. In: Posada D (ed) Bioinformatics for DNA sequence analysis. Methods in molecular biology, vol 537. Humana Press, Totowa, pp 139–161

  • Darriba D, Taboada GL, Doallo R, Posada D (2012) jModelTest 2: more models, new heuristics and parallel computing. Nat Methods 9:772–772

  • De Soete G, DeSarbo WS, Carroll JD (1985) Optimal variable weighting for hierarchical clustering—an alternating least-squares algorithm. J Classif 2:173–192

  • Degnan JH, DeGiorgio M, Bryant D, Rosenberg NA (2009) Properties of consensus methods for inferring species trees from gene trees. Syst Biol 58:35–54

  • DeSarbo WS, Carroll JD, Clark LA, Green PE (1984) Synthesized clustering—a method for amalgamating alternative clustering bases with differential weighting of variables. Psychometrika 49:57–78

  • Edgar RC (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32:1792–1797

  • Edgar RC (2010) Quality measures for protein alignment benchmarks. Nucleic Acids Res 38:2145–2153

  • Farris JS (1983) The logical basis of phylogenetic analysis. In: Platnick NI, Funk V (eds) Advances in cladistics, 2. Proceedings of the 2nd meeting of the Willi Hennig society; Ann Arbor, Mich., USA, Oct. 1–4; 1981. Columbia University Press, New York, pp. 7–36

  • Felsenstein J (1993) PHYLIP (Phylogeny Inference Package), Version 3.5c. Distributed by the author. Department of Genetics, University of Washington, Seattle, USA. https://csbf.stanford.edu/phylip/

  • Felsenstein J (2004) Inferring phylogenies. Sinauer Associates Inc, Sunderland

  • Fitch WM (1966) An improved method of testing for evolutionary homology. J Mol Biol 16:9–16

  • Gatesy J, Sloan DB, Warren JM, Baker RH, Simmons MP, Springer MS (2019) Partitioned coalescence support reveals biases in species-tree methods and detects gene trees that determine phylogenomic conflicts. Mol Phylogenet Evol 139:106539

  • Goloboff PA, Farris JS, Nixon KC (2008) TNT, a free program for phylogenetic analysis. Cladistics 24:774–786

  • Goncalves DJ, Simpson BB, Ortiz EM, Shimizu GH, Jansen RK (2019) Incongruence between gene trees and species trees and phylogenetic signal variation in plastid genes. Mol Phylogenet Evol 138:219–232

  • Gouy M, Guindon S, Gascuel O (2010) SeaView version 4: a multiplatform graphical user interface for sequence alignment and phylogenetic tree building. Mol Biol Evol 27:221–224

  • Graybeal A (1998) Is it better to add taxa or characters to a difficult phylogenetic problem? Syst Biol 47:9–17

  • Hartigan JA (1975) Clustering Algorithms. John Wiley and Sons, New York

  • Hennig W (1966) Phylogenetic systematics. University of Illinois Press, Urbana

  • Huang Y, Fan L, Huang J, Zhou G, Chen X, Chen J (2022) Plastome phylogenomics of Aucuba (Garryaceae). Front Genet 13:753719

  • Huelsenbeck JP, Ronquist F (2001) MrBayes: Bayesian inference of phylogenetic trees. Bioinformatics 17:754–755

  • Hull DL (1967) The metaphysics of evolution. Br J Hist Sci 3:309–337

  • Jombart T, Kendall M, Almagro-Garcia J, Colijn C (2017) treespace: statistical exploration of landscapes of phylogenetic trees. Mol Ecol Resour 17:1385–1392

  • Jukes TH, Cantor CR (1969) Evolution of protein molecules. In: Munro HN (ed) Mammalian protein metabolism. Academic Press, New York, pp 21–132

  • Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS (2017) ModelFinder: fast model selection for accurate phylogenetic estimates. Nat Methods 14:587–589

  • Katoh K, Standley DM (2013) MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol 30:772–780

  • Kitching IJ, Forey PL, Humphries CJ, Williams DM (1998) Cladistics: the theory and practice of parsimony analysis, 2nd ed. Systematics Association publications (Book 11). Oxford University Press, Oxford, UK

  • Lapointe FJ, Cucumel G (1997) The average consensus procedure: a combination of weighted trees containing identical or overlapping sets of taxa. Syst Biol 46:306–312

  • Lapointe FJ, Levasseur C (2004) Everything you always wanted to know about the average consensus, and more. In: Bininda-Emonds ORP (ed) Phylogenetic supertrees: combining information to reveal the Tree of Life. Computational biology series, vol 4. Kluwer Academic Publishers, Dordrecht, pp 87–105

  • Lewis PO (2001) A likelihood approach to estimating phylogeny from discrete morphological character data. Syst Biol 50:913–925

  • Maddison WP, Maddison DR (2021) Mesquite: a modular system for evolutionary analysis. Version 3.70. http://www.mesquiteproject.org

  • Mavrodiev EV (2016) Dealing with propositions not with the characters: the ability of three-taxon statement analysis to recognize groups based solely on ‘reversals’ under the maximum-likelihood criteria. Aust Syst Bot 29:119–125

  • Mavrodiev EV, Madorsky A (2012) TAXODIUM Version 10: a simple way to generate uniform and fractionally weighted three-item matrices from various kinds of biological data. PLoS ONE 7:e48813

  • Mavrodiev EV, Dell C, Schroder L (2017) A laid-back trip through the Hennigian forests. PeerJ 5:e3578

  • Mavrodiev EV, Williams DM, Ebach MC (2019) On the typology of relations. Evol Biol 46:71–89

  • Michener CD, Sokal RR (1957) A quantitative approach to a problem in classification. Evolution 11:130–162

  • Miller MA, Pfeiffer W, Schwartz T (2010) Creating the CIPRES Science Gateway for inference of large phylogenetic trees. In: Pierce M (ed) Proceedings of the Gateway Computing Environments Workshop (GCE), 14 Nov 2010, New Orleans, pp 1–8

  • Minh BQ, Schmidt HA, Chernomor O, Schrempf D, Woodhams MD, Von Haeseler A, Lanfear R (2020) IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol Biol Evol 37:1530–1534

  • Morrison DA, Morgan MJ, Kelchner SA (2015) Molecular homology and multiple-sequence alignment: an analysis of concepts and practice. Aust Syst Bot 28:46–62

  • Mossel E, Roch S (2010) Incomplete lineage sorting: consistent phylogeny estimation from multiple loci. IEEE/ACM Trans Comput Biol Bioinform 7:166–171

  • Namgung J, Do HDK, Kim C, Choi HJ, Kim JH (2021) Complete chloroplast genomes shed light on phylogenetic relationships, divergence time, and biogeography of Allioideae (Amaryllidaceae). Sci Rep 11:1–13

  • Needleman SB, Wunsch CD (1970) A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol 48:443–453

  • Nelson G (1979) Cladistic analysis and synthesis - principles and definitions, with a historical note on Adanson’s Familles des Plantes (1763–1764). Syst Zool 28:1–21

  • Nelson G (2011) Resemblance as Evidence of Ancestry. Zootaxa 2946:137–141

  • Nelson G, Platnick NI (1991) Three-taxon statements—a more precise use of parsimony? Cladistics 7:351–366

  • O’Rourke F (2004) Aristotle and the Metaphysics of Evolution. Rev Metaphys 58:3–59

  • Patterson C (1982) Homology in classical and molecular biology. Mol Biol Evol 5:603–625

  • Platnick NI (1979) Philosophy and the transformation of cladistics. Syst Zool 28:537–546

  • Platnick NI (1993) Character optimization and weighting - differences between the standard and three-taxon approaches to phylogenetic inference. Cladistics 9:267–272

  • Ragan MA (1992a) Phylogenetic inference based on matrix representation of trees. Mol Phylogenet Evol 1:53–58

  • Ragan MA (1992b) Matrix representation in reconstructing phylogenetic relationships among the eukaryotes. BioSystems 28:47–55

  • Rambaut A, Drummond AJ (2009) Tracer v. 1.6. http://beast.bio.ed.ac.uk/

  • Rambaut A, Drummond AJ (2018) FigTree v. 1.4. Molecular evolution, phylogenetics and epidemiology. University of Edinburgh, Edinburgh. http://tree.bio.ed.ac.uk/software/figtree/

  • Rannala B, Yang ZH (1996) Probability distribution of molecular evolutionary trees: a new method of phylogenetic inference. J Mol Evol 43:304–311

  • Ronquist F, Teslenko M, Van Der Mark P, Ayres DL, Darling A, Höhna S, Larget B, Liu L, Suchard MA, Huelsenbeck JP (2012) MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst Biol 61:539–542

  • Samigullin T, Logacheva M, Terentieva E, Degtjareva G, Pimenov M, Valiejo-Roman C (2022) Plastid phylogenomic analysis of Tordylieae tribe (Apiaceae, Apioideae). Plants 11:709

  • Sankoff D, Morel C, Cedergren RJ (1973) Evolution of 5S RNA and non-randomness of base replacement. Nature New Biol 245:232–234

  • Sokal RR (1986) Phenetic taxonomy—theory and methods. Annu Rev Ecol Evol Syst 17:423–442

  • Soltis DE, Soltis PS (2004) Amborella not a “basal angiosperm”? not so fast. Am J Bot 91:997–1001

  • Stamatakis A (2006) RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690

  • Stamatakis A (2014) RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30:1312–1313

  • Swofford DL (2002) PAUP*. Phylogenetic analysis using parsimony (*and other methods). Sinauer Associates Inc, Sunderland

  • Talavera G, Castresana J (2007) Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Syst Biol 56:564–577

  • Thompson JD, Plewniak F, Poch O (1999) BAliBASE: a benchmark alignment database for the evaluation of multiple alignment programs. Bioinformatics 15:87–88

  • Townsend JP (2007) Profiling phylogenetic informativeness. Syst Biol 56(2):222–231

  • Townsend JP, Lopez-Giraldez F (2010) Optimal selection of gene and ingroup taxon sampling for resolving phylogenetic relationships. Syst Biol 59:446–457

  • Tremblay F (2013) Nicolai Hartmann and the metaphysical foundation of phylogenetic systematics. Biol Theory 7:56–68

  • Tremblay F (2020) Nikolai Lossky’s evolutionary metaphysics of reincarnation. Sophia 59:733–753

  • Tuffley C, Steel M (1997) Links between maximum likelihood and maximum parsimony under a simple model of site substitution. Bull Math Biol 59:581–607

  • Wagner ND, Volf M, Hörandl E (2021) Highly diverse shrub willows (Salix L.) share highly similar plastomes. Front Plant Sci 12:662715

  • Watson HC, Kendrew JC (1961) Comparison between the amino-acid sequences of sperm whale myoglobin and of human hemoglobin. Nature 190:670–672

  • Wei R, Zhang X-C (2020) Phylogeny of Diplazium (Athyriaceae) revisited: resolving the backbone relationships based on plastid genomes and phylogenetic tree space analysis. Mol Phylogenet Evol 143:106699

  • Williams DM (1994) Combining trees and combining data. Taxon 43:449–453

  • Williams DM (1996) Characters and cladograms. Taxon 45:275–283

  • Williams DM (2004) Supertrees, components and three-item data. In: Bininda-Emonds ORP (ed) Phylogenetic supertrees: Combining information to reveal the Tree of Life. Springer-Kluwer Academic Publisher, Dordrecht, The Netherlands, pp 389–408

  • Williams DM, Ebach MC (2006) The data matrix. Geodiversitas 28:409–420

  • Williams DM, Ebach MC (2008) Foundations of systematics and biogeography. Springer, New York

  • Williams DM, Siebert DJ (2000) Characters, homology and three-item analysis. In: Scotland RW, Pennington TR (eds) Homology and systematics: coding characters for phylogenetic analysis. Systematics Association Special Volume (Book 58). Taylor and Francis, London, pp 183–208

  • Williams DM, Ebach MC, Wheeler QD (2010) Beyond belief: The steady resurrection of phenetics. In: Williams DM, Knapp S (eds) Beyond cladistics: The branching of a paradigm. University of California Press, Berkeley, California, USA, pp 169–197

  • Williams DM, Ebach MC (2020) Cladistics: A guide to biological classification, 3rd ed, Systematics Association Special Volume (Book 88). Cambridge University Press, Cambridge, UK

  • Xia X (2013) DAMBE5: a comprehensive software package for data analysis in molecular biology and evolution. Mol Biol Evol 30:1720–1728

  • Zhang C, Rabiee M, Sayyari E, Mirarab S (2018) ASTRAL–III: polynomial time species tree reconstruction from partially resolved gene trees. BMC Bioinform 19:15–30

  • Zhang X, Sun Y, Landis JB, Lv Z, Shen J, Zhang H, Lin N, Li L, Sun J, Deng T (2020) Plastome phylogenomic study of Gentianeae (Gentianaceae): widespread gene tree discordance and its association with evolutionary rate heterogeneity of plastid genes. BMC Plant Biol 20:1–15

Acknowledgements

The authors thank Prof. Richard Buggs (School of Biological and Chemical Sciences, Queen Mary University of London, UK) for his helpful notes and comments on an early version of the article. We thank Dr. David M. Williams (the Natural History Museum, London, UK) and Prof. Malte C. Ebach (University of New South Wales and the Australian Museum, Sydney, AU) for their helpful discussion. We also thank Dr. Williams for bringing the study of Nelson (1979) to our attention. We are grateful to two anonymous reviewers (especially Reviewer 2) for their insightful comments, which helped to improve the article's content.

Author information

Corresponding author

Correspondence to Evgeny V. Mavrodiev.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mavrodiev, E.V., Madorsky, A. On Pattern-Cladistic Analyses Based on Complete Plastid Genome Sequences. Acta Biotheor 71, 22 (2023). https://doi.org/10.1007/s10441-023-09475-5

  • DOI: https://doi.org/10.1007/s10441-023-09475-5
