Abstract
Inferring the true biological sequences from amplicon mixtures remains a difficult bioinformatics problem. The traditional approach is to cluster sequencing reads by similarity thresholds and treat the consensus sequence of each cluster as an “operational taxonomic unit” (OTU). Recently, this approach has been improved by model-based methods that correct PCR and sequencing errors in order to infer “amplicon sequence variants” (ASVs). To date, ASV approaches have been used primarily in metagenomics, but they are also useful for determining homeologs in polyploid organisms. To facilitate the use of ASV methods among polyploidy researchers, we incorporated ASV inference alongside OTU clustering in PURC v2.0, a major update to PURC (Pipeline for Untangling Reticulate Complexes). In addition, PURC v2.0 features faster demultiplexing than the original version and is compatible with Python 3. In this chapter, we present results indicating that the ASV approach is more likely than the earlier OTU-based PURC to infer the correct biological sequences, and we describe how to prepare sequencing data, run PURC v2.0 under several different modes, and interpret the output.
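To make the threshold-clustering idea concrete, the following is a minimal toy sketch in Python, not PURC's implementation. It assumes equal-length, pre-aligned reads and uses a simple greedy seed-based scheme; the function names (`identity`, `greedy_cluster`, `consensus`) and the example reads are hypothetical and chosen only for illustration.

```python
from collections import Counter


def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("toy example assumes equal-length, pre-aligned reads")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(reads, threshold):
    """Put each read in the first cluster whose seed it matches at >= threshold."""
    clusters = []  # each cluster is a list of reads; element 0 is the seed
    for read in reads:
        for members in clusters:
            if identity(read, members[0]) >= threshold:
                members.append(read)
                break
        else:
            # No existing seed was similar enough, so this read seeds a new cluster.
            clusters.append([read])
    return clusters


def consensus(members):
    """Majority base in each column of the cluster."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*members))


# Hypothetical reads: two gene copies differing at one site, one read with
# a single error, and two reads from an unrelated sequence.
reads = (
    ["ACGTACGTAC"] * 3      # copy A
    + ["ACGTACGTAT"] * 3    # copy B (differs from A at the last site)
    + ["ACGAACGTAC"]        # copy A with one sequencing error
    + ["TTTTGGGGCC"] * 2    # unrelated sequence
)

for threshold in (0.80, 0.95):
    clusters = greedy_cluster(reads, threshold)
    print(threshold, [(len(c), consensus(c)) for c in clusters])
```

In this toy data set, a loose threshold merges the two similar copies into one cluster whose consensus hides copy B, while a strict threshold separates them but also promotes the single erroneous read to its own cluster. This trade-off is the kind of limitation that error-model-based ASV inference is intended to address.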
Copyright information
© 2023 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Schafran, P., Li, FW., Rothfels, C.J. (2023). PURC Provides Improved Sequence Inference for Polyploid Phylogenetics and Other Manifestations of the Multiple-Copy Problem. In: Van de Peer, Y. (eds) Polyploidy. Methods in Molecular Biology, vol 2545. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2561-3_10
DOI: https://doi.org/10.1007/978-1-0716-2561-3_10
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-2560-6
Online ISBN: 978-1-0716-2561-3
eBook Packages: Springer Protocols