PURC Provides Improved Sequence Inference for Polyploid Phylogenetics and Other Manifestations of the Multiple-Copy Problem

  • Protocol
Polyploidy

Part of the book series: Methods in Molecular Biology ((MIMB,volume 2545))

Abstract

Inferring the true biological sequences from amplicon mixtures remains a difficult bioinformatics problem. The traditional approach is to cluster sequencing reads by similarity thresholds and treat the consensus sequence of each cluster as an “operational taxonomic unit” (OTU). Recently, this approach has been improved upon by model-based methods that correct PCR and sequencing errors in order to infer “amplicon sequence variants” (ASVs). To date, ASV approaches have been used primarily in metagenomics, but they are also useful for determining homeologs in polyploid organisms. To facilitate the use of ASV methods among polyploidy researchers, we incorporated ASV inference alongside OTU clustering in PURC v2.0, a major update to PURC (Pipeline for Untangling Reticulate Complexes). In addition, PURC v2.0 features faster demultiplexing than the original version and is compatible with Python 3. In this chapter, we present results indicating that the ASV approach is more likely than the earlier OTU-based PURC to infer the correct biological sequences, and we describe how to prepare sequencing data, run PURC v2.0 under several different modes, and interpret the output.

Corresponding author

Correspondence to Peter Schafran.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Schafran, P., Li, FW., Rothfels, C.J. (2023). PURC Provides Improved Sequence Inference for Polyploid Phylogenetics and Other Manifestations of the Multiple-Copy Problem. In: Van de Peer, Y. (eds) Polyploidy. Methods in Molecular Biology, vol 2545. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2561-3_10

  • DOI: https://doi.org/10.1007/978-1-0716-2561-3_10

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-2560-6

  • Online ISBN: 978-1-0716-2561-3

  • eBook Packages: Springer Protocols
