Metagenomics and CAZyme Discovery

Protein-Carbohydrate Interactions

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1588))

Abstract

Microorganisms play a primary role in regulating biogeochemical cycles and are a valuable source of enzymes with biotechnological applications, such as carbohydrate-active enzymes (CAZymes). However, most microorganisms in natural ecosystems cannot be cultured with common culture-dependent techniques, which restricts access to potentially novel cellulolytic bacteria and beneficial enzymes. Molecular culture-independent methods such as metagenomics enable researchers to study microbial communities directly from environmental samples and provide a platform from which enzymes of interest can be sourced. We outline the key methodological stages required and describe specific protocols currently used in metagenomic projects dedicated to CAZyme discovery.

Author information

Corresponding author

Correspondence to Phillip B. Pope.

Copyright information

© 2017 Springer Science+Business Media LLC

About this protocol

Cite this protocol

Kunath, B.J., Bremges, A., Weimann, A., McHardy, A.C., Pope, P.B. (2017). Metagenomics and CAZyme Discovery. In: Abbott, D., Lammerts van Bueren, A. (eds) Protein-Carbohydrate Interactions. Methods in Molecular Biology, vol 1588. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-6899-2_20

  • DOI: https://doi.org/10.1007/978-1-4939-6899-2_20

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-6898-5

  • Online ISBN: 978-1-4939-6899-2

  • eBook Packages: Springer Protocols
