
An Introduction to Whole-Metagenome Shotgun Sequencing Studies

  • Protocol
Deep Sequencing Data Analysis

Part of the book series: Methods in Molecular Biology ((MIMB,volume 2243))

Abstract

Microbial communities are found across diverse environments, including within and across the human body. As many microbes are unculturable in the lab, much of what is known about a microbiome (a collection of bacteria, fungi, archaea, and viruses inhabiting an environment) comes from sequencing DNA from within the constituent community. Here, we provide an introduction to whole-metagenome shotgun sequencing studies, a ubiquitous approach for characterizing microbial communities, by reviewing three major research areas in metagenomics: assembly, community profiling, and functional profiling. Though not exhaustive, these areas encompass a large component of the metagenomics literature. We discuss each area in depth, the challenges posed by whole-metagenome shotgun sequencing, and approaches fundamental to the solutions of each. We conclude by discussing promising areas for future research. Though our emphasis is on the human microbiome, the methods discussed are broadly applicable across study systems.




Corresponding author

Correspondence to Itsik Pe’er.


Copyright information

© 2021 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol


Cite this protocol

Joseph, T.A., Pe’er, I. (2021). An Introduction to Whole-Metagenome Shotgun Sequencing Studies. In: Shomron, N. (eds) Deep Sequencing Data Analysis. Methods in Molecular Biology, vol 2243. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-1103-6_6


  • DOI: https://doi.org/10.1007/978-1-0716-1103-6_6


  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-1102-9

  • Online ISBN: 978-1-0716-1103-6

  • eBook Packages: Springer Protocols
