Microbial source tracking using metagenomics and other new technologies

  • Review
  • Omics in Microbial Ecology
Journal of Microbiology

Abstract

The environment is under siege from a variety of pollution sources. Fecal pollution is especially harmful because it disperses pathogenic bacteria into waterways. Unraveling the origins of mixed sources of fecal bacteria is difficult, and microbial source tracking (MST) in complex environments remains a daunting task. Despite these challenges, the need for answers far outweighs the difficulties. Advances in qPCR and next-generation sequencing (NGS) technologies have shifted MST from traditional culture-based approaches towards culture-independent technologies, and community-based MST is becoming a method of choice. Metagenomic tools may help overcome some of the limitations of community-based MST methods, as they can give deep insight into host-specific fecal markers and their association with different environments. Adoption of machine learning (ML) algorithms, along with metagenomics-based MST approaches, will also provide a statistically robust and automated platform. To complement that, ML-based approaches enable accurate optimization of resources. Given the successful application of ML-based models in disease prediction, outbreak investigation, and medicine prescription, these methods may come to serve as a better alternative to traditional MST approaches in the future.
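
As a rough illustration of what community-based MST computes, the sketch below estimates what fraction of a water sample's community is attributable to each candidate fecal source. It is a deliberately simplified expectation-maximization estimate with fixed, fully known source profiles. It is not the algorithm of any published tool, and it ignores unknown sources and sequencing noise. The function name, the taxon profiles, and the read counts are all hypothetical values chosen for the example.

```python
def estimate_source_proportions(sink_counts, source_profiles, n_iter=200):
    """Estimate mixing proportions of known sources in a sink sample.

    sink_counts: read counts per taxon in the environmental (sink) sample.
    source_profiles: dict mapping source name -> relative abundance per taxon
        (each list sums to 1 and has the same length as sink_counts).
    Simplifying assumption: source profiles are exact and no unknown source exists.
    """
    names = list(source_profiles)
    # Start from equal contributions for every source.
    alpha = {name: 1.0 / len(names) for name in names}
    for _ in range(n_iter):
        expected = {name: 0.0 for name in names}
        for j, count in enumerate(sink_counts):
            # E-step: share of taxon j's reads attributed to each source.
            weights = {n: alpha[n] * source_profiles[n][j] for n in names}
            denom = sum(weights.values())
            if denom == 0.0:
                continue  # taxon absent from every source; ignored in this sketch
            for n in names:
                expected[n] += count * weights[n] / denom
        # M-step: renormalize the attributed reads into proportions.
        assigned = sum(expected.values())
        alpha = {n: expected[n] / assigned for n in names}
    return alpha


# Hypothetical relative abundances over four taxa for two fecal sources.
sources = {
    "human": [0.6, 0.3, 0.1, 0.0],
    "cattle": [0.1, 0.2, 0.3, 0.4],
}
# Hypothetical sink sample built as a 70% human / 30% cattle mixture (100 reads).
sink = [45, 27, 16, 12]

proportions = estimate_source_proportions(sink, sources)
print(proportions)
```

Real samples would also need an explicit "unknown" source and some treatment of uncertainty in the source profiles themselves, which this sketch leaves out.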

References

  • Ahmed, W., Payyappat, S., Cassidy, M., and Besley, C. 2019. A duplex PCR assay for the simultaneous quantification of Bacteroides HF183 and crAssphage CPQ_056 marker genes in untreated sewage and stormwater. Environ. Int. 126, 252–259.

    Article  CAS  PubMed  Google Scholar 

  • Alikhan, N.F., Zhou, Z., Sergeant, M.J., and Achtman, M. 2018. A genomic overview of the population structure of Salmonella. PLoS Genet. 14, e1007261.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  • Allard, M.W., Strain, E., Melka, D., Bunning, K., Musser, S.M., Brown, E.W., and Timme, R. 2016. Practical value of food pathogen traceability through building a whole-genome sequencing network and database. J. Clin. Microbiol. 54, 1975–1983.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Alves, L.F., Westmann, C.A., Lovate, G.L., de Siqueira, G.M.V., Borelli, T.C., and Guazzaroni, M.E. 2018. Metagenomic approaches for understanding new concepts in microbial science. Int. J. Genomics 2018, 2312987.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  • Amgarten, D., Braga, L.P.P., da Silva, A.M., and Setubal, J.C. 2018. MARVEL, a tool for prediction of bacteriophage sequences in metagenomic bins. Front. Genet. 9, 304.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  • Ballesté, E., Pascual-Benito, M., Martín-Díaz, J., Blanch, A., Lucena, F., Muniesa, M., Jofre, J., and García-Aljaro, C. 2019. Dynamics of crAssphage as a human source tracking marker in potentially faecally polluted environments. Water Res. 155, 233–244.

    Article  PubMed  CAS  Google Scholar 

  • Barrett, T.J., Lior, H., Green, J.H., Khakhria, R., Wells, J.G., Bell, B.P., Greene, K.D., Lewis, J., and Griffin, P.M. 1994. Laboratory investigation of a multistate food-borne outbreak of Escherichia coli O157: H7 by using pulsed-field gel electrophoresis and phage typing. J. Clin. Microbiol. 32, 3013–3017.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Bauza, V., Madadi, V., Ocharo, R.M., Nguyen, T.H., and Guest, J.S. 2019. Microbial source tracking using 16S rRNA amplicon sequencing identifies evidence of widespread contamination from young children’s feces in an urban slum of Nairobi, Kenya. Environ. Sci. Technol. 53, 8271–8281.

    Article  CAS  PubMed  Google Scholar 

  • Bernhard, A.E. and Field, K.G. 2000a. A PCR assay to discriminate human and ruminant feces on the basis of host differences in Bacteroides-Prevotella genes encoding 16S rRNA. Appl. Environ. Microbiol. 66, 4571–4574.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Bernhard, A.E. and Field, K.G. 2000b. Identification of nonpoint sources of fecal pollution in coastal waters by using host-specific 16S ribosomal DNA genetic markers from fecal anaerobes. Appl. Environ. Microbiol. 66, 1587–1594.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Besser, J., Carleton, H.A., Gerner-Smidt, P., Lindsey, R.L., and Trees, E. 2018. Next-generation sequencing technologies and their application to the study and control of bacterial infections. Clin. Microbiol. Infect. 24, 335–341.

    Article  CAS  PubMed  Google Scholar 

  • Besser, J.M., Carleton, H.A., Trees, E., Stroika, S.G., Hise, K., Wise, M., and Gerner-Smidt, P. 2019. Interpretation of whole-genome sequencing for enteric disease surveillance and outbreak investigation. Foodborne Pathog. Dis. 16, 504–512.

    Article  PubMed  PubMed Central  Google Scholar 

  • Boehm, A.B., Van De Werfhorst, L.C., Griffith, J.F., Holden, P.A., Jay, J.A., Shanks, O.C., Wang, D., and Weisberg, S.B. 2013. Performance of forty-one microbial source tracking methods: A twenty-seven lab evaluation study. Water Res. 47, 6812–6828.

    Article  CAS  PubMed  Google Scholar 

  • Boers, S.A., Van der Reijden, W.A., and Jansen, R. 2012. High-throughput multilocus sequence typing: bringing molecular typing to the next level. PLoS ONE 7, e39630.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Borry, M. 2019. Sourcepredict: prediction of metagenomic sample sources using dimension reduction followed by machine learning classification. J. Open Source Softw. 4, 1540.

    Article  Google Scholar 

  • Breiman, L. 2001. Random forests. Mach. Learn. 45, 5–32.

    Article  Google Scholar 

  • Brown, C.M., Mathai, P.P., Loesekann, T., Staley, C., and Sadowsky, M.J. 2019. Influence of library composition on sourcetracker predictions for community-based microbial source tracking. Environ. Sci. Technol. 53, 60–68.

    Article  CAS  PubMed  Google Scholar 

  • Burkhardt, M.R., Soliven, P.P., Werner, S.L., and Vaught, D.G. 1999. Determination of submicrogram-per-liter concentrations of caffeine in surface water and groundwater samples by solid-phase extraction and liquid chromatography. J. AOAC Int. 82, 161–166.

    Article  CAS  PubMed  Google Scholar 

  • Callahan, B.J., McMurdie, P.J., and Holmes, S.P. 2017. Exact sequence variants should replace operational taxonomic units in marker-gene data analysis. ISME J. 11, 2639–2643.

    Article  PubMed  PubMed Central  Google Scholar 

  • Cammarota, G., Ianiro, G., Ahern, A., Carbone, C., Temko, A., Claesson, M.J., Gasbarrini, A., and Tortora, G. 2020. Gut microbiome, big data and machine learning to promote precision medicine for cancer. Nat. Rev. Gastroenterol. Hepatol. 17, 635–648.

    Article  PubMed  Google Scholar 

  • Carrieri, A.P., Rowe, W.P., Winn, M., and Pyzer-Knapp, E.O. 2019. A fast machine learning workflow for rapid phenotype prediction from whole shotgun metagenomes. Proc. Conf. AAAI Artif. Intell. 33, 9434–9439.

    Google Scholar 

  • Carson, C.A., Shear, B.L., Ellersieck, M.R., and Asfaw, A. 2001. Identification of fecal Escherichia coli from humans and animals by ribotyping. Appl. Environ. Microbiol. 67, 1503–1507.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Carter, K.M., Lu, M., Luo, Q., Jiang, H., and An, L. 2020. Microbial community dissimilarity for source tracking with application in forensic studies. PLoS ONE 15, e0236082.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Chan, K.H., Lam, M.H.W., Poon, K.F., Yeung, H.Y., and Chiu, T.K.T. 1998. Application of sedimentary fecal stanols and sterols in tracing sewage pollution in coastal waters. Water Res. 32, 225–235.

    Article  CAS  Google Scholar 

  • Chattaway, M.A., Greig, D.R., Gentle, A., Hartman, H.B., Dallman, T.J., and Jenkins, C. 2017. Whole-genome sequencing for national surveillance of Shigella flexneri. Front. Microbiol. 8, 1700.

    Article  PubMed  PubMed Central  Google Scholar 

  • Chen, H., Bai, X., Li, Y., Jing, L., Chen, R., and Teng, Y. 2019a. Source identification of antibiotic resistance genes in a peri-urban river using novel crAssphage marker genes and metagenomic signatures. Water Res. 167, 115098.

    Article  CAS  PubMed  Google Scholar 

  • Chen, H., Jing, L., Yao, Z., Meng, F., and Teng, Y. 2019b. Prevalence, source and risk of antibiotic resistance genes in the sediments of Lake Tai (China) deciphered by metagenomic assembly: a comparison with other global lakes. Environ. Int. 127, 267–275.

    Article  CAS  PubMed  Google Scholar 

  • Cody, A.J., Bray, J.E., Jolley, K.A., McCarthy, N.D., and Maiden, M.C.J. 2017. Core genome multilocus sequence typing scheme for stable, comparative analyses of Campylobacter jejuni and C. coli human disease isolates. J. Clin. Microbiol. 55, 2086–2097.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Coipan, C.E., Dallman, T.J., Brown, D., Hartman, H., van der Voort, M., van den Berg, R.R., Palm, D., Kotila, S., van Wijk, T., and Franz, E. 2020. Concordance of SNP-and allele-based typing workflows in the context of a large-scale international Salmonella Enteritidis outbreak investigation. Microb. Genom. 6, e000318.

    PubMed Central  Google Scholar 

  • Cole, D., Long, S.C., and Sobsey, M.D. 2003. Evaluation of F+ RNA and DNA coliphages as source-specific indicators of fecal contamination in surface waters. Appl. Environ. Microbiol. 69, 6507–6514.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Comte, J., Berga, M., Severin, I., Logue, J.B., and Lindström, E.S. 2017. Contribution of different bacterial dispersal sources to lakes: population and community effects in different seasons. Environ. Microbiol. 19, 2391–2404.

    Article  CAS  PubMed  Google Scholar 

  • Cooke, M.D. 1976. Antibiotic resistance among coliform and fecal coliform bacteria isolated from sewage, seawater, and marine shellfish. Antimicrob. Agents Chemother. 9, 879–884.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Crank, K., Li, X., North, D., Ferraro, G.B., Iaconelli, M., Mancini, P., La Rosa, G., and Bibby, K. 2020. CrAssphage abundance and correlation with molecular viral markers in Italian wastewater. Water Res. 184, 116161.

    Article  CAS  PubMed  Google Scholar 

  • Dave, M., Higgins, P.D., Middha, S., and Rioux, K.P. 2012. The human gut microbiome: current knowledge, challenges, and future directions. Transl. Res. 160, 246–257.

    Article  CAS  PubMed  Google Scholar 

  • Davis, S., Pettengill, J.B., Luo, Y., Payne, J., Shpuntoff, A., Rand, H., and Strain, E. 2015. CFSAN SNP pipeline: An automated method for constructing SNP matrices from next-generation sequence data. PeerJ Comput. Sci. 1, e20.

    Article  Google Scholar 

  • de Knegt, L.V., Pires, S.M., Löfström, C., Sørensen, G., Pedersen, K., Torpdahl, M., Nielsen, E.M., and Hald, T. 2016. Application of molecular typing results in source attribution models: The case of multiple locus variable number tandem repeat analysis (MLVA) of Salmonella isolates obtained from integrated surveillance in denmark. Risk Anal. 36, 571–588.

    Article  PubMed  Google Scholar 

  • Dombek, P.E., Johnson, L.K., Zimmerley, S.T., and Sadowsky, M.J. 2000. Use of repetitive DNA sequences and the PCR to differentiate Escherichia coli isolates from human and animal sources. Appl. Environ. Microbiol. 66, 2572–2577.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Dubinsky, E.A., Butkus, S.R., and Andersen, G.L. 2016. Microbial source tracking in impaired watersheds using phylochip and machine-learning classification. Water Res. 105, 56–64.

    Article  CAS  PubMed  Google Scholar 

  • Dutilh, B.E., Cassman, N., McNair, K., Sanchez, S.E., Silva, G.G.Z., Boling, L., Barr, J.J., Speth, D.R., Seguritan, V., Aziz, R.K., et al. 2014. A highly abundant bacteriophage discovered in the unknown sequences of human faecal metagenomes. Nat. Commun. 5, 4498.

    Article  CAS  PubMed  Google Scholar 

  • Edge, T., Hill, S., Seto, P., and Marsalek, J. 2010. Library-dependent and library-independent microbial source tracking to identify spatial variation in faecal contamination sources along a Lake Ontario beach (Ontario, Canada). Water Sci. Technol. 62, 719–727.

    Article  CAS  PubMed  Google Scholar 

  • Edwards, R.A., Vega, A.A., Norman, H.M., Ohaeri, M., Levi, K., Dinsdale, E.A., Cinek, O., Aziz, R.K., McNair, K., Barr, J.J., et al. 2019. Global phylogeography and ancient evolution of the widespread human gut virus crAssphage. Nat. Microbiol. 4, 1727–1736.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Fang, Z., Tan, J., Wu, S., Li, M., Xu, C., Xie, Z., and Zhu, H. 2019. PPRMeta: A tool for identifying phages and plasmids from metagenomic fragments using deep learning. GigaScience 8, giz066.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  • Farkas, K., Adriaenssens, E.M., Walker, D.I., McDonald, J.E., Malham, S.K., and Jones, D.L. 2019. Critical evaluation of crAssphage as a molecular marker for human-derived wastewater contamination in the aquatic environment. Food Environ. Vrol. 11, 113–119.

    Article  CAS  Google Scholar 

  • Feng, Y., Zou, S., Chen, H., Yu, Y., and Ruan, Z. 2020. BacWGSTdb 2.0: A one-stop repository for bacterial whole-genome sequence typing and source tracking. Nucleic Acids Res. 49, D644–D650.

    Article  PubMed Central  Google Scholar 

  • García-Aljaro, C., Ballesté, E., Muniesa, M., and Jofre, J. 2017. Determination of crAssphage in water samples and applicability for tracking human faecal pollution. Microb. Biotechnol. 10, 1775–1780.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  • Gómez-Doñate, M., Casanovas-Massana, A., Muniesa, M., and Blanch, A.R. 2016. Development of new host-specific bacteroides qPCRs for the identification of fecal contamination sources in water. Microbiologyopen 5, 83–94.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  • Gourmelon, M., Caprais, M.P., Ségura, R., Le Mennec, C., Lozach, S., Piriou, J.Y., and Rincé, A. 2007. Evaluation of two library-independent microbial source tracking methods to identify sources of fecal contamination in French estuaries. Appl. Environ. Microbiol. 73, 4857–4866.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Guan, S., Xu, R., Chen, S., Odumeru, J., and Gyles, C. 2002. Development of a procedure for discriminating among Escherichia coli isolates from animal and human sources. Appl. Environ. Microbiol. 68, 2690–2698.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Hagedorn, C., Blanch, A.R., and Harwood, V.J. 2011. Microbial source tracking: Methods, applications, and case studies. Springer Science & Business Media, Springer-Verlag New York, New York, USA.

    Book  Google Scholar 

  • Hagedorn, C., Crozier, J.B., Mentz, K.A., Booth, A.M., Graves, A.K., Nelson, N.J., and Reneau, R.B. 2003. Carbon source utilization profiles as a method to identify sources of faecal pollution in water. J. Appl. Microbiol. 94, 792–799.

    Article  CAS  PubMed  Google Scholar 

  • Hägglund, M., Bäckman, S., Macellaro, A., Lindgren, P., Borgmastars, E., Jacobsson, K., Dryselius, R., Stenberg, P., Sjodin, A., Forsman, M., et al. 2018. Accounting for bacterial overlap between raw water communities and contaminating sources improves the accuracy of signature-based microbial source tracking. Front. Microbiol. 9, 2364.

    Article  PubMed  PubMed Central  Google Scholar 

  • Hahm, B.K., Maldonado, Y., Schreiber, E., Bhunia, A.K., and Nakatsu, C.H. 2003. Subtyping of foodborne and environmental isolates of Escherichia coli by multiplex-PCR, rep-PCR, PFGE, ribotyping and AFLP. J. Microbiol. Methods 53, 387–399.

    Article  CAS  PubMed  Google Scholar 

  • Hamilton, M.J., Yan, T., and Sadowsky, M.J. 2006. Development of goose- and duck-specific DNA markers to determine sources of Escherichia coli in waterways. Appl. Environ. Microbiol. 72, 4012–4019.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Hampton-Marcell, J.T., Larsen, P., Anton, T., Cralle, L., Sangwan, N., Lax, S., Gottel, N., Salas-Garcia, M., Young, C., Duncan, G., et al. 2020. Detecting personal microbiota signatures at artificial crime scenes. Forensic Sci. Int. 313, 110351.

    Article  CAS  PubMed  Google Scholar 

  • Harwood, V.J., Staley, C., Badgley, B.D., Borges, K., and Korajkic, A. 2014. Microbial source tracking markers for detection of fecal contamination in environmental waters: Relationships between pathogens and human health outcomes. FEMS Microbiol. Rev. 38, 1–40.

    Article  CAS  PubMed  Google Scholar 

  • Harwood, V.J., Wiggins, B., Hagedorn, C., Ellender, R.D., Gooch, J., Kern, J., Samadpour, M., Chapman, A.C.H., Robinson, B.J., and Thompson, B.C. 2003. Phenotypic library-based microbial source tracking methods: efficacy in the california collaborative study. J. Water Health 1, 153–166.

    Article  PubMed  Google Scholar 

  • Havelaar, A. and Hogeboom, W. 1984. A method for the enumeration of male-specific bacteriophages in sewage. J. Appl. Bacteriol. 56, 439–447.

    Article  CAS  PubMed  Google Scholar 

  • Hemedan, A.A., Abd Elaziz, M., Jiao, P., Alavi, A.H., Bahgat, M., Ostaszewski, M., Schneider, R., Ghazy, H.A., Ewees, A.A., and Lu, S. 2020. Prediction of the vaccine-derived poliovirus outbreak incidence: A hybrid machine learning approach. Sci. Rep. 10, 5058.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Hendriksen, R.S., Munk, P., Njage, P., van Bunnik, B., McNally, L., Lukjancenko, O., Röder, T., Nieuwenhuijse, D., Pedersen, S.K., Kjeldgaard, J., et al. 2019. Global monitoring of antimicrobial resistance based on metagenomics analyses of urban sewage. Nat. Commun. 10, 1124.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  • Henry, R., Schang, C., Coutts, S., Kolotelo, P., Prosser, T., Crosbie, N., Grant, T., Cottam, D., O’Brien, P., Deletic, A., et al. 2016. Into the deep: evaluation of sourcetracker for assessment of faecal contamination of coastal waters. Water Res. 93, 242–253.

    Article  CAS  PubMed  Google Scholar 

  • Holcomb, D.A. and Stewart, J.R. 2020. Microbial indicators of fecal pollution: Recent progress and challenges in assessing water quality. Curr. Environ. Health Rep. 7, 311–324.

    Article  PubMed  PubMed Central  Google Scholar 

  • Hsieh, S.L., Hsieh, S.H., Cheng, P.H., Chen, C.H., Hsu, K.P., Lee, I.S., Wang, Z., and Lai, F. 2012. Design ensemble machine learning model for breast cancer diagnosis. J. Med. Syst. 36, 2841–2847.

    Article  PubMed  Google Scholar 

  • Jackson, B.R., Tarr, C., Strain, E., Jackson, K.A., Conrad, A., Carleton, H., Katz, L.S., Stroika, S., Gould, L.H., Mody, R.K., et al. 2016. Implementation of nationwide real-time whole-genome sequencing to enhance listeriosis outbreak detection and investigation. Clin. Infect. Dis. 63, 380–386.

    Article  PubMed  PubMed Central  Google Scholar 

  • Jagadeesan, B., Gerner-Smidt, P., Allard, M.W., Leuillet, S., Winkler, A., Xiao, Y., Chaffron, S., Van Der Vossen, J., Tang, S., Katase, M., et al. 2019. The use of next generation sequencing for improving food safety: Translation into practice. Food Microbiol. 79, 96–115.

    Article  CAS  PubMed  Google Scholar 

  • Jennings, W.C., Galvez-Arango, E., Prieto, A.L., and Boehm, A.B. 2020. CrAssphage for fecal source tracking in Chile: covariation with norovirus, HF183, and bacterial indicators. Water Res. X 9, 100071.

    Article  PubMed  PubMed Central  Google Scholar 

  • Jo, H., Hong, J., and Unno, T. 2019. Investigation of MiSeq reproducibility on biomarker identification. Appl. Biol. Chem. 62, 60.

    Article  CAS  Google Scholar 

  • Johnson, C.M. and Grossman, A.D. 2015. Integrative and conjugative elements (ICEs): what they do and how they work. Annu. Rev. Genet. 49, 577–601.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Kaas, R.S., Leekitcharoenphon, P., Aarestrup, F.M., and Lund, O. 2014. Solving the problem of comparing whole bacterial genomes across different sequencing platforms. PLoS ONE 9, e104984.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  • Karkman, A., Pärnänen, K., and Larsson, D.G.J. 2019. Fecal pollution can explain antibiotic resistance gene abundances in anthropogenically impacted environments. Nat. Commun. 10, 80.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Katz, L.S., Griswold, T., Williams-Newkirk, A.J., Wagner, D., Petkau, A., Sieffert, C., Van Domselaar, G., Deng, X., and Carleton, H.A. 2017. A comparative analysis of the lyve-set phylogenomics pipeline for genomic epidemiology of foodborne pathogens. Front. Microbiol. 8, 375.

    Article  PubMed  PubMed Central  Google Scholar 

  • Knights, D., Kuczynski, J., Charlson, E.S., Zaneveld, J., Mozer, M.C., Collman, R.G., Bushman, F.D., Knight, R., and Kelley, S.T. 2011. Bayesian community-wide culture-independent microbial source tracking. Nat. Methods 8, 761–763.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Kongprajug, A., Mongkolsuk, S., and Sirikanchana, K. 2019. CrAssphage as a potential human sewage marker for microbial source tracking in southeast Asia. Environ. Sci. Technol. Lett. 6, 159–164.

    Article  CAS  Google Scholar 

  • Krumperman, P.H. 1983. Multiple antibiotic resistance indexing of Escherichia coli to identify high-risk sources of fecal contamination of foods. Appl. Environ. Microbiol. 46, 165–170.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Kulski, J.K. 2016. Next-generation sequencing-an overview of the history, tools, and “omic” applications. In Next Generation Sequencing: Advances, Applications and Challenges. pp. 3–60. Intech, Rijeka, Croatia.

    Chapter  Google Scholar 

  • Kvistholm Jensen, A., Nielsen, E.M., Björkman, J.T., Jensen, T., Müller, L., Persson, S., Bjerager, G., Perge, A., Krause, T.G., Kiil, K., et al. 2016. Whole-genome sequencing used to investigate a nationwide outbreak of listeriosis caused by ready-to-eat delicatessen meat, Denmark, 2014. Clin. Infect. Dis. 63, 64–70.

    Article  PubMed  CAS  Google Scholar 

  • Larsen, M.V., Cosentino, S., Rasmussen, S., Friis, C., Hasman, H., Marvig, R.L., Jelsbak, L., Sicheritz-Ponten, T., Ussery, D.W., Aarestrup, F.M., et al. 2012. Multilocus sequence typing of totalgenome-sequenced bacteria. J. Clin. Microbiol. 50, 1355–1361.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Lee, C.M., Lin, T.Y., Lin, C.C., Kohbodi, G.A., Bhatt, A., Lee, R., and Jay, J.A. 2006. Persistence of fecal indicator bacteria in Santa Monica Bay beach sediments. Water Res. 40, 2593–2602.

    Article  CAS  PubMed  Google Scholar 

  • Leekitcharoenphon, P., Nielsen, E.M., Kaas, R.S., Lund, O., and Aarestrup, F.M. 2014. Evaluation of whole genome sequencing for outbreak detection of Salmonella enterica. PLoS ONE 9, e87991.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  • Li, L.G., Huang, Q., Yin, X., and Zhang, T. 2020. Source tracking of antibiotic resistance genes in the environment — challenges, progress, and prospects. Water Res. 185, 116127.

    Article  CAS  PubMed  Google Scholar 

  • Li, L.G., Yin, X., and Zhang, T. 2018. Tracking antibiotic resistance gene pollution from different sources using machine-learning classification. Microbiome 6, 93.

    Article  PubMed  PubMed Central  Google Scholar 

  • Liu, Q., Liu, F., He, J., Zhou, M., Hou, T., and Liu, Y. 2019. VFM: Identification of bacteriophages from metagenomic bins and contigs based on features related to gene and genome composition. IEEE Access 7, 177529–177538.

    Article  Google Scholar 

  • Mathai, P.P., Staley, C., and Sadowsky, M.J. 2020. Sequence-enabled community-based microbial source tracking in surface waters using machine learning classification: A review. J. Microbiol. Methods 177, 106050.

    Article  CAS  PubMed  Google Scholar 

  • Mattioli, M.C.M., Benedict, K.M., Murphy, J., Kahler, A., Kline, K.E., Longenberger, A., Mitchell, P.K., Watkins, S., Berger, P., and Shanks, O.C. 2021. Identifying septic pollution exposure routes during a waterborne norovirus outbreak-a new application for human-associated microbial source tracking qPCR. J. Microbiol. Methods 180, 106091.

    Article  CAS  PubMed  Google Scholar 

  • Mattioli, M.C.M., Davis, J., Mrisho, M., and Boehm, A.B. 2015. Quantification of human norovirus GII on hands of mothers with children under the age of five years in Bagamoyo, Tanzania. Am. J. Trop. Med. Hyg. 93, 478–484.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • McGhee, J.J., Rawson, N., Bailey, B.A., Fernandez-Guerra, A., Sisk-Hackworth, L., and Kelley, S.T. 2020. Meta-SourceTracker: Application of bayesian source tracking to shotgun metagenomics. Peer J. 8, e8783.

    Article  PubMed  PubMed Central  Google Scholar 

  • Miro, E., Rossen, J.W.A., Chlebowicz, M.A., Harmsen, D., Brisse, S., Passet, V., Navarro, F., Friedrich, A.W., and García-Cobos, S. 2019. Core/Whole genome multilocus sequence typing and core genome SNP-based typing of OXA-48-producing Klebsiella pneumoniae clinical isolates from Spain. Front. Microbiol. 10, 2961.

    Article  PubMed  Google Scholar 

  • Moura, A., Tourdjman, M., Leclercq, A., Hamelin, E., Laurent, E., Fredriksen, N., Van Cauteren, D., Bracq-Dieye, H., Thouvenot, P., Vales, G., et al. 2017. Real-time whole-genome sequencing for surveillance of Listeria monocytogenes, France. Emerg. Infect. Dis. 23, 1462–1470.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Murphy, K.P. 2006. Naive bayes classifiers. University of British Columbia 18, 60.

    Google Scholar 

  • Myszczynska, M.A., Ojamies, P.N., Lacoste, A.M.B., Neil, D., Saffari, A., Mead, R., Hautbergue, G.M., Holbrook, J.D., and Ferraiuolo, L. 2020. Applications of machine learning to diagnosis and treatment of neurodegenerative diseases. Nat. Rev. Neurol. 16, 440–456.

    Article  PubMed  Google Scholar 

  • Nguyen, N.P., Warnow, T., Pop, M., and White, B. 2016. A perspective on 16S rRNA operational taxonomic unit clustering using sequence similarity. npj Biofilms Microbiomes 2, 16004.

    Article  PubMed  PubMed Central  Google Scholar 

  • O’Dea, C., Zhang, Q., Staley, C., Masters, N., Kuballa, A., Fisher, P., Veal, C., Stratton, H., Sadowsky, M.J., Ahmed, W., et al. 2019. Compositional and temporal stability of fecal taxon libraries for use with sourcetracker in sub-tropical catchments. Water Res. 165, 114967.

    Article  PubMed  CAS  Google Scholar 

  • Olsen, J.E., Brown, D.J., Baggesen, D.L., and Bisgaard, M. 1992. Biochemical and molecular characterization of Salmonella enterica serovar berta, and comparison of methods for typing. Epidemiol. Infect. 108, 243–260.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Pandey, P.K., Kass, P.H., Soupir, M.L., Biswas, S., and Singh, V.P. 2014. Contamination of water resources by pathogenic bacteria. AMB Express 4, 51.

    Article  PubMed  PubMed Central  Google Scholar 

  • Parveen, S., Hodge, N.C., Stall, R.E., Farrah, S.R., and Tamplin, M.L. 2001. Phenotypic and genotypic characterization of human and nonhuman Escherichia coli. Water Res. 35, 379–386.

    Article  CAS  PubMed  Google Scholar 

  • Pasolli, E., Truong, D.T., Malik, F., Waldron, L., and Segata, N. 2016. Machine learning meta-analysis of large metagenomic datasets: Tools and biological insights. PLoS Comput. Biol. 12, e1004977.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  • Portmann, A.C., Fournier, C., Gimonet, J., Ngom-Bru, C., Barretto, C., and Baert, L. 2018. A validation approach of an end-to-end whole genome sequencing workflow for source tracking of Listeria monocytogenes and Salmonella enterica. Front. Microbiol. 9, 446.

    Article  PubMed  PubMed Central  Google Scholar 

  • Reischer, G.H., Ebdon, J.E., Bauer, J.M., Schuster, N., Ahmed, W., Astrom, J., Blanch, A.R., Bloschl, G., Byamukama, D., Coakley, T., et al. 2013. Performance characteristics of qPCR assays targeting human- and ruminant-associated bacteroidetes for microbial source tracking across sixteen countries on six continents. Environ. Sci. Technol. 47, 8548–8556.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Rhodes, D.R., Yu, J., Shanker, K., Deshpande, N., Varambally, R., Ghosh, D., Barrette, T., Pandey, A., and Chinnaiyan, A.M. 2004. Large-scale meta-analysis of cancer microarray data identifies common transcriptional profiles of neoplastic transformation and progression. Proc. Natl. Acad. Sci. USA 101, 9309–9314.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Riedel, T.E., Zimmer-Faust, A.G., Thulsiraj, V., Madi, T., Hanley, K.T., Ebentier, D.L., Byappanahalli, M., Layton, B., Raith, M., Boehm, A.B., et al. 2014. Detection limits and cost comparisons of human- and gull-associated conventional and quantitative PCR assays in artificial and environmental waters. J. Environ. Manage. 136, 112–120.

    Article  CAS  PubMed  Google Scholar 

  • Roer, L., Hansen, F., Thomsen, M.C.F., Knudsen, J.D., Hansen, D. S., Wang, M., Samulioniené, J., Justesen, U.S., Røder, B.L., Schumacher, H., et al. 2017. WGS-based surveillance of third-generation cephalosporin-resistant Escherichia coli from bloodstream infections in Denmark. J. Antimicrob. Chemother. 72, 1922–1929.

    Article  CAS  PubMed  Google Scholar 

  • Roguet, A., Eren, A.M., Newton, R.J., and McLellan, S.L. 2018. Fecal source identification using random forest. Microbiome 6, 185.

    Article  PubMed  PubMed Central  Google Scholar 

  • Roguet, A., Esen, Ù.C., Eren, A.M., Newton, R.J., and McLellan, S.L. 2020. FORENSIC: An online platform for fecal source identification. mSystems 5, e00869–19.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Saltykova, A., Wuyts, V., Mattheus, W., Bertrand, S., Roosens, N.H.C., Marchal, K., and De Keersmaecker, S.C.J. 2018. Comparison of SNP-based subtyping workflows for bacterial isolates using WGS data, applied to Salmonella enterica serotype Typhimurium and serotype 1,4,[5],12:i:-. PLoS ONE 13, e0192504.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  • Schadt, E.E., Turner, S., and Kasarskis, A. 2010. A window into third-generation sequencing. Hum. Mol. Genet. 19, R227–R240.

    Article  CAS  PubMed  Google Scholar 

  • Schloissnig, S., Arumugam, M., Sunagawa, S., Mitreva, M., Tap, J., Zhu, A., Waller, A., Mende, D.R., Kultima, J.R., Martin, J., et al. 2013. Genomic variation landscape of the human gut microbiome. Nature 493, 45–50.

    Article  PubMed  CAS  Google Scholar 

  • Seurinck, S., Defoirdt, T., Verstraete, W., and Siciliano, S.D. 2005. Detection and quantification of the human-specific HF183 bacteroides 16S rRNA genetic marker with real-time PCR for assessment of human faecal pollution in freshwater. Environ. Microbiol. 7, 249–259.

    Article  CAS  PubMed  Google Scholar 

  • Shanks, O.C., Domingo, J.W., Lu, J., Kelty, C.A., and Graham, J.E. 2007. Identification of bacterial DNA markers for the detection of human fecal pollution in water. Appl. Environ. Microbiol. 73, 2416–2422.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Shenhav, L., Thompson, M., Joseph, T.A., Briscoe, L., Furman, O., Bogumil, D., Mizrahi, I., Pe’er, I., and Halperin, E. 2019. FEAST: Fast expectation-maximization for microbial source tracking. Nat. Methods 16, 627–632.

  • Shkoporov, A.N., Khokhlova, E.V., Fitzgerald, C.B., Stockdale, S.R., Draper, L.A., Ross, R.P., and Hill, C. 2018. ΦCrAss001 represents the most abundant bacteriophage family in the human gut and infects Bacteroides intestinalis. Nat. Commun. 9, 4781.

  • Smith, A., Sterba-Boatwright, B., and Mott, J. 2010. Novel application of a statistical technique, random forests, in a bacterial source tracking study. Water Res. 44, 4067–4076.

  • Stachler, E., Akyon, B., de Carvalho, N.A., Ference, C., and Bibby, K. 2018. Correlation of crAssphage qPCR markers with culturable and molecular indicators of human fecal pollution in an impacted urban watershed. Environ. Sci. Technol. 52, 7505–7512.

  • Stachler, E. and Bibby, K. 2014. Metagenomic evaluation of the highly abundant human gut bacteriophage crAssphage for source tracking of human fecal pollution. Environ. Sci. Technol. Lett. 1, 405–409.

  • Stachler, E., Kelty, C., Sivaganesan, M., Li, X., Bibby, K., and Shanks, O.C. 2017. Quantitative crAssphage PCR assays for human fecal pollution measurement. Environ. Sci. Technol. 51, 9146–9154.

  • Staley, Z.R., Chuong, J.D., Hill, S.J., Grabuski, J., Shokralla, S., Hajibabaei, M., and Edge, T.A. 2018a. Fecal source tracking and eDNA profiling in an urban creek following an extreme rain event. Sci. Rep. 8, 14390.

  • Staley, C., Kaiser, T., Lobos, A., Ahmed, W., Harwood, V.J., Brown, C.M., and Sadowsky, M.J. 2018b. Application of SourceTracker for accurate identification of fecal pollution in recreational freshwater: A double-blinded study. Environ. Sci. Technol. 52, 4207–4217.

  • Stoeckel, D.M. and Harwood, V.J. 2007. Performance, design, and analysis in microbial source tracking studies. Appl. Environ. Microbiol. 73, 2405–2415.

  • Stucki, D., Ballif, M., Bodmer, T., Coscolla, M., Maurer, A.M., Droz, S., Butz, C., Borrell, S., Längle, C., Feldmann, J., et al. 2015. Tracking a tuberculosis outbreak over 21 years: strain-specific single-nucleotide polymorphism typing combined with targeted whole-genome sequencing. J. Infect. Dis. 211, 1306–1316.

  • Suykens, J.A. and Vandewalle, J. 1999. Least squares support vector machine classifiers. Neural Process. Lett. 9, 293–300.

  • Tarca, A.L., Carey, V.J., Chen, X., Romero, R., and Drăghici, S. 2007. Machine learning and its applications to biology. PLoS Comput. Biol. 3, e116.

  • Tipping, M.E. 2000. The relevance vector machine. In Solla, S.A., Leen, T.K., and Muller, K.R. (eds.), Advances in neural information processing systems, vol. 12, pp. 652–658. MIT Press, Cambridge, Massachusetts, USA.

  • Unno, T. 2015. Bioinformatic suggestions on MiSeq-based microbial community analysis. J. Microbiol. Biotechnol. 25, 765–770.

  • Unno, T., Di, D.Y., Jang, J., Suh, Y.S., Sadowsky, M.J., and Hur, H.G. 2012. Integrated online system for a pyrosequencing-based microbial source tracking method that targets Bacteroidetes 16S rDNA. Environ. Sci. Technol. 46, 93–98.

  • Unno, T., Jang, J., Han, D., Kim, J.H., Sadowsky, M.J., Kim, O.S., Chun, J., and Hur, H.G. 2010. Use of barcoded pyrosequencing and shared OTUs to determine sources of fecal bacteria in watersheds. Environ. Sci. Technol. 44, 7777–7782.

  • Unno, T., Staley, C., Brown, C.M., Han, D., Sadowsky, M.J., and Hur, H.G. 2018. Fecal pollution: New trends and challenges in microbial source tracking using next-generation sequencing. Environ. Microbiol. 20, 3132–3140.

  • Wang, K., Pereira, G.V., Cavalcante, J.J.V., Zhang, M., Mackie, R., and Cann, I. 2016. Bacteroides intestinalis DSM 17393, a member of the human colonic microbiome, upregulates multiple endoxylanases during growth on xylan. Sci. Rep. 6, 34360.

  • Wery, N., Monteil, C., Pourcher, A.M., and Godon, J.J. 2010. Human-specific fecal bacteria in wastewater treatment plant effluents. Water Res. 44, 1873–1883.

  • Wiggins, B.A. 1996. Discriminant analysis of antibiotic resistance patterns in fecal streptococci, a method to differentiate human and animal sources of fecal pollution in natural waters. Appl. Environ. Microbiol. 62, 3997–4002.

  • Wu, Z., Greaves, J., Arp, L., Stone, D., and Bibby, K. 2020. Comparative fate of CrAssphage with culturable and molecular fecal pollution indicators during activated sludge wastewater treatment. Environ. Int. 136, 105452.

  • Wu, H., Nguyen, Q.D., Tran, T.T.M., Tang, M.T., Tsuruta, T., and Nishino, N. 2019. Rumen fluid, feces, milk, water, feed, airborne dust, and bedding microbiota in dairy farms managed by automatic milking systems. Anim. Sci. J. 90, 445–452.

  • Xia, E., Mei, J., Xie, G., Li, X., Li, Z., and Xu, M. 2017. Learning doctors’ medicine prescription pattern for chronic disease treatment by mining electronic health records: a multi-task learning approach. AMIA Annu. Symp. Proc. 2017, 1828–1837.

  • Zhang, P., Chen, B., Ma, L., Li, Z., Song, Z., Duan, W., and Qiu, X. 2015. The large scale machine learning in an artificial society: prediction of the Ebola outbreak in Beijing. Comput. Intell. Neurosci. 2015, 531650.

Acknowledgments

This work was supported by the Research Program funded by the Korea Disease Control and Prevention Agency (2020-ER5408-00). This research was also supported by the Minnesota Agricultural Experiment Station and the Basic Science Research Program to Research Institute for Basic Sciences (RIBS) of Jeju National University through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2019R1A6A1A10072987).

Corresponding author

Correspondence to Tatsuya Unno.

Additional information

Conflict of Interest

The authors declare no conflict of interest.

Cite this article

Raza, S., Kim, J., Sadowsky, M.J. et al. Microbial source tracking using metagenomics and other new technologies. J Microbiol. 59, 259–269 (2021). https://doi.org/10.1007/s12275-021-0668-9

  • DOI: https://doi.org/10.1007/s12275-021-0668-9
