DADA2: High-resolution sample inference from Illumina amplicon data

Abstract

We present the open-source software package DADA2 for modeling and correcting Illumina-sequenced amplicon errors (https://github.com/benjjneb/dada2). DADA2 infers sample sequences exactly and resolves differences of as little as 1 nucleotide. In several mock communities, DADA2 identified more real variants and output fewer spurious sequences than other methods. We applied DADA2 to vaginal samples from a cohort of pregnant women, revealing a diversity of previously undetected Lactobacillus crispatus variants.
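To make the single-nucleotide claim concrete, the toy sketch below contrasts exact sequence comparison with identity-threshold clustering. It is not DADA2's algorithm, and it does not model sequencing errors. The two 100-nt sequences are invented, the helper names are hypothetical, and the 97% threshold is an assumption standing in for the conventional OTU radius. It only shows why two variants that differ at one position collapse under a 97% identity cutoff yet remain distinct when sequences are compared exactly.

```python
def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def merged_at_threshold(a: str, b: str, threshold: float = 0.97) -> bool:
    """True if the fraction of matching positions meets the threshold.

    The 0.97 default is an assumed, conventional OTU identity cutoff.
    """
    matches = len(a) - hamming(a, b)
    return matches / len(a) >= threshold


# Two made-up 100-nt amplicons that differ at exactly one position.
variant_a = "ACGT" * 25
variant_b = variant_a[:50] + "A" + variant_a[51:]  # G -> A at index 50

n_diff = hamming(variant_a, variant_b)
same_otu = merged_at_threshold(variant_a, variant_b)
same_exact_variant = variant_a == variant_b

print(f"differences: {n_diff}")
print(f"merged at 97% identity: {same_otu}")
print(f"identical as exact sequences: {same_exact_variant}")
```

With these inputs the pair falls inside one 97% cluster while remaining two distinct exact sequences, which is the level of resolution the abstract describes.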

Figure 1: Comparison of sequence variants inferred by DADA2 with OTUs constructed by UPARSE.
Figure 2: L. crispatus sequence variants in the human vaginal community during pregnancy.

Accession codes

Primary accessions

Sequence Read Archive

Acknowledgements

We thank M. Schirmer and D. MacIntyre for productive correspondence. This work was supported by the NSF (DMS-1162538 to S.P.H.), the NIH (R01AI112401 to S.P.H.), and the Samarth Foundation (Stanford Microbiome Seed Grant to B.J.C. and S.P.H.).

Author information

Contributions

B.J.C. and S.P.H. designed the research; B.J.C., P.J.M., and M.J.R. implemented the algorithm; B.J.C. performed the analysis; B.J.C., P.J.M., M.J.R., and S.P.H. wrote the paper; and A.W.H. and A.J.A.J. generated the Extreme data set designed by B.J.C., P.J.M., and A.W.H.

Corresponding author

Correspondence to Benjamin J Callahan.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–8, Supplementary Tables 1–3 and Supplementary Notes 1 and 2 (PDF 1809 kb)

Supplementary Software

DADA2 software package and scripts for benchmarking and analysis (ZIP 1312 kb)

Cite this article

Callahan, B., McMurdie, P., Rosen, M. et al. DADA2: High-resolution sample inference from Illumina amplicon data. Nat Methods 13, 581–583 (2016). https://doi.org/10.1038/nmeth.3869

  • DOI: https://doi.org/10.1038/nmeth.3869
