Data from: The cancer microbiome atlas (TCMA): A pan-cancer comparative analysis to distinguish organ-associated microbiota from contaminants

Public

  • Studying the microbial composition of internal organs and their associations with disease remains challenging due to the difficulty of acquiring clinical biopsies. We designed a statistical model to analyze the prevalence of species across sample types from The Cancer Genome Atlas (TCGA), revealing that species equiprevalent across sample types are predominantly contaminants, bearing unique signatures from each TCGA-designated sequencing center. Removing such species mitigated batch effects and isolated the tissue-resident microbiome, which was validated with original TCGA samples. "Mixed-evidence" species can be further distinguished by gene copy and nucleotide variants. We thus present The Cancer Microbiome Atlas (TCMA), a collection of curated, decontaminated microbial compositions of oropharyngeal, esophageal, gastrointestinal, and colorectal tissues. This led to the discovery of prognostic species and blood signatures of mucosal barrier injuries, and enabled systematic matched microbe-host multi-omics analyses, which will help guide future studies of the microbiome's role in human health and disease.

Total Size
49 files (164 MB)
Data Citation
  • Dohlman, A., Arguijo Mendoza, D., Ding, S., Gao, M., Dressman, H., Iliev, I., Lipkin, S., & Shen, X. (2020). Data from: The cancer microbiome atlas (TCMA): A pan-cancer comparative analysis to distinguish organ-associated microbiota from contaminants. Duke Research Data Repository. https://doi.org/10.7924/r4rn36833
DOI
  • 10.7924/r4rn36833
Publication Date
  • 2020-09-29
ARK
  • ark:/87924/r4rn36833
Is Replaced By
  • 10.7924/r4bk1j35s
Type
Format
Related Materials
Funding Agency
  • NIH
  • DARPA
Grant Number
  • R35GM122465
  • DK119795
  • W911NF1920111
Contact
Title
  • Data from: The cancer microbiome atlas (TCMA): A pan-cancer comparative analysis to distinguish organ-associated microbiota from contaminants

Versions

Version | DOI               | Comment                                                                    | Publication Date
2       | 10.7924/r4bk1j35s | This dataset has been updated with an improved decontamination algorithm. | 2022-08-04
1       | 10.7924/r4rn36833 |                                                                            | 2020-09-29