Mice
C57BL/6 mice, maintained under specific-pathogen-free or germ-free (GF) conditions, were purchased from Sankyo Laboratories Japan, SLC Japan, or CLEA Japan. GF and gnotobiotic mice were bred and maintained within the gnotobiotic facility of Keio University School of Medicine or the JSR-Keio University Medical and Chemical Innovation Center. Il10-/- and Ifngr1-/- mice were purchased from Jackson Laboratories. Myd88-/-Trif-/- and Rag2-/-gc-/- mice were purchased from Oriental Bio Service, Japan. All animal experiments were approved by the Keio University Institutional Animal Care and Use Committee.
Human faecal samples and isolation of bacterial strains
Human faecal samples were obtained from healthy human donors, patients with ulcerative colitis (UC), and patients with Crohn's disease (CD) following the protocol approved by the Institutional Review Board of Keio University School of Medicine (approval numbers #20150075, #20140211, and #20150075). Informed consent was obtained from each individual. Faecal samples were mixed with PBS (containing 20% glycerol) and stored at –80°C. An aliquot of each sample was diluted with PBS in an anaerobic chamber (80% N2, 10% H2, and 10% CO2; Coy Laboratory Products) and plated onto different agar plates (EG, mGAM, BHK, CM0151, MRS, or BL). After incubating for 2–7 days, colonies with different appearances were transferred to liquid media (EG, mGAM, HK, or CM0149), incubated for 24–48 hours, mixed with glycerol [final concentration 20% (v/v)], and stored at –80°C. Bacterial genomic DNA was extracted from the isolated strains using the same protocol as DNA isolation from faecal samples (below). The 16S rRNA was amplified by PCR using the KOD plus Neo kit (TOYOBO) according to the manufacturer’s protocol. DNA sequencing was performed by Eurofins. Sequences were aligned using the BLAST program of NCBI and the Ribosomal Database Project (RDP) databases. Primers used for DNA sequencing were as follows: F27 primer: 5’-AGRGTTTGATYMTGGCTCAG-3’; R1492 primer: 5’-TACGGYTACCTTGTTACGACTT-3’. Individual isolates in the culture collection were grouped as “strains” if their 16S rRNA gene sequences shared >98.0% homology.
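The grouping rule in the last sentence above (isolates sharing >98.0% 16S rRNA gene sequence identity are treated as one strain) can be illustrated with a minimal sketch. It assumes the sequences have already been aligned to equal length and that isolates are merged by single linkage; both are assumptions made for illustration, and the function names are hypothetical rather than part of the pipeline described here.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Columns where both sequences carry a gap are ignored (assumption).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def group_isolates(aligned: dict, threshold: float = 98.0) -> list:
    """Single-linkage grouping: isolates sharing > threshold % identity are merged."""
    parent = {name: name for name in aligned}

    def find(name):
        # Union-find lookup with path compression.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(aligned, 2):
        if percent_identity(aligned[a], aligned[b]) > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in aligned:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())
```

Single linkage is only one possible choice; a centroid-based or complete-linkage rule could group borderline isolates differently.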
To prepare the bacterial mixture for inoculation, isolated strains were individually cultured in the appropriate broth at 37°C for 1–2 days (mGAM broth was used for culturing the F18 strains). Bacterial density was adjusted based on absorbance at 600 nm, and equal volumes of the cultured strains were mixed and centrifuged at 3000 × g for 10 min at 4°C to concentrate fivefold. Thereafter, GF mice were administered 200 µL of the bacterial mixture per mouse (approximately 1–2 × 10⁹ CFU of total bacteria) by oral gavage.
Colonization of mice with pathogenic bacterial strains
C57BL/6 GF mice (8–14 weeks of age, housed in separate GF isolators) were inoculated with Klebsiella pneumoniae 2H7 (Kp-2H7), carbapenem-resistant Klebsiella pneumoniae (CPM+ Kp, ATCC BAA1705), Klebsiella aerogenes (strain Ka-11E12, ref. 17), extended-spectrum-β-lactamase-producing E. coli (ESBL+ E. coli, ATCC BAA2777), adherent-invasive E. coli (AIEC, strain LF82, provided by Nicolas Barnich; ref. 22), Pseudomonas aeruginosa (ATCC 10145), vancomycin-resistant Enterococcus faecium (VRE Ef, ATCC 700221), Campylobacter upsaliensis (ATCC BAA1059), or Clostridioides difficile (strain 630, ATCC BAA1382) by oral gavage (2 × 10⁸ CFU/mouse). Seven days after colonization with pathogenic microbes, the mice were administered 200 µL of human faecal suspension or 200 µL of isolated bacterial strain mix (total 10⁹ CFU) by oral gavage. Faecal samples were collected from mice every two or three days, suspended in PBS (containing 20% glycerol), and cultured on selective media [DHL agar with 30 mg/L ampicillin and 30 mg/L spectinomycin for Kp-2H7, CRE Kp, Ka-11E12, and P. aeruginosa, MacConkey agar with 1 mg/L cefotaxime, and VRE-selective agar plates (BD #251832) for VRE]. After 24–48 hours of incubation, the CFUs were counted. In cases where evaluation was not possible by counting CFUs, bacterial DNA extracted from faeces was evaluated by quantitative real-time PCR (qPCR). Unless otherwise stated, mice were fed a high-calorie diet (CL-2; CLEA Japan, Inc.). To evaluate the effect of dietary gluconate supplementation, a chemically defined diet (AIN93G; Oriental Yeast Co., Ltd) supplemented with 0%, 2.5% or 10% gluconate was used. To examine the effectiveness of each of the 18 strains, C57BL/6 GF mice were inoculated with Kp-2H7 (2 × 10⁸ CFU/mouse) by oral gavage, followed by oral administration of each strain of the F18-mix one by one every five days for 95 days. Faecal samples were collected every five days to count the CFU of Kp-2H7 as well as to quantify the levels of gluconate.
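For readers unfamiliar with plate counting, the conversion from a colony count to faecal bacterial load can be sketched as follows. This is the generic dilution-plating arithmetic, not a calculation stated in the text: the function name, the plated volume, the dilution factor, and the suspension density used in the example are all hypothetical values chosen for illustration.

```python
def cfu_per_gram(colonies: int,
                 dilution_factor: float,
                 plated_volume_ml: float,
                 faeces_mg_per_ml: float) -> float:
    """Convert a plate count into CFU per gram of faeces.

    colonies          -- colonies counted on the selective plate
    dilution_factor   -- total fold dilution of the faecal suspension (e.g. 1e4)
    plated_volume_ml  -- volume of the diluted suspension spread on the plate
    faeces_mg_per_ml  -- density of the undiluted faecal suspension
    """
    if plated_volume_ml <= 0 or faeces_mg_per_ml <= 0:
        raise ValueError("volume and suspension density must be positive")
    cfu_per_ml = colonies * dilution_factor / plated_volume_ml
    grams_per_ml = faeces_mg_per_ml / 1000.0
    return cfu_per_ml / grams_per_ml


# Hypothetical example: 42 colonies from 0.1 mL of a 10^4-fold dilution
# of a 50 mg/mL suspension.
example_load = cfu_per_gram(42, 1e4, 0.1, 50.0)
```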
Bacterial DNA extraction, quantitative real-time PCR, and 16S rRNA gene sequencing
The frozen faecal samples were thawed and 50 µL of each sample was mixed with 350 µL TE10 (10 mM Tris-HCl, 10 mM EDTA) buffer containing RNase A (final concentration 100 µg/mL, Invitrogen) and lysozyme (final concentration 3.0 mg/mL, Sigma). The suspension was incubated for one hour at 37°C with gentle mixing. Purified achromopeptidase (Wako) was added to a final concentration of 2,000 unit/mL, and the sample was further incubated for 30 min at 37°C. Then, sodium dodecyl sulphate (final concentration 1%) and proteinase K (final concentration 1 mg/mL, Nacalai) were added to the suspension and the mixture was incubated for one hour at 55°C. Thereafter, purified DNA was obtained from the samples using the Maxwell®︎ RSC cultured cell DNA kit, according to the manufacturer’s protocol. For quantifying the amount of bacterial DNA, real-time qPCR was performed using the Thunderbird SYBR qPCR Mix (TOYOBO) and LightCycler 480 (Roche). The primer pairs used in this study are listed in Table S4.
16S rRNA gene sequencing was performed using MiSeq according to the Illumina protocol. PCR was performed using primers 27Fmod (5’-AGRGTTTGATYMTGGCTCAG-3’) and 338R (5’-TGCTGCCTCCCGTAGGAGT-3’) to amplify the V1–V2 region of the 16S rRNA gene. Amplicons (approximately 330 bp in size) generated from each sample were purified using AMPure XP magnetic beads (Beckman Coulter). DNA was quantified using the Quant-iT Picogreen dsDNA assay kit (Invitrogen) and Infinite M Plex plate reader (Tecan), according to the manufacturer’s instructions, and then stored at 4°C. The pooled amplicon library was sequenced using the MiSeq Reagent Kit v2 (500 cycles) and MiSeq sequencer (Illumina; 2 × 250-bp paired-end reads). After demultiplexing the 16S sequence reads based on the sample-specific index, primer sequences were trimmed using Cutadapt v. 3.3 (ref. 46). The trimmed reads were processed with the DADA2 R package v.4.0.3 (ref. 47) to construct amplicon sequence variants (ASVs) using the filterAndTrim function with the following parameters: maxN = 0, truncQ = 2, maxEE = 2, and truncLen = c(200, 180). Possible chimeric reads were removed with the removeBimeraDenovo function of DADA2. The taxonomic assignment of each ASV was determined by similarity searching using the GLSEARCH program. For determining the taxonomy of ASV sequences that originated from human faecal samples, 16S RefSeq from NCBI, RDP (ref. 48), CORE (ref. 49), and GRD (https://metasystems.riken.jp/grd/) were used as the reference databases. The sequences of isolates were compared to the ASVs detected in the faecal microbiome of donors F, I, and K, and those matching >99% were determined to be their corresponding ASVs.
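To make the read-filtering parameters above concrete, the following sketch re-implements their documented meaning in Python for a single read: truncation at the first base with quality at or below truncQ, removal of reads shorter than truncLen, rejection of reads with more than maxN ambiguous bases, and rejection of reads whose expected errors, EE = Σ 10^(−Q/10), exceed maxEE. It is a simplified illustration and not the DADA2 code itself; the function names are hypothetical, and paired-end handling (a read pair is kept only if both mates pass) is omitted.

```python
def expected_errors(quals):
    """Expected number of errors in a read: sum of 10^(-Q/10) over Phred scores."""
    return sum(10 ** (-q / 10) for q in quals)


def filter_read(seq, quals, trunc_len, trunc_q=2, max_ee=2.0, max_n=0):
    """Return the truncated (seq, quals) if the read passes, otherwise None."""
    # truncQ: cut the read at the first base with quality <= trunc_q.
    for i, q in enumerate(quals):
        if q <= trunc_q:
            seq, quals = seq[:i], quals[:i]
            break
    # truncLen: reads shorter than trunc_len are discarded, longer ones are cut.
    if len(seq) < trunc_len:
        return None
    seq, quals = seq[:trunc_len], quals[:trunc_len]
    # maxN: discard reads with more than max_n ambiguous bases.
    if seq.upper().count("N") > max_n:
        return None
    # maxEE: discard reads whose expected errors exceed max_ee.
    if expected_errors(quals) > max_ee:
        return None
    return seq, quals
```

With truncLen = c(200, 180), the forward read would be checked with trunc_len=200 and the reverse read with trunc_len=180.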
Bacterial whole-genome sequencing
The Illumina MiSeq and PacBio Sequel platforms were used for bacterial whole-genome sequencing. For Illumina sequencing, the library was prepared using the TruSeq DNA PCR-free library prep kit (Illumina), with a target insert size of 550 bp. All the Illumina reads were trimmed and filtered using the FASTX-toolkit (version 0.0.13). For the PacBio sequencing, the library was prepared using the SMRTbell template prep kit 1.0. Sequence data for both types of sequencing were assembled using the hybrid assembler Unicycler. Taxonomic assignment of the genomes was determined by classify_wf of GTDB-tk (ref. 50) version 2.3.0 with GTDB (ref. 51) database R214. The NCBI taxonomy of the fastANI reference genome related to the genome of each strain was retrieved using NCBI-genome-download version 0.3.3 (ncbi-genome-download; DOI: 10.5281/zenodo.8192432) and rankedlineage.dmp from the NCBI taxonomy database (ref. 52; downloaded on 14/09/2023). The genes were predicted using Prokka version 1.14.0 with “--kingdom Bacteria --rnammer” options, and rnammer version 1.2. The homology search for the predicted genes was performed using diamond (ref. 53) version 2.0.15 with “blastp --evalue 0.00001 --id 30 --query-cover 60 --ultra-sensitive” options, against the KEGG (downloaded on 19/04/2022; ref. 54), COG (downloaded on 19/05/2021; ref. 55), VFDB (downloaded on 10/09/2022; ref. 56), and UniRef90 (downloaded on 24/05/2022; https://www.uniprot.org/help/uniref) databases. For the homology search against KEGG, a database was manually constructed from protein sequences with KEGG Orthology (K number) assignments, which were extracted from KEGG non-redundant datasets at the species level. We additionally performed a homology search for gluconate metabolism genes in our isolated strains with “blastp --evalue 0.00001 --id 20 --query-cover 60 --ultra-sensitive” options. The sequences of gluconate kinase (gntK, MKMCEHOJ_02531) and gluconate transporters (MKMCEHOJ_02530 and MKMCEHOJ_02505) from the f37_E. coli strain, and gluconate dehydratase (EAOGLLOI_00767), gluconate transporters (EAOGLLOI_00766 and EAOGLLOI_00912), 2-dehydro-3-deoxygluconokinase (kdgK, EAOGLLOI_00768), and 2-dehydro-3-deoxyphosphogluconate aldolase (eda, EAOGLLOI_00769) from the f17_Blautia caecimuris strain were used as reference sequences.
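The three thresholds passed to the homology search (e-value ≤ 0.00001, identity ≥ 30% or ≥ 20%, query coverage ≥ 60%) can be restated as an explicit filter over alignment records. The sketch below assumes each hit is a dictionary with the keys pident, evalue, qstart, qend, and qlen; this record layout and the function names are hypothetical, and in practice the aligner applies these cut-offs itself.

```python
def query_coverage(qstart: int, qend: int, qlen: int) -> float:
    """Percentage of the query sequence covered by the aligned region."""
    return 100.0 * (abs(qend - qstart) + 1) / qlen


def keep_hit(hit: dict,
             max_evalue: float = 1e-5,
             min_identity: float = 30.0,
             min_query_cover: float = 60.0) -> bool:
    """True if the hit satisfies the e-value, identity, and coverage cut-offs."""
    coverage = query_coverage(hit["qstart"], hit["qend"], hit["qlen"])
    return (hit["evalue"] <= max_evalue
            and hit["pident"] >= min_identity
            and coverage >= min_query_cover)


def filter_hits(hits, **thresholds):
    """Keep only the hits passing keep_hit with the given thresholds."""
    return [h for h in hits if keep_hit(h, **thresholds)]
```

Calling filter_hits(hits, min_identity=20.0) corresponds to the relaxed identity threshold used for the gluconate metabolism genes.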
Ex vivo caecal suspension culture
Caecal contents from GF or F31-, F18-, and F13-mix colonized mice were anaerobically resuspended in water at a concentration of 100 mg/mL. Caecal contents were either filtered through a 0.22 µm filter (Millex Millipore) after centrifuging at 10,000 × g for 5 min, heat-killed at 105°C for 30 min, or left untreated. Thereafter, a diluted overnight culture of Kp-2H7 (10³ CFU in 10 µL) was added to 200 µL of each caecal suspension. After incubating at 37°C for 48 hours under aerobic or anaerobic conditions, samples were serially diluted and plated on a selective agar plate (DHL with 30 mg/L ampicillin and 30 mg/L spectinomycin) for counting Kp-2H7 CFU.
Bacterial growth monitoring
The wild-type, ΔgntK, or ΔgntR Kp-2H7 strain was cultured in M9 minimal medium for 24 hours at 37°C, and the culture was then diluted 100-fold with sterile water. A 10 µL aliquot of the diluted culture was inoculated into 200 µL of M9 medium containing 0.4% glucose or gluconate as the sole carbon source, or no carbon source. To examine the effect of metabolites on Kp-2H7 growth, 10 µL of the diluted Kp-2H7 culture was inoculated into 200 µL of M9 medium containing varying concentrations of 4-hydroxybenzoic acid (4-HBA; 100, 10, 1, or 0.1 mM), cholic acid (500, 100, 20, or 4 µM), and acetate or butyrate (100, 25, 6.25, 1.56 or 0.39 mM). The pH of acetate and butyrate was adjusted to either 5.0 or 7.0. Bacterial growth was monitored by measuring absorbance at 600 nm every 30 minutes using a microplate reader [Sunrise Thermo (Tecan) for anaerobic conditions and Infinite 200 PRO (Tecan) for aerobic conditions] at 37°C, with 100 seconds of shaking before each time point.
Transcriptome analysis of epithelial cells
Total RNA was isolated from colonic epithelial cells using NucleoSpin RNA (MACHEREY-NAGEL), according to the manufacturer’s instructions. Libraries for RNA sequencing were prepared using TruSeq Stranded mRNA Library Prep (Illumina Inc.), according to the manufacturer’s instructions. The libraries were sequenced using NovaSeq 6000 (Illumina Inc.) in 150-bp paired-end mode. The sequenced paired-end reads were quality-controlled using Trimmomatic (ref. 57) version 0.39 with “2:30:10 LEADING:3 TRAILING:20 SLIDINGWINDOW:4:15 MINLEN:5” options and FASTX-Toolkit version 0.0.13 (http://hannonlab.cshl.edu/fastx_toolkit/index.html) with “-q 20 -p 80” options. Unpaired reads were excluded from further analyses. The remaining quality-controlled reads were mapped to the mouse reference genome (mm10) using STAR (ref. 58) version 2.7.2b. The mapped reads were counted for each gene using featureCounts (ref. 59) version 1.5.2 with “-t exon -p -B -Q 1” options. The transcripts per million (TPM) values of each gene in each sample were calculated. The differential expression analysis was performed using DESeq2 (ref. 60) version 1.28.1, and the p-values were corrected by the Benjamini-Hochberg (BH) method to maintain the false discovery rate (FDR) below 5%.
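The TPM calculation mentioned above follows the standard definition: each gene's read count is divided by its length, and the resulting rates are rescaled so that they sum to one million within a sample. A minimal sketch is given below; the function name and the input layout (dictionaries keyed by gene identifier, lengths in base pairs) are assumptions made for illustration.

```python
def tpm(counts: dict, lengths_bp: dict) -> dict:
    """Transcripts per million from raw read counts and gene lengths.

    counts      -- gene id -> number of reads assigned to the gene
    lengths_bp  -- gene id -> gene (exon) length in base pairs
    """
    # Length-normalized rate per gene (reads per base).
    rates = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    total = sum(rates.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    # Rescale so that the values of one sample sum to 1e6.
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}
```

Because the values are rescaled within each sample, TPM is a relative measure: a gene's TPM can change when other genes change, even if its own count does not.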
Transcriptome analysis of Kp-2H7
Total RNA was isolated from faecal samples using NucleoSpin RNA (MACHEREY-NAGEL), according to the manufacturer’s instructions. Libraries for RNA sequencing were prepared using TruSeq Stranded mRNA Library Prep (Illumina Inc.) and sequenced using HiSeq X (Illumina Inc.) in 150-bp paired-end mode. To analyse the transcriptome profiles of Kp-2H7 in the presence or absence of F18-mix, a reference genome was created by concatenating the genome sequence of Kp-2H7 with the genome sequences of the F18-mix. The sequenced paired-end reads were quality-controlled using Trimmomatic (ref. 57) version 0.39 with “2:30:10 LEADING:3 TRAILING:20 SLIDINGWINDOW:4:15 MINLEN:5” options and FASTX-Toolkit version 0.0.13 (http://hannonlab.cshl.edu/fastx_toolkit/index.html) with “-q 20 -p 80” options. Unpaired reads were excluded from further analyses. The remaining reads were mapped to the mouse (mm10) and PhiX reference genomes using minimap2 (ref. 61) version 2.17-r941 with “-N 1 -a” options. Then, the reads unmapped to the mouse genome were extracted to obtain quality-controlled reads for subsequent analyses. The quality-controlled reads were mapped to the concatenated reference genome using bowtie2 (ref. 62) version 2.3.4.1. Uniquely mapped reads were counted for each Kp-2H7 gene. The differential expression analysis was performed using DESeq2 (ref. 60) version 1.28.1 with the BH correction method to maintain the FDR below 5%. The heatmap was generated from the variance-stabilizing transformation values obtained from the DESeq2 output.
For real-time qPCR analysis, cDNA was synthesized using ReverTra Ace qPCR RT Master Mix (TOYOBO), and qPCR was performed using Thunderbird SYBR qPCR Mix (TOYOBO) on a LightCycler 480 (Roche).
Construction of transposon mutant library
A transposon insertion library of Kp-2H7 was constructed using the EZ-Tn5™ <KAN-2> Tnp Transposome™ kit (Lucigen Corp, USA). Briefly, 80 µL (10⁹ CFU) of Kp-2H7 suspension was mixed with 0.5 µL of EZ-Tn5™ <KAN-2>, transferred to a 1-mm gap width electroporation cuvette, and subjected to electroporation using ELEPO21 (Nepa Gene Co. Ltd., Japan) with the following parameters: poring pulse; voltage: 1800 V, pulse length: 5.0 msec, pulse interval: 50 msec, number of pulses: 1, and polarity: +, and transfer pulse; voltage: 150 V, pulse length: 50 msec, pulse interval: 50 msec, number of pulses: 5, and polarity: ±. Transformed Kp-2H7 cells were incubated in 1 mL of LB broth for three hours at 37°C, and then selected on LB agar plates containing kanamycin (90 mg/L) at 37°C. Thereafter, approximately 8 × 10⁵ transposon mutant colonies were collected and stored at –80°C in LB containing 20% glycerol.
Transposon sequencing
GF mice were colonized with the pool of 8 × 10⁵ Kp-2H7 transposon mutants. Faecal samples were collected on days 0, 4, 10, and 28 following colonization, suspended in PBS (50 mg/mL) containing 20% glycerol, and cultured overnight at 37°C on LB agar plates containing kanamycin (90 mg/L). Kp-2H7 mutant colonies were scraped together and DNA was extracted by the method described above. Transposon sequencing was carried out according to the method described by Kazi et al. (ref. 63). Briefly, genomic DNA was fragmented via sonication. Then, a poly-C tail was added to the 3' end of the DNA fragments by terminal deoxynucleotidyl transferase. The transposon junctions were amplified using a biotinylated primer and then enriched using streptavidin beads. By performing a second nested PCR, a single barcode was added to each sample. The libraries were sequenced using HiSeq 2500 (Illumina Inc.) in 50-bp single-end mode. The sequenced reads were quality-controlled using Trimmomatic (ref. 57) version 0.39 with “2:30:10 LEADING:3 TRAILING:20 SLIDINGWINDOW:4:15 MINLEN:5” options and FASTX-Toolkit version 0.0.13 (http://hannonlab.cshl.edu/fastx_toolkit/index.html) with “-q 20 -p 80” options. Unpaired reads were excluded from further analyses. The remaining reads were mapped to the mouse reference genome (mm10) using minimap2 (ref. 61) version 2.17-r941 with “-N 1 -a -x sr” options. Then, the reads unmapped to the mouse genome were extracted to obtain quality-controlled reads for subsequent analyses. The quality-controlled reads were mapped to the Kp-2H7 assembled genome using bowtie2 version 2.4.2. The mapped reads were counted for each gene using featureCounts3 version 1.5.2 with “-t CDS -p -B -Q 1” options, and the TPM of each gene was calculated as the relative abundance of a gene mutant in a sample by assuming that each transposon mutant has a single insertion.
Differentially abundant mutants were detected by Welch’s t-test on log-scaled TPM values, with BH correction to maintain the FDR below 5%.
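The two statistical steps named above can be sketched with the standard library alone: Welch's t statistic with its Welch-Satterthwaite degrees of freedom, and the Benjamini-Hochberg adjustment of a list of p-values. Converting the t statistic into a p-value requires the Student t distribution (available in statistical packages) and is left out here; the function names are hypothetical and the sketch is illustrative rather than the analysis code used in this study.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def benjamini_hochberg(pvalues):
    """BH-adjusted p-values, returned in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Genes whose adjusted p-value falls below 0.05 would then be reported, which corresponds to controlling the FDR at 5%.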
Generation of Kp-2H7 mutants
The Kp-2H7 deletion mutants were generated using the Quick and Easy E. coli Gene Deletion Kit (Gene Bridges, Heidelberg) according to the manufacturer’s protocol. Briefly, Kp-2H7 cells were transformed with the pRED/ET plasmid harbouring the tetracycline-resistance gene by electroporation. Bacteria with pRED/ET were selected on LB plates containing tetracycline (30 mg/L) at 30°C. Thereafter, these cells were incubated in LB broth with appropriate antibiotics at 30°C until absorbance at 600 nm reached 0.2, followed by an additional hour of incubation with 0.3% L-arabinose at 37°C to induce the expression of the recombinant proteins. These cells were used to prepare electrocompetent cells and were transformed with the linear DNA fragment (the FRT-PGK-gb2-neo-FRT cassette) flanked by homology arms. The functional cassettes were generated by PCR, according to the manufacturer’s protocol. The primers with homology arms are listed in Table S4. The electroporated cells were incubated in 1 mL of LB broth for three hours at 37°C. Gene deletion strains were selected on LB agar plates with kanamycin (90 mg/L) after overnight growth at 37°C. The double or triple knockout strains were generated by removing the kanamycin selection marker through electroporation of the FLP expression plasmid (707-FLPe) and repeating the above-mentioned protocol. The deletions were confirmed by DNA sequencing.
Isolation of lymphocytes and flow cytometry
Lymphocytes were collected from the large intestines and analysed according to previously described protocols17,64. Briefly, the intestines were dissected longitudinally and washed with PBS to remove all luminal contents. All samples were incubated in 15 mL of Hanks’ balanced salt solution (HBSS) containing 5 mM EDTA for 20 min at 37°C in a shaking water bath to remove epithelial cells. Thereafter, after removal of any remaining epithelial cells, muscular layers and fat tissues using forceps, the samples were cut into small pieces and incubated in 10 mL of RPMI1640 containing 4% foetal bovine serum (FBS), 0.5 mg/mL collagenase D (Roche Diagnostics), 0.5 mg/mL dispase II (Roche Diagnostics), and 40 μg/mL DNase I (Roche Diagnostics) for 50 min at 37°C in a shaking water bath. Thereafter, the resultant digested tissues were washed with 10 mL of HBSS containing 5 mM EDTA, resuspended in 5 mL of 40% Percoll (GE Healthcare), and underlaid with 2.5 mL of 80% Percoll in a 15-mL Falcon tube. Percoll gradient separation was performed by centrifugation at 850 × g for 25 min at 25°C. Lymphocytes were collected from the interface of the Percoll gradient and washed with RPMI1640 containing 10% FBS, and then stimulated with 50 ng/mL PMA and 750 ng/mL ionomycin (both from Sigma) in the presence of Golgistop (BD Biosciences) at 37°C for four hours. After labelling of the dead cells with Ghost Dye Red 780 Viability Dye (Cell Signaling Technology), the cells were permeabilized and stained with anti-CD3e (BUV395; Biolegend), CD4 (BUV737; Biolegend), CD8a (PE/Cy7; Biolegend), TCRβ (BV421; Biolegend), and IFN-γ (FITC; Biolegend) using the Foxp3/Transcription Factor Staining Buffer Kit (Tonbo Biosciences), according to the manufacturer's instructions. All data were collected on a BD LSRFortessa (BD Biosciences) and analysed using Flowjo software (TreeStar). CD4+ T cells were defined as a CD4+ TCRβ+ CD3e+ subset within the live lymphocyte gate.
Measurement of lipocalin-2 and calprotectin
The faecal pellets from Il10-/- mice were vortexed, suspended in PBS (5% w/v) with Complete Protease Inhibitor Cocktail (1 tablet dissolved in 50 mL PBS; Roche) and centrifuged, and supernatants were collected. The concentration of lipocalin-2 and calprotectin in faecal supernatants was measured by ELISA (Mouse Lipocalin-2 Matched Antibody Pair Kit; Abcam, Mouse S100A8/S100A9 Heterodimer DuoSet; R&D), according to the manufacturer’s protocol.
Histological analysis
Colon tissue samples were dissected longitudinally and Swiss-rolled, fixed with 4% paraformaldehyde, embedded in paraffin, sliced into 5-µm sections, and stained with haematoxylin and eosin. The degree of colitis was graded using the Mouse Colitis Histology Index (ref. 65). The histological slides were evaluated in a blinded manner by two investigators.
Non-targeted metabolomics analysis
C57BL/6 GF mice were monocolonized with Kp-2H7, followed by oral administration of the bacterial mix. Caecal contents were collected on day 28 after administration of the isolated bacterial mix and stored at –80℃ until use. Frozen caecal contents were homogenized by shaking with a metal cone using a multi-beads shocker, as previously described (ref. 66). Then, the samples were suspended in 400 µL of methanol per 100 mg of caecal content, and a 40 µL aliquot was subjected to single-layer extraction and untargeted LC-QTOF/MS analysis (ref. 66). SCFAs were simultaneously extracted and derivatized from 20 µL of the suspension by using pentafluorobenzyl bromide alkylation reagent (Thermo Fisher Scientific, Waltham, MA, USA), and analysed by GC-MS as previously described (ref. 67). Water-soluble metabolites were extracted by first mixing 4 µL of the suspension, 196 µL of methanol, 200 µL of chloroform, 70 µL of water, and 10 µL of internal standards mix [100 µM of cycloleucine, 500 µM of citric acid-d4, and 1.0 mM of ornithine-d7 (Cambridge Isotope Laboratories, Andover, MA, USA)]. After vortexing for 1 min and centrifugation at 15000 × g for 5 min at 4℃, 100 µL of supernatant was evaporated to dryness. The dried samples were derivatized via methoxyamination, trimethylsilylation, or tert-butyldimethylsilylation, and then analysed by GC-MS/MS using Smart Metabolite Database™ (Shimadzu Corp., Kyoto, Japan) or GC-MS operated in selected ion monitoring mode, as described previously (ref. 68). Bile acids were extracted from 4 µL of the suspension mixed with deuterium-labelled internal standard mix [1.0 µM of cholic acid-d4, 1.0 µM of lithocholic acid-d4, 1.0 µM of deoxycholic acid-d4, 1.0 µM of taurocholic acid-d4, and 1.0 µM of glycocholic acid-d4 (Cayman Chemical)] using the Monospin C18 column (GL science). The column was washed with 300 µL of water (x2) and 300 µL of hexane (x1).
Bile acids were eluted with 100 µL of methanol, then subjected to LC-MS/MS analysis using an UPLC I class (Waters) with a linear ion-trap quadrupole mass spectrometer (QTRAP 6500; AB SCIEX) equipped with an Acquity UPLC BEH C18 column (50 mm, 2.1 mm, and 1.7 μm; Waters). Samples were analysed with a mobile phase consisting of water/methanol/acetonitrile [14:3:3 (vol/vol/vol)] and acetonitrile, both containing 5 mM ammonium acetate, for 4 min, which was changed to 40:60 after 12 min, to 5:95 after 2 min, and then held for 2 min; with flow rates of 300 μL/min. Bile acids were detected by multiple-reaction monitoring in negative mode. Ions of [M-H]-, taurine (m/z = 124), and glycine (m/z = 74), generated from the precursor ion, were monitored as product ions for non-conjugated, taurine-conjugated, and glycine-conjugated bile acids, respectively. MS/MS settings were as follows: ion source, turbo spray; curtain gas, 30 psi; collision gas, 9 psi; ionspray voltage, –4500 V; source temperature, 600℃; ion source gas 1, 50 psi; and ion source gas 2, 60 psi.
Measurement of carbohydrate levels
To evaluate bacterial gluconate utilization in vitro, isolated strains were cultured in mGAM broth or RCM containing 300 µM gluconic acid for 48 hours at 37°C under anaerobic conditions. Supernatant of each culture broth was collected, and the concentration of gluconate was measured by the ExionLC AD and SCIEX Triple Quad 6500+ LC-MS/MS system. To evaluate carbon level in faeces, each faecal sample was suspended in water (50 mg/mL), and the carbon levels in the culture supernatant were measured by LC-MS/MS. The measurement conditions for gluconate, glucuronate, and galacturonate were as follows: chromatographic separation was performed using the Intrada Organic Acid column, 150 × 2 mm (Imtakt); column temperature was 40℃; and the volume of each injection was 2 μL. The mobile phase comprising A (acetonitrile/water/formic acid = 10/90/0.1) and B (acetonitrile/100 mM ammonium formate = 10/90) was used under gradient conditions (0–3 min, A 100%, B 0%; 3–10 min, A 100%, B 0%; 10–13 min, A 0%, B 100%; 13–13.1 min, A 0%, B 100%; and 13.1–18 min, A 100%, B 0%); and the flow rate was 0.2 mL/min. Detailed MS conditions were as follows: curtain gas, 30 psi; collision gas, 9; ionspray voltage, –4500 V; temperature, 400℃; ion source gas 1, 50 psi; and ion source gas 2, 80 psi. The retention times and multiple reaction monitoring (MRM) transitions are listed in Table S5. The measurement conditions for other carbons were as follows: chromatographic separation was performed using the UK-Amino column (UKA26), 250 × 2 mm (Imtakt); column temperature was 60℃; and the volume of each injection was 2 μL. The mobile phase comprising A (10 mM ammonium acetate) and B (acetonitrile) was used under gradient conditions (0–10 min, A 0%, B 100%; 10–50 min, A 0%, B 100%; 50–65 min, A 12%, B 88%; 65–70 min, A 60%, B 40%; 70–70.1 min, A 60%, B 40%; and 70.1–75 min, A 100%, B 0%); and the flow rate was 0.2 mL/min.
Detailed MS conditions were as follows: curtain gas, 25 psi; collision gas, 9; ionspray voltage, –4500 V in negative mode and 5500 V in positive mode; temperature, 250℃; ion source gas 1, 50 psi; and ion source gas 2, 70 psi. Multiple reaction monitoring parameters are listed in Table S5. Data were obtained using Analyst software version 1.7.1 and analysed using SCIEX OS-MQ software version 2.1.0.55343.
Metagenomic analysis of IBD cohorts
To systematically explore both established and novel microbial taxa possessing gluconate operon genes, gene catalogues were acquired from two distinct cohorts with IBD aetiology: the paediatric PROTECT and adult HMP2 cohorts, comprising 240 and 1638 longitudinal metagenomic samples from 94 and 91 individuals, respectively. Metagenomic Species Pangenomes (MSPs) were constructed via the co-abundant gene binning approach (MSPminer, ref. 69), followed by quality assessment (CheckM, ref. 70), as described by Schirmer et al. (PROTECT; ref. 3) and Kenny et al. (HMP2; ref. 71). A targeted screening of these bins with DIAMOND BLASTP version 0.9.14 (ref. 72) was conducted to identify genes associated with gluconate transport and metabolism, retaining hits with an e-value <0.01 and sequence identity ≥60%. MSPs were categorized based on the combinations of gluconate-related genes detected. A differential abundance analysis was performed on TPM-normalized and centred log-ratio (CLR)-transformed MSP counts to control for sequencing depth, gene length, and compositional biases. Statistical significance was assessed using a non-parametric Mann-Whitney U test with Benjamini-Hochberg correction. Effect sizes (r), calculated as the test statistic divided by the square root of the sample size, were computed along with bootstrapped confidence intervals to account for unbalanced group sizes and to indicate the robustness and direction of the observed effects.
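The CLR transformation and the effect size r described above can be sketched as follows. Several details are assumptions made for illustration and are not stated in the text: a small pseudocount is added before taking logarithms, the test statistic is taken to be the Z score from the normal approximation of the Mann-Whitney U test (without tie correction of the variance), and the sample size is the total number of samples in both groups. All function names are hypothetical.

```python
import math


def clr(values, pseudocount=1e-6):
    """Centred log-ratio transform of one sample's abundance vector."""
    logs = [math.log(v + pseudocount) for v in values]
    centre = sum(logs) / len(logs)
    return [x - centre for x in logs]


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def mann_whitney_effect_size(x, y):
    """Effect size r = Z / sqrt(N) from the normal approximation of U."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u_x = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_x - mean_u) / sd_u
    return z / math.sqrt(n1 + n2)
```

A negative r indicates that values in the first group tend to be lower than those in the second, which is how the sign conveys the direction of an effect.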
For the PROTECT cohort, comparative analyses were executed on randomly chosen cross-sectional samples from children manifesting mild UC (n = 32) or moderate to severe UC (n = 23), against inactive UC (n = 39). To validate the robustness of the findings, these analyses were iteratively repeated with varying seed values for random sample selection from longitudinal data pools of mild (n = 64), moderate/severe (n = 57), and non-IBD samples (n = 119). Within the HMP2 cohort, inclusion was also limited to cross-sectional samples accompanied by available calprotectin data. In response to the attenuated metagenome disease signal observed in the study cohort73, a targeted inflammation-specific selection approach was utilized instead of choosing the cross-sectional data via repeated random sampling. For the CD and UC sub-cohorts, the sample with maximal calprotectin value per patient was included (CD = 41, UC = 26). Conversely, for the non-IBD control group, the cross-sectional sample with the minimal calprotectin value per patient was chosen (n = 24). Statistical analyses were conducted using R software version 4.2.1 (Ubuntu 20.04.5 LTS).
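The inflammation-specific selection rule used for the HMP2 cohort (one sample per patient: the sample with the highest calprotectin value for CD and UC patients, and the one with the lowest value for non-IBD controls) can be expressed compactly. The record layout below, with the keys patient, diagnosis, and calprotectin, and the label "nonIBD" are hypothetical and serve only to illustrate the rule.

```python
def select_cross_sectional(samples):
    """Pick one sample per patient according to calprotectin.

    CD and UC patients contribute their sample with the maximal calprotectin
    value; all other patients (non-IBD controls) contribute the minimal one.
    Samples without a calprotectin value are skipped.
    """
    chosen = {}
    for sample in samples:
        if sample.get("calprotectin") is None:
            continue
        patient = sample["patient"]
        current = chosen.get(patient)
        if current is None:
            chosen[patient] = sample
            continue
        if sample["diagnosis"] in ("CD", "UC"):
            better = sample["calprotectin"] > current["calprotectin"]
        else:
            better = sample["calprotectin"] < current["calprotectin"]
        if better:
            chosen[patient] = sample
    return chosen
```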
Untargeted stool metabolomics and gluconate intensity estimation
Untargeted stool metabolomics of faecal samples from the PROTECT cohort was performed using LC-MS in negative mode, and calprotectin was measured by ELISA. Briefly, hydrophilic interaction liquid chromatography (HILIC) analyses of water-soluble metabolites in the negative ionization mode were conducted using Shimadzu Nexera X2 U-HPLC (Shimadzu Corp.) coupled to a Q Exactive Plus mass spectrometer (Thermo Fisher Scientific). Metabolites were extracted from plasma or stool (30 µL) using 120 µL of 80% methanol containing inosine-15N4, thymine-d4, and glycocholate-d4 internal standards (Cambridge Isotope Laboratories). The samples were centrifuged (10 min, 9,000 × g, 4°C), and the supernatants were injected directly onto a 150 × 2.0 mm Luna NH2 column (Phenomenex; Torrance, CA). All masses detected in HILIC negative mode were matched via adduct subtraction and molecular formula match to compounds downloaded from the Human Metabolome Database (HMDB) on 10/10/2022. The measured m/z values were adjusted for [M-H]- adducts, and molecular formulae matching to within 5 ppm were selected as candidate identifiers. In cases where multiple molecular formulae matched the adduct-adjusted mass (as a result of multiple potential adducts), the one with a minimal ppm difference was selected. Out of 4,461 detected features (m/z, retention time pairs) a single feature 195.0512 m/z @ 4.34 min resolved to the formula C6H12O7 (delta ppm = 0.89), related to a group of five compounds with canonical structure O=C(O)C(O)C(O)C(O)C(O)CO, which includes L-gluconic acid (HMDB0000625).
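The mass-matching step can be checked with a short calculation. The sketch below computes the monoisotopic mass of C6H12O7, derives the theoretical m/z of the [M-H]- ion by subtracting one proton mass, and expresses the deviation of the observed feature (195.0512 m/z) in ppm. The atomic and proton masses are standard reference values entered by hand, and the function names are hypothetical; this illustrates the arithmetic and is not the matching software used in the study.

```python
# Monoisotopic masses (Da) of the most abundant isotopes, and the proton mass.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956}
PROTON_MASS = 1.007276466812


def neutral_mass(formula: dict) -> float:
    """Monoisotopic mass of a neutral molecule given element counts."""
    return sum(MONOISOTOPIC[element] * n for element, n in formula.items())


def mz_deprotonated(formula: dict) -> float:
    """Theoretical m/z of the singly charged [M-H]- ion."""
    return neutral_mass(formula) - PROTON_MASS


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Mass deviation in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


gluconate = {"C": 6, "H": 12, "O": 7}
delta_ppm = ppm_error(195.0512, mz_deprotonated(gluconate))
```

Under these reference masses the deviation comes out at roughly 0.89 ppm, in line with the delta reported above and well within the 5 ppm tolerance.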
Statistical analyses
Statistical analyses were performed using GraphPad Prism software (GraphPad Software, Inc.). Kruskal-Wallis test and FDR method of Benjamini and Hochberg were used for multiple comparisons during CFU comparisons. Mann-Whitney U test with Welch’s correction was used for comparisons between the two groups. Spearman’s rank correlation was used to investigate the correlation between the relative abundance of Kp-2H7 and isolated strains.