Introduction

The human gut microbiota is currently one of the most active research fields in microbiology1. This microbiota harbours a huge diversity of bacteria, a large part of which is still unknown2. Researchers have used a variety of strategies to speed up and simplify the description of new bacterial species by optimizing their in vitro growth conditions3,4,5. Culturomics is one of these strategies: it relies on the diversification of culture conditions and has enabled the identification of several new bacterial species isolated from the human gastrointestinal tract3,4,5. Since 2012, it has allowed the isolation of over 1000 different human-associated bacterial species, including several hundred new species3,4,5,6. This method has highlighted the need to bring taxonomic approaches into clinical microbiology through modern and reproducible tools, such as high-throughput genomic and proteomic analyses.

On November 30, 2015, two putative new bacterial species, Culturomica massiliensis gen. nov., sp. nov., and Emergencia timonensis gen. nov., sp. nov., were isolated from patients' stool samples and partially described7, 8. Genome sequencing of newly described bacterial species is now a necessary step for their comparative taxogenomic description against their closest known relatives. Several recent publications have used genomic descriptions to characterize new species by comparison with their closest related strains9,10,11. The aim of the present study was to complete the phenotypic, taxonomic and genomic characterization of Culturomica massiliensis gen. nov., sp. nov., strain Marseille-P2698T, and Emergencia timonensis gen. nov., sp. nov., strain Marseille-P2260T, and to formally propose the creation of both genera and species.

Materials and methods

Sample collection and ethics approval

On November 30, 2015, stool samples were collected from patients hospitalized at the Timone Hospital (Marseille, France) as part of a study of human microbiota diversity. Patients provided signed informed consent7, 8. The study protocol was approved by the ethics committee of the Institut de Recherche Fédératif 48, under agreement number 09-022. In addition, all methods were performed in accordance with the relevant guidelines and regulations. Each sample was then cultured according to the culturomics method previously established in our laboratory3, 5. Various types of bacterial colonies were isolated on 5% sheep blood-enriched Columbia agar (bioMérieux®, Marcy l’Etoile, France). Bacterial colonies were then screened for identification using a Matrix-Assisted Laser Desorption Ionization-Time Of Flight Mass Spectrometry (MALDI-TOF MS) instrument (Bruker Daltonics®, Bremen, Germany) as previously reported12. Both strains studied herein had a MALDI-TOF MS score lower than 2.0, which did not allow their reliable identification. Their spectra were then added to the local MALDI-TOF MS database (https://www.mediterranee-infection.com/urms-data-base).

16S rRNA gene sequencing and identification

The 16S rRNA gene sequences of both strains were extracted directly from their whole-genome sequences and then compared against the non-redundant (nr) databases using the Basic Local Alignment Search Tool for nucleotides (BLASTn)13. The resulting sequence similarity percentages allowed identification of the closest species to each strain and indicated whether the strain was likely to represent a new species (similarity < 98.65%). A phylogenetic tree was then constructed from these 16S rRNA gene sequences and those of the closest related species of each studied strain. Sequences of the designated species were downloaded from nr14 and aligned with ClustalW. Phylogenetic trees were constructed using MEGA 11 version 11.0.10 with the maximum likelihood method and 1000 bootstrap replications15.
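The similarity screening described above can be illustrated with a minimal Python sketch. It is not the BLASTn procedure used in the study: the function names are hypothetical, and identity is computed here only over alignment columns where neither sequence has a gap, which is a simplifying assumption (BLAST reports identity over the full alignment length). Only the 98.65% cutoff comes from the text.

```python
# Illustrative sketch only; function names are hypothetical and the identity
# calculation is a simplification of what BLASTn reports.

SPECIES_16S_CUTOFF = 98.65  # % 16S rRNA similarity, cutoff quoted in the text


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are ignored
        compared += 1
        matches += a == b
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


def is_putative_new_species(similarity_pct: float,
                            cutoff: float = SPECIES_16S_CUTOFF) -> bool:
    """True when the 16S similarity falls below the species cutoff."""
    return similarity_pct < cutoff


if __name__ == "__main__":
    # Similarity values reported in the Results for the two strains.
    for strain, similarity in [("Marseille-P2698T", 91.5),
                               ("Marseille-P2260T", 92.72)]:
        print(strain, is_putative_new_species(similarity))
```

A 16S similarity below the cutoff is only a first indication; the genome-based indices described later in the Methods are needed to support the proposal of a new species.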

Phenotypic and biochemical characterizations

Optimal culture conditions were determined by testing various incubation temperatures (25, 28, 37, 42, and 50 °C), atmospheres (aerobic, anaerobic and microaerophilic), NaCl concentrations (5, 5.5, 7.5, 10, 15, and 20% of NaCl) and pH levels (5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5). Morphology and motility were observed using a new-generation scanning electron microscope (Hitachi High-Technologies Corporation, Tokyo, Japan).

Furthermore, three semi-quantitative standardized micro-methods of Analytical Profile Index (API®, bioMérieux®) tests: API® 20A, API® 50 CH, and API® ZYM were used, according to the manufacturer’s instructions16, in order to study carbohydrate metabolism and enzymatic activities.

Fatty acid methyl ester (FAME) analysis was performed by gas chromatography/mass spectrometry, as previously reported17, 18. FAMEs were separated using an Elite 5-MS column and monitored by mass spectrometry (Clarus 500—SQ 8 S, Perkin Elmer®, Courtaboeuf, France). The spectra obtained were compared with those in reference databases using MS Search 2.0 operated with the Standard Reference Database 1A (National Institute of Standards and Technology-NIST, Gaithersburg, USA), and the FAMEs mass spectral database (Wiley, Chichester, UK).

Whole genomic sequencing and bioinformatic analyses

First, bacterial DNA was extracted using the EZ1 DNeasy Blood Tissue Kit (Qiagen® GmbH, Hilden, Germany) in line with the manufacturer’s protocol19. Whole-genome sequencing was performed using an Illumina® MiSeq sequencer (Illumina®, San Diego, CA, USA)20. The sequenced genomes were then assembled using SPAdes 3.5.0 software21, which reduces the number of short indels and mismatches. Contigs shorter than 700 bp were removed. Finally, the quality of each sequenced genome was checked using BLAST against the nr/nt database. This allowed us to better explore the relationship between the assembly of each new species submitted to the International Nucleotide Sequence Database Collaboration (INSDC), i.e., DDBJ, ENA, or GenBank, and the assemblies represented in the NCBI reference sequence (RefSeq) project. The global statistics section reported general information including gaps between scaffolds, number of scaffolds, number of contigs, total sequence length, and total ungapped length. Furthermore, taxonomic data were checked against the best-matching type strain in the NCBI repertory of declared new species22.
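The contig-length filter and the summary statistics mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the function names and the toy FASTA records are hypothetical, and it assumes that "less than 700 bp" means contigs of exactly 700 bp are kept. G+C content is computed over unambiguous bases only.

```python
# Illustrative sketch only; names and input data are hypothetical.
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {contig_name: sequence} dict."""
    chunks: Dict[str, list] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else f"contig_{len(chunks) + 1}"
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line.upper())
    return {n: "".join(parts) for n, parts in chunks.items()}


def filter_contigs(contigs: Dict[str, str], min_length: int = 700) -> Dict[str, str]:
    """Keep contigs whose length is at least min_length bp."""
    return {n: s for n, s in contigs.items() if len(s) >= min_length}


def assembly_stats(contigs: Dict[str, str]) -> Dict[str, float]:
    """Number of contigs, total length (bp) and G+C content (%)."""
    total = sum(len(s) for s in contigs.values())
    gc = sum(s.count("G") + s.count("C") for s in contigs.values())
    acgt = sum(s.count(base) for s in contigs.values() for base in "ACGT")
    return {
        "contigs": len(contigs),
        "total_length": total,
        "gc_percent": 100.0 * gc / acgt if acgt else 0.0,
    }


if __name__ == "__main__":
    toy = [">c1 example", "ACGT" * 200, ">c2", "GGCC" * 100]
    print(assembly_stats(filter_contigs(parse_fasta(toy))))
```

With a real assembly, the lines of the FASTA file would be passed to `parse_fasta` instead of the toy list.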

During annotation, several genomic features were evaluated: transfer-messenger RNAs (tmRNAs) and transfer RNAs (tRNAs) were predicted using ARAGORN version 1.2, and ribosomal RNAs (rRNAs) using Barrnap version 0.923, 24. The generated protein file (.faa) was used for BLASTP analyses against the Clusters of Orthologous Genes (COGs) database and for CRISPR-Cas identification25. Resistance genes were screened using ResFinder26. Other bioinformatic tools were also used, such as antiSMASH to search for polyketide synthases (PKS) and non-ribosomal peptide synthetases (NRPS)27. Circular maps of the two genomes were generated using the CGView (Circular Genome Viewer) software, a Java application that converts XML or tab-delimited input into a vector graphics format28.

In addition, phylogenetic trees of interest were generated with the FastME 2.1.6.1 software to show the position of each new bacterial strain among its closest relatives29. Digital DNA-DNA hybridization (dDDH) values were calculated with the GGDC web server (https://ggdc.dsmz.de) to assess the divergence between genomes. A threshold of 70% was applied, below which a prokaryotic strain may be considered to belong to a new species30. Orthologous average nucleotide identity (OrthoANI) version 0.93.1 was also used to calculate genomic similarities between the studied species and their related taxa.
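The way these two genomic indices are read against their cutoffs can be summarized in a short Python sketch. The function name is hypothetical; the 70% dDDH threshold is the one stated above and the 95% OrthoANI threshold is the one applied in the Results. The sketch only compares already-computed values with the thresholds and does not reproduce the GGDC or OrthoANI calculations.

```python
# Illustrative sketch only; the function name is hypothetical.
from typing import Optional

DDDH_SPECIES_CUTOFF = 70.0      # % dDDH, threshold stated in the Methods
ORTHOANI_SPECIES_CUTOFF = 95.0  # % OrthoANI, threshold applied in the Results


def supports_distinct_species(dddh_pct: float,
                              orthoani_pct: Optional[float] = None) -> bool:
    """True when every supplied index falls below its species cutoff."""
    if dddh_pct >= DDDH_SPECIES_CUTOFF:
        return False
    if orthoani_pct is not None and orthoani_pct >= ORTHOANI_SPECIES_CUTOFF:
        return False
    return True


if __name__ == "__main__":
    # Values reported in the Results for Marseille-P2698T vs Odoribacter laneus
    # and for Marseille-P2260T vs Eubacterium sulci.
    print(supports_distinct_species(21.6, 73.99))
    print(supports_distinct_species(20.1, 67.65))
```

Requiring both indices to fall below their cutoffs is a conservative design choice for this sketch; OrthoANI is optional so that pairs with only a dDDH value can still be assessed.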

Ethics approval

The study was approved by the ethics committee of the Institut de Recherche Fédératif 48 under Authorization number 09-022 with the consent of the patients.

Results

Strain identification and phylogenetic analyses

The species names Culturomica massiliensis gen. nov., sp. nov., and Emergencia timonensis gen. nov., sp. nov., had previously been proposed for the new species represented by strains Marseille-P2698T and Marseille-P2260T, respectively7, 8. As these previous descriptions were not exhaustive, we revisited the work by including phylogenetic, morphological, and genomic data.

Strain Marseille-P2698T exhibited a 16S rRNA gene sequence similarity of 91.5% and a query coverage of 99% with Odoribacter laneus strain YIT 12061T (Fig. 1). Strain Marseille-P2260T exhibited a 16S rRNA gene sequence similarity of 92.72% and a query coverage of 100% with Eubacterium sulci strain ATCC 35585T (Fig. 1).

Figure 1

Phylogenetic tree showing the position of the new species (strains Marseille-P2698T and Marseille-P2260T) among closely related species. The tree was built from the comparison of 16S rRNA gene sequences, whose accession numbers are given in parentheses. Bootstrap values are shown at the nodes. Sequences were aligned with the MUSCLE software, and the tree was built with the MEGA-X software using the Maximum Likelihood method and the Kimura 2-parameter model.

Phenotypic and biochemical characterizations

Growth of strains Marseille-P2698T and Marseille-P2260T occurred on 5% sheep blood-enriched Columbia agar (bioMérieux®) after 48 h of incubation at 37 °C in a strict anaerobic atmosphere. Optimal growth was obtained at pH 7. Strain Marseille-P2260T did not tolerate NaCl, whereas strain Marseille-P2698T could grow at a NaCl concentration of 0.5%.

Strain Marseille-P2698T is a Gram-negative, strictly anaerobic, motile, and non-spore-forming rod, 1.5 to 3 μm in length and 0.3 to 0.4 μm in diameter. It is catalase-positive and oxidase-negative. The colonies are circular, beige, and 0.7 to 1.2 mm in diameter.

Strain Marseille-P2260T is a Gram-positive, strictly anaerobic rod, ranging in length from 1 to 1.5 μm and in diameter from 0.5 to 1 μm. It has no catalase or oxidase activity. Colonies of strain Marseille-P2260T are translucent with a diameter of 0.5 to 1 mm. The remaining cell characteristics of both strains compared to their closest relatives are summarized in Table 1.

Using API® 50CH strips (bioMérieux®), positive reactions were obtained for both studied strains for glycerol, d-ribose, d-galactose, d-glucose, d-fructose, d-mannose, d-mannitol, d-sorbitol, N-acetylglucosamine, amygdalin, arbutin, esculin ferric citrate, salicin, d-cellobiose, d-maltose, d-lactose, d-saccharose, d-trehalose, d-melezitose, gentiobiose, and d-tagatose. Using API® ZYM strips (bioMérieux®), positive activities were observed for esterase lipase (C8), leucine arylamidase, acid phosphatase, and naphthol-AS-BI-phosphohydrolase for both studied strains. In contrast, alkaline phosphatase, esterase (C4), α-chymotrypsin, β-galactosidase, α-glucosidase, β-glucosidase, and N-acetyl-β-glucosaminidase were positive only for strain Marseille-P2260T. Using API® 20A strips, positive results for the two strains were obtained for d-glucose, d-mannitol, d-lactose, d-saccharose, d-maltose, salicin, esculin ferric citrate, glycerol, d-cellobiose, d-mannose, d-melezitose, d-sorbitol, l-rhamnose, and d-trehalose. However, strain Marseille-P2698T was positive for gelatin hydrolysis, whereas strain Marseille-P2260T was negative.

Table 1 Phenotypic characteristics of strains Marseille-P2698T and Marseille-P2260T compared with closely related species.

The most abundant fatty acids of strain Marseille-P2698T were 13-methyl-tetradecanoic acid (63%), 12-methyl-tetradecanoic acid (11%), and 3-hydroxy-15-methyl-hexadecanoic acid (8%). Several other branched structures, mainly iso, were also detected. Specific 3-hydroxy structures were detected, mainly branched as well. For strain Marseille-P2260T, the major fatty acid was hexadecanoic acid (39%) (Table 2).

Table 2 Cellular fatty acids composition of strains Marseille-P2698T and Marseille-P2260T.

Genomic properties and analyses

The whole genome of strain Marseille-P2698T was composed of 14 contigs, for a total size of 4,410,591 bp, with a G+C content of 43 mol% (Fig. 2). This genome contained 3679 genes, of which 3487 were protein-coding genes. In addition, 59 RNA sequences were also identified and distributed as follows: 8 rRNAs (three 16S, two 23S, and three 5S), 51 tRNAs, and 1 tmRNA.

Figure 2

Circular genome maps of strains Marseille-P2698T (left) and Marseille-P2260T (right) generated by the CGView software. From the outside to the center: blue rings show the ORFs (Open Reading Frames) on the forward and reverse strands, green and red rings represent positive and negative GC skew, respectively, and the black ring represents the GC content plot.

The genome of strain Marseille-P2260T was composed of 9 contigs, for a total size of 4,661,482 bp, with a G+C content of 45.8 mol% (Fig. 2). Genome annotation identified 4380 genes, of which 4288 were protein-coding genes. There were 56 RNA sequences including 5 rRNAs (one 16S, one 23S, and three 5S), 51 tRNAs, and 1 tmRNA.

The closest relatives of strains Marseille-P2698T and Marseille-P2260T are presented in Table 3. Odoribacter laneusT, O. splanchnicusT, Butyricimonas synergisticaT, B. faecihominisT and B. virosaT exhibited genome sizes ranging from 3.77 to 4.81 Mbp. Eubacterium sulciT, E. infirmumT, Aminipila terraeT, E. minutumT, and Aminipila odorimutansT exhibited genome sizes ranging from 1.73 to 4.66 Mbp.

Table 3 Summary of comparative genomes and characteristics for strains Marseille-P2698T and Marseille-P2260T.

Strain Marseille-P2698T shared dDDH values of 21.6% with Odoribacter laneusT, 20.6% with O. splanchnicusT, 30.6% with Butyricimonas faecihominisT, 21.8% with B. virosaT, and 18.9% with B. synergisticaT. Strain Marseille-P2260T exhibited dDDH values of 20.1% with Eubacterium sulciT, 21.1% with E. infirmumT, 23.1% with Aminipila terraeT, 24.4% with E. minutumT, and 21.1% with Aminipila odorimutansT (Table 4).

Table 4 Comparative digital DNA-DNA Hybridization (dDDH) values (%) between studied bacterial genomes.

The phylogenetic relationships of strains Marseille-P2698T and Marseille-P2260T with related strains, based on whole-genome sequences, are shown in Fig. 3.

Figure 3

Whole-genome sequence-based phylogenetic tree of strains Marseille-P2698T and Marseille-P2260T, built with TYGS using the FastME 2.1.4 software and Genome BLAST Distance Phylogeny (GBDP) parameters. Distances were calculated from the genome sequences, and branch lengths were scaled using the GBDP distance formula d5.

CM. Culturomica massiliensis gen. nov., sp. nov., Marseille-P2698T. OL. Odoribacter laneusT. OS. Odoribacter splanchnicusT. BS. Butyricimonas synergisticaT. BF. Butyricimonas faecihominisT. BV. Butyricimonas virosaT. ET. Emergencia timonensis gen. nov., sp. nov., Marseille-P2260T. EM. Eubacterium minutumT. AT. Aminipila terraeT. AO. Anaerovorax odorimutansT. ES. Eubacterium sulciT. EI. Eubacterium infirmumT.

The obtained dDDH values were lower than the 70% threshold used for delineating prokaryotic species4. In addition, strain Marseille-P2698T exhibited OrthoANI values of 73.99% with O. laneusT, 72.22% with O. splanchnicusT, and 69.24% with B. synergisticaT. Strain Marseille-P2260T had an OrthoANI value of 67.65% with E. sulciT and 67.36% with E. infirmumT. These values were lower than 95%, also suggesting that strains Marseille-P2698T and Marseille-P2260T belonged to distinct species (Fig. 4).

Figure 4

Heatmap and phylogenetic trees showing the average nucleotide identity based on calculated orthology (OrthoANI) of Marseille-P2698T (A) and Marseille-P2260T (B) compared to their closest related bacterial species.

Protein-coding genes were assigned to functional categories, with some remaining of unknown function. The COG functional distributions of strains Marseille-P2698T and Marseille-P2260T differed from those of their closest species (Fig. 5).

Figure 5

Distribution of the functional classes of predicted genes according to the Clusters of Orthologous Genes (COGs) of proteins.

Antibiotic resistance genes and defense mechanisms

Using the ResFinder software, the antibiotic resistance genes (ARGs) erm(F) and tet(Q) were detected within the genome of strain Marseille-P2698T, with identity percentages of 100% and 99.8%, respectively. For strain Marseille-P2260T, the ARGs detected were erm(B), tet(M), and tet(O), with identity percentages ranging from 99.7% to 100% (Table 5).

Table 5 Antibiotic resistance genes detected in the genomes of strains Marseille-P2698T and Marseille-P2260T using ResFinder software.

The genomes of strains Marseille-P2698T and Marseille-P2260T contained CRISPR-Cas modules (Table 6). Strain Marseille-P2698T also contains one defense mechanism composed of polyketide synthase (PKS) and non-ribosomal peptide synthetase (NRPS) enzymes. NRPS-PKS systems have previously been shown to play a role in the biosynthesis of pharmaceutically important natural products31. Using the antiSMASH software, an NRPS-PKS-like gene cluster and a β-lactone cluster were detected in the genome of strain Marseille-P2698T; β-lactones have potent medicinal properties32, 33. In addition, according to the ToxFinder-1.0 software, which detects toxin genes34, neither of our two strains possessed toxin genes.

Table 6 Identification of CRISPR-Cas type and subtype genes in user-submitted sequence for strains Marseille-P2698T and Marseille-P2260T.

Conclusion

Culturomics combined with taxogenomics allowed the isolation and full characterization of two new bacterial species from the human intestine. Preliminary phylogenetic comparisons had previously been performed7, 8, whereas thorough phylogenetic and genomic analyses were performed in the present study. The results of the phenotypic, biochemical, phylogenetic, and genomic analyses obtained for both studied strains showed that they belong to new species. Thus, we propose the creation of two new bacterial genera and species, Culturomica massiliensis gen. nov., sp. nov., and Emergencia timonensis gen. nov., sp. nov.

Description of Culturomica gen. nov.

Culturomica (Cul.tu.ro.mi’ca. N.L. fem. n. Culturomica, referring to a new method of diversified bacterial culture). Cells are anaerobic, Gram-negative, motile and rod-shaped. The type species is Culturomica massiliensis gen. nov., sp. nov. It was isolated from human feces.

Description of Culturomica massiliensis gen. nov., sp. nov.

Culturomica massiliensis (mas.si.li.en’sis. L. fem. adj. massiliensis, from “Massilia”, the Latin name of Marseille, France, where the type strain was isolated). Bacterial cells are Gram-negative and rod-shaped. They are motile and non-spore-forming, with lengths ranging from 1.5 to 3.0 µm and diameters from 0.3 to 0.4 µm. The optimal growth conditions are 37 °C, a strict anaerobic atmosphere, and neutral pH. On 5% sheep blood-enriched Columbia agar, colonies appear small, circular, beige to white, and measure between 0.7 and 1.2 mm in diameter. They ferment d-galactose, d-glucose, d-fructose, d-mannose, inositol, d-mannitol, d-sorbitol, N-acetylglucosamine, amygdalin, arbutin, esculin, salicin, d-cellobiose, d-maltose, glycerol, d-lactose, d-saccharose, d-trehalose, d-melezitose, starch, gentiobiose, d-tagatose, potassium gluconate and potassium 5-ketogluconate. Enzymatic activities for alkaline phosphatase, esterase, esterase lipase, leucine arylamidase, α-chymotrypsin, acid phosphatase, naphthol-AS-BI-phosphohydrolase, β-galactosidase, α-glucosidase, β-glucosidase, and N-acetyl-β-glucosaminidase are present. The major cellular fatty acids are C15:0 iso (63%), C15:0 anteiso (11%), and C17:0 3-OH iso (8%). The genome of strain Marseille-P2698T is 4.41 Mbp long with a G+C content of 43 mol%.

The type strain Marseille-P2698T (CSUR P2698 = DSM 103121) was isolated from a stool specimen of a 66-year-old man with diabetes mellitus.

The genome and 16S rRNA gene sequences of strain Marseille-P2698T are deposited in the GenBank database under accession numbers FLSN00000000 and LT558805, respectively.

Description of Emergencia gen. nov.

Emergencia (e.mer.gen’cia. N.L. fem. n. Emergencia, for emergence, in reference to the discovery of emerging human bacteria).

Description of Emergencia timonensis gen. nov., sp. nov.

Emergencia timonensis (ti.mo.nen’sis. L. fem. adj., timonensis, from Timone, the name of a university hospital in Marseille, France, where the type strain was isolated).

Bacterial cells are Gram-positive and rod-shaped. They are motile and non-spore-forming, with lengths ranging from 1.0 to 1.5 µm and diameters from 0.5 to 1.0 µm. The optimal growth conditions are 37 °C, a strict anaerobic atmosphere, and pH 7. On 5% sheep blood-enriched Columbia agar, colonies of strain Marseille-P2260T appear small and translucent, and measure from 0.7 to 1.2 mm in diameter. Cells ferment d-glucose, d-mannitol, d-lactose, d-saccharose, d-maltose, salicin, esculin ferric citrate, glycerol, d-cellobiose, d-mannose, d-melezitose, d-sorbitol, l-rhamnose, d-trehalose, d-arabinose, l-arabinose, d-ribose, d-xylose, d-galactose, d-fructose, dulcitol, methyl-α-d-glucopyranoside, N-acetylglucosamine, amygdalin, arbutin, d-melibiose, d-raffinose, gentiobiose, d-turanose, d-tagatose, l-fucose, and potassium gluconate. In addition, enzymatic activities such as esterase lipase, leucine arylamidase, acid phosphatase, and naphthol-AS-BI-phosphohydrolase are present. The major cellular fatty acids are C16:0 (39%), C18:1n9 (16%), and C18:1n7 (14%).

The genome of strain Marseille-P2260T is 4.66 Mbp long with a G+C content of 45.8 mol%.

The type strain Marseille-P2260T (CSUR P2260 = DSM 101844 = SN18) was isolated from a stool sample of a healthy patient with an unremarkable medical history.