Candida vulturna was recently described as a member of the Candida haemulonii species complex, which currently comprises nine species, four of which are known human pathogens: C. haemulonii (including the variety vulnera), Candida pseudohaemulonii, Candida duobushaemulonii and C. auris [1,2,3]. The type strain of C. vulturna, first described in 2016, originated from a flower, but several cases of invasive candidiasis caused by C. vulturna have since been reported in Malaysian patients (Ratna Mohd Tap, personal communication; [3]).

Misidentification of C. haemulonii species complex members has been one of the major driving forces behind the global emergence of the multidrug-resistant pathogen C. auris (Fig. 1; [4]). As decreased amphotericin B and fluconazole susceptibilities have been reported for C. haemulonii species complex members, reliable identification is of utmost importance for daily clinical practice [5]. Molecular detection and identification methods are gaining a foothold in diagnostic mycology laboratories. Hence, high-quality genome data are needed to set up molecular tools that reliably detect and identify pathogens and that detect the emergence of mutations related to antifungal resistance and other biological processes [6]. As the plethora of published C. auris qPCR assays illustrates [7], different sets of sibling species have been used to develop these assays, which may result in incorrect identification and/or detection results.

Fig. 1 Colony phenotypes of pathogenic members of the Candida haemulonii species complex. The colony phenotypes of Candida vulturna CBS 14366T (top), Candida haemulonii CBS 5149T (right), Candida duobushaemulonii CBS 7798T (bottom) and Candida pseudohaemulonii CBS 10004T (left) are depicted. Strains were inoculated onto Candiselect chromogenic agar (Bio-Rad, Marnes-la-Coquette, France) and incubated for 48 h at 35 °C

Several genomes of pathogenic C. haemulonii species complex members have been published, most of them of C. auris [8], along with single or a few genomes of C. haemulonii [9], C. duobushaemulonii [10] and C. pseudohaemulonii [8, 11]. Unfortunately, C. vulturna has not been considered in previous genome sequencing projects. Therefore, we sequenced a high-quality genome of this emerging fungal pathogen.

Candida vulturna CBS 14366T was cultured in 10 ml yeast peptone glucose broth for 3 days at 25 °C on a rotary shaker (125 rpm). Strain identity was confirmed by standard ITS sequencing (GenBank MN330068; [3]); however, MALDI-TOF analysis (Bruker Daltonics, Bremen, Germany) repeatedly yielded a good, yet incorrect, hit with its sibling species C. pseudohaemulonii.

An established cetyltrimethylammonium bromide (CTAB) DNA extraction protocol [12] was optimized to yield genomic DNA of high quantity and quality, as described in detail below. Biomass was collected by centrifugation at 4,000 rpm for 10 min (Centrifuge 5810R; Eppendorf, Hamburg, Germany), the supernatant was decanted, and 1.5 ml CTAB buffer (see [12]) containing 1 mg proteinase K (V3021, Promega, Leiden, The Netherlands) was added, followed by 2 h incubation at 60 °C with periodic vortexing. The sample was cooled to room temperature, and 2 ml chloroform/isoamyl alcohol (24:1) was added and mixed by inverting the tube for 5–10 min. After centrifugation at 4,000 rpm for 10 min, the supernatant (~1,200 µl) was collected in a 2.0-ml DNA low-binding tube (0030108078; Eppendorf). DNA was precipitated by adding 660 µl ice-cold 2-propanol (I9516; Sigma-Aldrich, Saint Louis, MO, USA) and mixing by inversion. Overnight incubation at -20 °C increased the DNA yield. After centrifugation at 14,000 rpm (Centrifuge 5430; Eppendorf) for 10 min, the supernatant was removed and the pellet was washed with 1 ml ice-cold 70% ethanol. The dried pellet was resuspended in 150 µl IDTE buffer (10 mM Tris, 0.1 mM EDTA, pH 8; IDT, San Diego, CA, USA); more IDTE was added until the pellet had dissolved completely. RNA was removed by adding 1 µl RNase Cocktail Enzyme Mix (AM2286; ThermoFisher, Waltham, MA, USA) per 100 µl sample and incubating for 1 h at 37 °C. DNA samples were again extracted with chloroform/isoamyl alcohol and precipitated with 2-propanol as described above. DNA quality and quantity were measured in triplicate using Qubit and Nanodrop (both ThermoFisher); purity and integrity were checked on a 0.8% agarose gel. DNA was stored at -20 °C until further use.
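When the recovered volumes differ from those stated above, the reagent amounts can be scaled proportionally. The following minimal Python sketch assumes that the 2-propanol volume scales with the supernatant volume at the ratio implied by the protocol (660 µl per ~1,200 µl, i.e. 0.55 volumes) and that the RNase mix scales at 1 µl per 100 µl sample. The function names are hypothetical and only serve as an illustration; proportional scaling of the 2-propanol is an assumption, not a statement from the protocol.

```python
# Ratios taken from the volumes stated in the protocol text.
ISOPROPANOL_PER_SUPERNATANT = 660 / 1200   # 0.55 volumes of 2-propanol
RNASE_UL_PER_100_UL = 1.0                  # 1 µl RNase mix per 100 µl sample


def isopropanol_volume_ul(supernatant_ul):
    """2-propanol volume (µl) for a given supernatant volume (µl).

    Assumes proportional scaling; this is an illustrative assumption.
    """
    if supernatant_ul <= 0:
        raise ValueError("supernatant volume must be positive")
    return supernatant_ul * ISOPROPANOL_PER_SUPERNATANT


def rnase_volume_ul(sample_ul):
    """RNase mix volume (µl) at 1 µl per 100 µl of resuspended DNA."""
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    return sample_ul * RNASE_UL_PER_100_UL / 100.0


if __name__ == "__main__":
    # Volumes from the protocol: ~1,200 µl supernatant, 150 µl IDTE buffer.
    print(f"2-propanol: {isopropanol_volume_ul(1200):.0f} µl")
    print(f"RNase mix:  {rnase_volume_ul(150):.1f} µl")
```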

The library was prepared with the Ligation Sequencing Kit (SQK-LSK108; ONT, Oxford, UK) and sequenced on a MinION flow cell (FLO-MIN106; ONT) according to the manufacturer’s instructions.

Basecalling of raw data was performed using Guppy (v3.2.2 + 9fe0a78; parameters: --flowcell FLO-MIN106 --kit SQK-LSK108) [13]. A draft assembly was prepared using Canu (v1.8; parameters: genomeSize=13m -nanopore-raw) [14]. The raw reads produced by Guppy were mapped back to the draft genome using minimap2 (v2.17-r954-dirty; parameters: -L -ax map-ont) [15]. The draft assembly was polished twice, first with racon (v1.4.6; parameters: -m 8 -x -6 -g -8 -t 6) and, after manual inspection and curation, with medaka (v0.8.1-p; parameters: -m r941_min_high) [15, 16].
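The order of these steps can be summarized in a short Python sketch that only builds the command lines and does not run any tool. The parameters are those listed above; all file and directory names (reads.fastq, the cvul prefix, the output folders) are hypothetical, as is the use of the medaka_consensus wrapper script, since the exact invocations are not given in the text. The input/output options (-i, -s, -p, -d, -o) are the standard ones of the respective tools and are likewise not taken from the text.

```python
import shlex
from pathlib import Path


def build_pipeline(fast5_dir, workdir, threads=6):
    """Return the steps as (label, argv, stdout_file) tuples.

    Nothing is executed. stdout_file is None when the tool writes its own
    output files; otherwise the tool's stdout would be redirected there.
    All paths are illustrative placeholders.
    """
    work = Path(workdir)
    reads = work / "reads.fastq"                  # concatenated Guppy output
    draft = work / "canu" / "cvul.contigs.fasta"  # Canu contigs for prefix "cvul"
    sam = work / "reads_vs_draft.sam"
    racon_out = work / "racon.fasta"
    return [
        ("basecall",
         ["guppy_basecaller", "--flowcell", "FLO-MIN106", "--kit", "SQK-LSK108",
          "-i", str(fast5_dir), "-s", str(work / "basecalled")], None),
        ("assemble",
         ["canu", "-p", "cvul", "-d", str(work / "canu"),
          "genomeSize=13m", "-nanopore-raw", str(reads)], None),
        ("map",
         ["minimap2", "-L", "-ax", "map-ont", str(draft), str(reads)], sam),
        ("polish_racon",
         ["racon", "-m", "8", "-x", "-6", "-g", "-8", "-t", str(threads),
          str(reads), str(sam), str(draft)], racon_out),
        # In the text, manual inspection and curation precede this step.
        ("polish_medaka",
         ["medaka_consensus", "-i", str(reads), "-d", str(racon_out),
          "-o", str(work / "medaka"), "-m", "r941_min_high"], None),
    ]


def render(step):
    """Format one step as a shell-style command line for inspection."""
    _label, argv, stdout_file = step
    line = shlex.join(argv)
    return f"{line} > {shlex.quote(str(stdout_file))}" if stdout_file else line


if __name__ == "__main__":
    for step in build_pipeline("fast5", "work"):
        print(f"# {step[0]}\n{render(step)}")
```

Keeping the commands as data rather than executing them directly makes it easy to log the exact parameters alongside the tool versions, which matters because polishing results depend on both.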

Assembly of the Candida vulturna CBS 14366T reads produced 8 scaffolds ranging from 300 kbp to 4.2 Mbp (N50 = 1,937,935 bp) and a shorter scaffold of 43 kbp whose best BLAST hit in the NCBI database is annotated as mitochondrial DNA (Candida auris JCM 15448 = CBS 10913T; accession AP018713), for a total of 12.9 Mbp. The 9 scaffolds likely correspond to 8 chromosomes plus the mitochondrial genome; similar numbers have been reported for other pathogenic members of the C. haemulonii species complex [8,9,10, 17, 18]. The draft genome was analyzed with funannotate (v1.6.0-297abc4; https://github.com/nextgenusfs/funannotate). Genes (n = 5,560) were predicted ab initio and functionally annotated with Pfam (v32), InterProScan (v75), the BUSCO database (Saccharomycetales_odb9) and eggNOG (v5.0).
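For readers unfamiliar with the N50 statistic reported above, the following minimal Python sketch shows how it is commonly computed: scaffolds are sorted from longest to shortest, and the N50 is the length of the scaffold at which the running total first reaches half of the assembly size. The lengths in the example are made-up toy values, not the scaffold lengths of this assembly, which are not listed individually in the text.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp).

    N50 is the length L such that scaffolds of length >= L together
    contain at least half of the total assembled bases.
    """
    lengths = list(lengths)
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    if any(length <= 0 for length in lengths):
        raise ValueError("scaffold lengths must be positive")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids rounding issues for odd totals.
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    toy_lengths = [2, 3, 4, 5, 6]  # toy values for illustration only
    print(n50(toy_lengths))
```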

This whole-genome shotgun project has been deposited at DDBJ/ENA/GenBank under the BioProject accession number PRJNA560499. The version described in this paper is the first version. The Sequence Read Archive (SRA) accession number is SRR10142922, associated with the BioSample number SAMN12587626.