Abstract

The direct repeat (DR) region is a singular locus of the Mycobacterium tuberculosis complex genome. It consists of 36 bp repetitive sequences separated by unique, non-repetitive spacer sequences, and it is surrounded by several genes coding for proteins of unknown function. To determine whether the M. smegmatis, M. avium, M. marinum and M. leprae genomes contain sequences and ORFs similar to those of the DR locus of the M. tuberculosis complex, we analysed the corresponding regions in these species. As a first step, conserved genes that flank the DR genes [Rv2785c (rpsO), Rv2786c (ribF), Rv2790c (ltp1), Rv2793c (truB), Rv2800, Rv2825, Rv2828, Rv2831 (echA16), Rv2838 (rbfA) and Rv2845 (proS)] were used as markers to locate the corresponding orthologues in M. smegmatis, M. avium, M. marinum and M. leprae in silico. Most of these M. tuberculosis marker genes have highly similar orthologues located in the same order and orientation in the other mycobacteria. In contrast, no orthologues were found for ORFs Rv2801–Rv2824, suggesting that these genes are unique to M. tuberculosis within the genus Mycobacterium. In M. smegmatis and M. avium, Rv2800 and Rv2825 are adjacent, an observation that was confirmed experimentally by PCR. In conclusion, because the DR locus and the ORFs around it are absent in M. smegmatis and M. avium, and because these species may be older than M. tuberculosis, we postulate that the DR locus was acquired by the M. tuberculosis complex species or by an ancestor bacterium.