RNA polymerases in strict endosymbiont bacteria with extreme genome reduction show distinct erosions that might result in limited and differential promoter recognition

  • Cynthia Paola Rangel-Chávez,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Writing – original draft

    Affiliation Biological Engineering Laboratory, Genetic Engineering Department, Center for Research and Advanced Studies of the National Polytechnic Institute, Irapuato Gto, México

  • Edgardo Galán-Vásquez,

    Roles Formal analysis, Methodology, Supervision

    Affiliation Departamento de Ingeniería de Sistemas Computacionales y Automatización, Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Coyoacán, Ciudad de México, CDMX, México

  • Azucena Pescador-Tapia,

    Roles Data curation, Formal analysis

    Affiliation Biological Engineering Laboratory, Genetic Engineering Department, Center for Research and Advanced Studies of the National Polytechnic Institute, Irapuato Gto, México

  • Luis Delaye,

    Roles Formal analysis, Validation, Writing – review & editing

    Affiliation Evolutionary Genomics Laboratory, Genetic Engineering Department, Center for Research and Advanced Studies of the National Polytechnic Institute, Irapuato Gto, México

  • Agustino Martínez-Antonio

    Roles Conceptualization, Funding acquisition, Project administration, Supervision, Writing – review & editing

    agustino.martinez@cinvestav.mx

    Affiliation Biological Engineering Laboratory, Genetic Engineering Department, Center for Research and Advanced Studies of the National Polytechnic Institute, Irapuato Gto, México

Abstract

Strict endosymbiont bacteria show a high degree of genome reduction, retain smaller proteins, and, in some instances, lack complete functional domains compared to their free-living counterparts. The mechanisms underlying these genetic reductions are still not well understood. In this study, we analyze the conservation of RNA polymerase, the essential machinery for gene expression, in endosymbiont bacteria with extreme genome reduction. We analyzed the RNA polymerase subunits to identify and define domains, subdomains, and specific amino acids involved in precise biological functions known in Escherichia coli. We also performed phylogenetic analyses and built three-dimensional models for four lineages of endosymbiotic proteobacteria with the smallest genomes known to date: Candidatus Hodgkinia cicadicola, Candidatus Tremblaya phenacola and Candidatus Tremblaya princeps, Candidatus Nasuia deltocephalinicola, and Candidatus Carsonella ruddii. We found that some Hodgkinia strains do not encode the RNA polymerase α subunit. The rest encode the α, β, β’, and σ subunits that form the RNA polymerase, although these are, on average, 16% shorter than their orthologs in E. coli. In the α subunit, the amino-terminal domain is the most conserved. In the β and β’ subunits, the catalytic core and the assembly domains are the most conserved, although they show compensatory amino acid substitutions that adapt to changes in the σ subunit. Indeed, the greatest erosion occurs within the σ subunit, where we identified broad amino acid substitutions even in the residues that recognize and bind the -10-box promoter element. Overall, the RNA polymerase of Candidatus Nasuia retains the highest similarity to the Escherichia coli RNA polymerase and its σ70. It might recognize the two main promoter elements (-10 and -35) and the two accessory promoter elements (extended -10 and UP-element). In Candidatus Carsonella, the RNA polymerase could recognize all the promoter elements except the extended -10 element. In Candidatus Tremblaya and Hodgkinia, the absence of the α carboxyl-terminal domain suggests that the RNA polymerase might not recognize the UP-promoter element. We also found that the β flap-tip helix domain is missing in most Hodgkinia strains, which suggests an inability to bind the -35-box promoter element.

Introduction

Until 2006, scientists thought that the minimum number of genes necessary to support life was around 500. This view changed soon afterward, when genomes from obligate endosymbiotic bacteria began to be published. Several of these genomes contained fewer than 500 genes, and in extreme cases fewer than 200 [1]. Some clues about how these minimal genomes provide all the functions necessary to sustain life have begun to emerge. For instance, transcriptome analysis of Buchnera aphidicola offers evidence that this bacterium has a limited ability to respond to environmental fluctuations [2]. Thus, gene expression in obligate endosymbionts appears to be largely stable and active only at basal levels. Consistent with this, the density of promoter-like signals characteristic of free-living bacteria is not present in organisms exhibiting extreme genome reduction [3].

DNA transcription is an essential molecular process through which organisms decode genetic information into cellular functions [4]. RNA polymerase (RNAP) is the enzyme responsible for transcribing DNA into RNA. It is a multi-subunit protein complex present in all living organisms, from bacteria to eukaryotes [5]. In bacteria, RNAP synthesizes all RNAs, including messenger, ribosomal, transfer, and small RNAs. In free-living bacteria, the RNAP holoenzyme consists of six subunits (α2ββ’ωσ) encoded by five different genes (the α subunit is present in two copies). The ordered assembly of these five proteins constitutes the holoenzyme, with a molecular mass of around 400 kDa. Previous studies in E. coli found that the ω subunit is not essential for RNAP activity [6], and later studies indicate that the ω subunit is absent in all endosymbionts analyzed herein [7]. The remaining subunits (α2, β, and β’) are considered essential core components of RNAP and are well conserved in bacteria [5,8]. This RNAP core is catalytically active (it transcribes DNA into RNA) but cannot initiate transcription by itself. Instead, the RNAP core must bind an additional subunit, the sigma factor (σ), which is responsible for recognizing and binding gene promoters [8]. Once transcription begins, and after a short fragment of RNA has been synthesized, σ is released and the RNAP core continues transcribing until it reaches a transcription terminator.

Since σ is responsible for binding and discriminating among gene promoters, it is common to find different types of σ. These fall into two evolutionary families. The first, the σ54 family, has a single member in free-living bacteria and is absent in endosymbionts. In contrast, the σ70 family has several members per genome (from 1 in endosymbionts to around 60 in free-living bacteria). One member of the σ70 family is known as the "housekeeping σ factor," encoded by an essential gene present in all bacteria [9]. In E. coli, the housekeeping σ is σ70, encoded by the rpoD gene. An ortholog of this gene probably encodes the housekeeping σ in strict endosymbionts [7]. The E. coli genome encodes seven σ factors: six belong to the σ70 family (σ70, σ38, σ32, σ28, σ24, and σ19), and the remaining one is σ54. The architecture of transcription units and the consensus promoter sequences for each σ in E. coli were proposed previously [10].

A brief functional description of the RNAP essential subunits in E. coli

The α subunit.

In E. coli, rpoA encodes the RNAP α subunit. This protein has a molecular weight of ~37 kDa and two folded domains (the α amino-terminal and α carboxy-terminal domains, also known as α-NTD and α-CTD) connected by two flexible linkers [11]. This subunit performs three biological functions: i) it initiates the assembly of the RNAP complex through the interaction of its α-NTD with the β and β’ subunits; ii) it participates in promoter recognition through the interaction of its α-CTD with the DNA UP promoter element; and iii) its α-CTD is a target for the binding of many transcriptional activators. These transcription factors bind in the following architectures: 1) the α-CTD is situated between a bound activator and the rest of the RNAP; 2) the α-CTD is located upstream of a bound activator; 3) the α-CTD flanks a bound activator; and 4) combinations of the above arrangements in the presence of multiple activators [12–18]. For instance, dimers of the cAMP receptor protein (the dual transcriptional regulator CRP) interact with one of the two α-CTDs [14,15]. In addition, the integration host factor (IHF) interacts with the α-CTD to activate XylR [16]. FIS also activates transcription through the β’-associated α-CTD, which is linked to the DNA on the promoter-distal side of FIS [17].

The β and β’ subunits.

rpoB and rpoC encode the β and β’ subunits of RNAP in E. coli. The β subunit forms a pincer, called the "clamp," and the β’ subunit constitutes the other pincer. Between them lies a 27 Å-wide internal channel, where the catalytic site of the RNAP enzyme is located [18]. The β subunit binds to the α and β’ subunits through 24 and 73 amino acids distributed in the region between amino acids 540 and 1340 [18]. The σ factor also interacts with the RNAP core via the β subunit, through its β flap-tip helix domain [19].

In the β’ subunit, the amino acids in the region from 917 to 1361 are essential for its interaction with the β subunit, and the β’ coiled-coil domain is essential for its interaction with the σ factor. Additionally, in the β’ amino-terminal domain, several functional amino acids, typical of zinc fingers, form the RNAP catalytic site and stabilize the RNAP-DNA complex. The active site is composed of around 500 amino acids and retains two Mg2+ ions. However, only thirteen residues of the catalytic site are highly conserved, including three aspartic acid residues. In the β’ CTD, three polar residues and the G-loop domain form the cavity where the DNA fits and contacts the RNAP [20].

Once the RNAP core (α2ββ’) is assembled, σ joins it by interacting with the β and β’ subunits. First, σ binds to β’ through the β’ coiled-coil domain. This binding positions σ to contact the -10-box element and form the open promoter complex [21]. Likewise, the β subunit binds to σ through the β flap-tip helix domain at the σ4 region (see below). This contact gives σ the capacity to adapt to variation in the nucleotide distance between the -10 and -35-box promoter elements [19].

The σ subunit.

The σ factor is not a permanent component of the RNAP core but a transiently associated subunit, necessary for promoter recognition and transcription initiation. In E. coli, σ70 is a protein with four helical domains (σ1, σ2, σ3, and σ4), each of which interacts with different promoter elements and with domains of different RNAP core subunits. The σ1 domain prevents σ70 from interacting with the DNA strand in the absence of a complete RNAP core [22,23]. The σ2 domain is the most conserved in the σ70 family and consists of four subdomains: σ2.1, σ2.2, σ2.3, and σ2.4. The σ2.1 and σ2.2 subdomains are involved in binding to the RNAP core [24], and the σ2.2 subdomain contains sites for binding the β’ coiled-coil domain. The σ2.3 subdomain participates in DNA melting; it has seven conserved aromatic amino acids whose replacement results in defects in DNA melting [25,26]. Specific recognition of the -10-box promoter element is a function of the σ2.4 subdomain. The amino acids involved in DNA melting and binding of the -10-box promoter element interact on the same DNA helices’ faces. Deletion analyses indicate that the σ2 domain is necessary for the correct functioning of the RNAP [26,27]. The σ3 region interacts with the extended -10 promoter element. It also stabilizes the short nascent DNA-RNA hybrids during the early stages of transcription [28] through its interaction with the 5’-triphosphate of the nascent RNA [29]. Finally, the σ4 domain is formed by two subdomains, σ4.1 and σ4.2. The σ4.1 subdomain interacts with the β flap-tip helix of the β subunit to allow correct binding to the -35-box promoter element. More specifically, the amino acids R518 and R516 in σ4.2 recognize the guanine and cytosine at nucleotide positions -34 and -32 relative to the transcription start site (+1) [27]. Additionally, the σ4.1 subdomain is a point of contact with transcriptional activators that bind upstream of the -35-box promoter element. One study reports that the σ4.1 subdomain of σ38 binds the β flap-tip helix domain more strongly than the corresponding subdomain of σ70 does [30].

Endosymbiotic bacteria

Endosymbiotic bacteria live inside other organisms (usually eukaryotes) and show genomic features resulting from several million years of co-evolution [31]. Strict endosymbiotic bacteria cannot live outside their host and have lost most of their genes, as well as large stretches of amino acids from their remaining proteins [7,10]. A comparison of the E. coli genome, with approximately 4.5 thousand genes, to that of the endosymbiotic bacterium Candidatus Carsonella ruddii, which has retained only around 150 genes, illustrates this dramatic process [32,33]. In addition, almost all genes found in Candidatus Carsonella ruddii are considerably shorter than their free-living orthologs.

Furthermore, genes encoding multi-domain proteins in obligate endosymbionts have commonly lost some regions or complete domains, which in some cases are essential for activity in free-living bacteria [34,35]. For instance, in previous work, we determined that obligate endosymbionts had lost all the transcription factors that interact with promoters and RNAP to activate or inhibit gene expression [36]. Additionally, the α and σ subunits are partially lost in Candidatus Hodgkinia sp. and Candidatus Carsonella ruddii [35,37]. Both subunits are considered essential components of the RNAP in free-living bacteria.

Erosion of the genes for the transcriptional machinery in endosymbionts with highly reduced genomes raises the critical question of how gene transcription takes place in these bacteria. Here, we address this question from the perspective of comparative genomics. To this end, we investigated the RNAP subunits in four lineages of obligate endosymbionts within the proteobacteria phylum that exhibit extreme genome reduction: Candidatus Hodgkinia cicadicola, Candidatus Tremblaya phenacola and Candidatus Tremblaya princeps, Candidatus Nasuia deltocephalinicola, and Candidatus Carsonella ruddii.

Materials and methods

Genome data of selected endosymbiont

To recover sequenced genomes, we used the NCBI database (http://www.ncbi.nlm.nih.gov/). In this database, we identified 37 endosymbionts with extreme genome reduction, all belonging to four lineages of proteobacteria. Candidatus Hodgkinia cicadicola belongs to the α-proteobacteria; Candidatus Tremblaya phenacola, Candidatus Tremblaya princeps, and Candidatus Nasuia deltocephalinicola belong to the β-proteobacteria; and Candidatus Carsonella ruddii belongs to the γ-proteobacteria. The genomes of these bacteria are completely sequenced. Further characteristics of these genomes are given in the supplementary material (S1 Table).

Amino acid alignment of the RNAP subunits in strict endosymbiotic bacteria and their comparison to E. coli

We compared the amino acid sequence of each RNAP subunit against its corresponding E. coli ortholog, using the T-Coffee program for multiple sequence alignment with standard parameters [38]. For simplicity, amino acid positions are referred to from now on by their location in the corresponding E. coli protein or subunit, accompanied by the abbreviation "ECO." With this alignment, we defined domains, subdomains, and relevant amino acids. We also used the Pfam database [39], the protein superfamily classification [40], and NCBI’s conserved domains [41]. In addition, we searched the literature to gather relevant information on the functional sites and amino acid regions of each RNAP subunit [11–13,18–21,23–28,30,42–45].
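
The "ECO" numbering convention can be illustrated with a minimal sketch. It assumes two rows of an alignment are already available as gapped strings; the function names and the toy sequences below are hypothetical and are not taken from the study. Each column is assigned the position of the E. coli residue it contains, and columns where E. coli has a gap receive no ECO coordinate.

```python
from typing import List, Tuple

GAP = "-"

def map_to_reference(ref_aln: str, qry_aln: str) -> List[Tuple[int, str, str]]:
    """Pair each reference residue (1-based, ungapped numbering)
    with the query residue found in the same alignment column."""
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")
    pairs = []
    ref_pos = 0
    for ref_res, qry_res in zip(ref_aln, qry_aln):
        if ref_res == GAP:
            # Insertion relative to the reference: no reference coordinate.
            continue
        ref_pos += 1
        pairs.append((ref_pos, ref_res, qry_res))
    return pairs

def substitutions(pairs: List[Tuple[int, str, str]]) -> List[str]:
    """Substitutions written as <reference residue><position><query residue>."""
    return [f"{r}{p}{q}" for p, r, q in pairs if q != GAP and q != r]

def deletions(pairs: List[Tuple[int, str, str]]) -> List[int]:
    """Reference positions that are missing (gapped) in the query."""
    return [p for p, _, q in pairs if q == GAP]

# Toy alignment rows (hypothetical, not real RNAP sequences).
ref_row = "MKT-LV"   # reference (E. coli-like) row
qry_row = "M-TALI"   # endosymbiont-like row
pairs = map_to_reference(ref_row, qry_row)
print(substitutions(pairs))  # ['V5I']
print(deletions(pairs))      # [2]
```

Reporting a change as V5I in reference coordinates mirrors the way substitutions such as ECO: R275Q are written in this article.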

Structural analysis of the RNAP subunits and inference of their 3-D structural-functional models

We first built 3-D structural models for each RNAP subunit using the I-TASSER server with standard parameters [46]. I-TASSER generates 3-D models for a given sequence by collecting high-scoring structural templates from the PDB (Protein Data Bank) and constructing full-length atomic models by iterative template-based fragment assembly simulations. The structural models of the Candidatus Hodgkinia TETUND2 and Dsem strains were obtained from their homologous proteins in E. coli using multiple threading alignments.

We then superposed the resulting 3-D models on the crystal structure of the E. coli RNAP (ECO 4YLP) [47]. Graphic representations of each structure were prepared with the PyMOL Molecular Graphics System software, version 1.3 [48]. To examine the interaction of RNAP with the promoter sequence, we used the models of the RNAP subunits predicted by I-TASSER for Candidatus Hodgkinia Dsem and TETUND2 and structurally aligned them in PyMOL with the homologous structure of the E. coli RNAP holoenzyme (ECO 4YLP).

We found that Candidatus Hodgkinia TETUND2 has lost a fragment of the α-NTD involved in dimer formation. To investigate whether α subunit monomers can still associate into dimers in this bacterium, we used ClusPro v.2.0 [49–52], an automatic protein docking tool based on CAPRI (Critical Assessment of Predicted Interactions) [50,53]. As a result, we obtained three models of α subunit dimers with different cluster sizes. We mapped the sites under putative positive selection onto these models and evaluated free-energy changes in the protein-protein interactions. To do so, we introduced single mutations at the positively selected amino acids using the BindProfX server [54], exchanging the putatively positively selected amino acids in the dimers predicted for the different Hodgkinia strains, and then determined the changes in protein binding affinity. The binding affinity between each pair of proteins was measured as the Gibbs free-energy change ΔG = G(complex) − G(monomers). When two monomers form a complex, the more negative ΔG is, the more stable the complex. Finally, we calculated the effect of mutations on binding affinity as the difference in free-energy change between the mutant and the wild type, ΔΔG(wt→mut) = ΔG(mut) − ΔG(wt). A mutation was considered strongly favorable when ΔΔG ≤ −1 kcal/mol.
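
The ΔΔG criterion can be written out as a short sketch. The function names are hypothetical and the ΔG values are made-up placeholders in kcal/mol, not results from this study; in the study itself the free-energy changes come from BindProfX.

```python
def delta_delta_g(dg_mut: float, dg_wt: float) -> float:
    """ddG(wt->mut) = dG(mut) - dG(wt), in kcal/mol."""
    return dg_mut - dg_wt

def is_strongly_favorable(ddg: float, threshold: float = -1.0) -> bool:
    """Apply the stated cutoff: ddG <= -1 kcal/mol."""
    return ddg <= threshold

# Hypothetical binding free energies for a wild-type and a mutant dimer.
dg_wt = -8.0
dg_mut = -9.5
ddg = delta_delta_g(dg_mut, dg_wt)
print(ddg, is_strongly_favorable(ddg))  # -1.5 True
```

A negative ΔΔG means the mutant dimer is predicted to bind more tightly than the wild type; only values at or below the cutoff count as strongly favorable.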

Natural selection analysis of the RNAP subunits from endosymbionts

To understand the putative mechanisms of molecular evolution of the RNAP subunits, we first need to know the selective pressure acting on each subunit. For this purpose, we estimated the DN/DS ratio (ω) for the protein-coding sequences studied here. This ratio is the number of non-synonymous substitutions per non-synonymous site (DN) divided by the number of synonymous substitutions per synonymous site (DS). The DN/DS ratio distinguishes three evolutionary regimes: i) DN/DS < 1 indicates purifying selection; ii) DN/DS > 1 indicates positive selection; and iii) DN/DS = 1 indicates neutral evolution [55]. We performed these selection analyses on the RNAP subunits in each group of bacteria.
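
As a minimal sketch of the three-way reading of DN/DS, the snippet below classifies a point estimate. The function names, the optional tolerance around 1, and the numeric inputs are assumptions made for illustration; in the study the conclusions rest on likelihood ratio tests in CodeML rather than on the point estimate alone.

```python
def dn_ds_ratio(dn: float, ds: float) -> float:
    """Ratio of non-synonymous to synonymous substitution rates."""
    if ds <= 0:
        raise ValueError("DS must be positive to compute DN/DS")
    return dn / ds

def classify_selection(ratio: float, tol: float = 0.0) -> str:
    """Map a DN/DS value to the regime it suggests.
    `tol` is a hypothetical band around 1 treated as neutral."""
    if ratio < 1.0 - tol:
        return "purifying selection"
    if ratio > 1.0 + tol:
        return "positive selection"
    return "neutral evolution"

# Made-up substitution rates, for illustration only.
print(classify_selection(dn_ds_ratio(0.02, 0.20)))  # purifying selection
print(classify_selection(dn_ds_ratio(0.30, 0.20)))  # positive selection
print(classify_selection(1.0))                      # neutral evolution
```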

Natural selection was inferred with CodeML from the PAML v.4.6 package [56]. This software requires codon alignments, which we generated with the PAL2NAL v.2.1.0 program [57]. To obtain the phylogenetic trees, we used PhyML [58]. We first aligned the amino acid sequences of each subunit with T-Coffee [38] and used Gblocks to recover the informative codons [59]. To identify specific genes and amino acids under positive selection, we used branch and branch-site models. For the branch analysis, we used three models: the one-ratio model "M0" (DN/DS0), the free-ratio model (DN/DS1), and the two-ratio model (DN/DS2). The one-ratio model assumes the same DN/DS ratio for all branches. The free-ratio model assumes an independent DN/DS ratio for each branch. The two-ratio model assumes that the branch of interest (foreground branch) has a ratio, DN/DS2, different from the background ratio [60].

The significance of the Likelihood Ratio Test (LRT) was estimated using the χ2 distribution, with degrees of freedom (df) equal to the difference in the number of parameters between the models. The test statistic is twice the difference in log-likelihood between the models, 2ΔlnL = 2[lnL1 − lnL0], where L1 and L0 are the likelihoods of the alternative and null models, respectively [61]. We compared the one-ratio and free-ratio models to test whether DN/DS differed among lineages, and the one-ratio and two-ratio models to test whether the lineage of interest has a different ratio from the other lineages.
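
The likelihood ratio test can be sketched with the standard library alone. The log-likelihood values below are made-up placeholders, and the helper names are hypothetical; the χ2 upper-tail probability is computed with closed-form expressions that hold for integer degrees of freedom.

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) for a chi-square
    distribution with a positive integer number of df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-h) * sum_{i=0}^{df/2 - 1} h^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= h / i
            total += term
        return math.exp(-h) * total
    # Odd df: start from df = 1 (erfc) and add one term per extra 2 df.
    total = math.erfc(math.sqrt(h))
    for j in range(1, (df - 1) // 2 + 1):
        total += math.exp(-h) * h ** (j - 0.5) / math.gamma(j + 0.5)
    return total

def likelihood_ratio_test(lnl_null: float, lnl_alt: float, df: int):
    """Return (2*delta lnL, p-value) for nested models."""
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, chi2_sf(stat, df)

# Hypothetical log-likelihoods for a null and an alternative model.
stat, p_value = likelihood_ratio_test(lnl_null=-1002.0, lnl_alt=-1000.0, df=1)
print(stat, round(p_value, 4))  # 4.0 0.0455
```

With one degree of freedom, a statistic of 4.0 falls just beyond the 5% critical value of about 3.84, so the null model would be rejected at p < 0.05 in this toy case.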

To detect positive or negative selection in specific lineages, we also fitted models in which the DN/DS ratio of the foreground branch was fixed at 1, 0.2, or 1.2. First, we compared DN/DS2 against DN/DS = 1, where the null hypothesis is that the models are not significantly different. If the null hypothesis is rejected (p<0.05) and the two-ratio estimate is greater than 1, this indicates possible positive selection in the foreground branch. If instead the two-ratio estimate is smaller than 1, it indicates negative selection.

On the other hand, if the null hypothesis is not rejected, this is evidence that the foreground branch is under neutral evolution. Additionally, we compared DN/DS2 against DN/DS = 0.2, again with the null hypothesis that the models are not significantly different. If the null hypothesis is rejected (p<0.05) and the two-ratio estimate is greater than 0.2, this indicates weaker negative selection, while an estimate smaller than 0.2 indicates stronger negative selection. Finally, when DN/DS2 is greater than one and significantly different from DN/DS = 1, it indicates positive selection. To gather more evidence about likely positive selection, we compared DN/DS2 against DN/DS = 1.2, with the null hypothesis of no significant difference between DN/DS2 and DN/DS = 1.2. Not rejecting the null hypothesis indicates that the foreground branch is possibly under positive selection, whereas rejecting it means that the foreground branch might be subject to relaxed selection.

We performed a branch-site test for positive selection to identify individual codons under positive selection along specific branches [62]. In these models, positive selection is allowed on a particular "foreground" branch. We compared the LRTs (df = 1) against null models that assume no positive selection. This test distinguishes four classes of sites: 0, 1, 2a, and 2b. Sites in classes 0 and 1 are under purifying selection (0 < DN/DS0 < 1) and neutral evolution (DN/DS1 = 1), respectively, on all branches. For sites in classes 2a and 2b, positive selection is allowed on the foreground branches (DN/DS2 > 1), while on the background branches these sites are under purifying selection (0 < DN/DS0 < 1) and neutral evolution (DN/DS1 = 1), respectively. In the null model, DN/DS2 is fixed at 1. We tested all the RNAP subunits in each endosymbiont in this way, considering each branch of the reconstructed phylogenies in turn as the foreground. We compared the two models using the LRT, with significance calculated from twice the log-likelihood difference, which follows a χ2 distribution with df equal to the difference in the number of parameters between the models. Positively selected amino acids were identified based on the empirical Bayes posterior probabilities computed in CodeML [63]. We did not test the Nasuia RNAP subunits because only three sequenced strains were available, whereas CodeML requires at least four to give reliable results.

Results

To study the evolution of RNAP subunits in genomes exhibiting extreme reduction, we first identified orthologs of the E. coli RNAP subunits in the 37 obligate endosymbiotic bacteria (Fig 1A and S1 Table). The initial genomes included nine strains of Candidatus Carsonella ruddii, seventeen of Candidatus Hodgkinia cicadicola, only one of Candidatus Tremblaya phenacola, seven of Candidatus Tremblaya princeps, and three of Candidatus Nasuia deltocephalinicola.

Fig 1. Conservation of RNAP subunits in endosymbiotic bacteria showing extremely reduced genomes.

a) The bars represent the size (in amino acids) of the RNAP subunits found in each endosymbiont; the different shades of blue represent the relative size contribution of each RNAP subunit. b) Colored squares represent the degree of conservation of functional domains in the RNAP subunits of Candidatus Tremblaya princeps (lime green), Candidatus Tremblaya phenacola (dark green), Candidatus Hodgkinia cicadicola (magenta), Candidatus Nasuia deltocephalinicola (yellow), Candidatus Carsonella ruddii (orange), and E. coli (olive green).

https://doi.org/10.1371/journal.pone.0239350.g001

Nine of the seventeen Hodgkinia strains (53%) lack the α subunit but were retained in the analyses of the remaining RNAP subunits. The rest (28 genomes) conserve orthologous genes for each of the α, β, and β’ subunits, as well as a single gene coding for the σ factor. The amino acid sequences of the RNAP subunits are, on average, 16% shorter than those of E. coli. We also observed that these genes have lost regions encoding protein domains that are functionally important in E. coli, in some cases entire domains (see below) (Fig 1B). In the following sections, we describe the structure and amino acid diversity found in each of the RNAP subunits of these endosymbiotic bacteria.

The α-NTD is more conserved than α-CTD in the α subunit

In Carsonella and Nasuia, the α subunits conserve all the functional domains known in E. coli. In contrast, Hodgkinia and Tremblaya mostly retain the α-NTD (for self-homodimerization and the interaction with the β and β’ subunits) (Fig 2A and S1 Fig). In vitro and in vivo experiments revealed that the α-NTD is essential for RNAP to achieve basal transcription [64,65]. In addition, studies have shown that the α-CTD is not necessary for RNAP assembly and basal transcription. However, the α subunit requires the α-CTD to interact with the UP-promoter elements and transcriptional activators in E. coli [66,67]. Loss of the α-CTD is also found in Parcubacteria, which are ectosymbiont bacteria that live in mixed groups [68], and in microalgal chloroplasts [69].

Fig 2. Domains and amino acid conservation in the α-subunit.

a) The white and grey rectangles represent the α-NTD and α-CTD. Dark blue represents the α subunit dimerization region. The regions that form the H1 and H3 helices are shown as black rectangles. b) The amino acid alignment of the α subunits shows the conserved, functional amino acids involved in dimer formation and in the H1 and H3 α-helices. Darker backgrounds mark the amino acids that differ from those of E. coli. c) 3-D crystal structure of the E. coli α-NTD (4YLP); the regions absent in Candidatus Hodgkinia TETUND2 are shown in red. d) The predicted 3-D model obtained for the α subunit of Hodgkinia TETUND2. e) Structural comparison between the E. coli α-NTD (4YLP) crystal structure (violet) and the predicted 3-D model obtained for the α subunit of Hodgkinia TETUND2 (black). The α-NTD regions absent in the α subunit of Hodgkinia TETUND2 are shown in red.

https://doi.org/10.1371/journal.pone.0239350.g002

Two Tremblaya princeps strains, TPPLON1 and TPPMAR1, showed an incomplete α-NTD. However, they preserve the regions essential for homodimerization and for interaction with the β and β’ subunits. A particular case is Candidatus Hodgkinia cicadicola TETUND2. This strain conserves only the region of the α-NTD for interaction with the β’ subunit and only some of the amino acids for homodimerization (Fig 2B, red line). The α-NTD consists of two well-conserved subdomains. Subdomain 1 contains two orthogonal α-helices (H1 and H3) that form the homodimerization region (Fig 2C). Subdomain 2 includes the interfaces for interactions with the β and β’ subunits [41–43]. The first step towards RNAP core formation is the homodimerization of two α subunit monomers, which proceeds through the interaction of the H1 and H3 helices in subdomain 1 of each monomer. The predicted 3-D model of this subunit in Candidatus Hodgkinia TETUND2 shows that it does not conserve the H1 helix and retains only some amino acids of the H3 helix and of other motifs necessary to form the dimer interface (Fig 2D and 2E). Based on E. coli, we cannot infer whether homodimerization occurs in the α subunits of Candidatus Hodgkinia TETUND2. rpoA is not the only gene that has lost a significant fragment in Candidatus Hodgkinia cicadicola TETUND2. The gene that encodes the ε subunit of DNA polymerase III has also lost large fragments, to the extent that its product is no longer considered a functional protein [33]. Finally, there remain the nine Hodgkinia strains that lack a complete α subunit. The authors who reported these genomes state that these bacteria were the most prevalent among several coexisting strains, all with fragmented genomes [66]. These nine bacteria conserve the other RNAP subunits, but it is difficult to infer whether their RNAP remains functional.

Strict endosymbionts conserve the β and β’ subunits except for the β flap-tip helix domain in Hodgkinias

We identified that the β- and β’-subunits are the most conserved among all the RNAP subunits in these endosymbionts (Figs 3A and 4A). The reason could be their critical role in the RNAP complex formation and activity. Furthermore, all the endosymbionts preserve the catalytic core and its assembly domains within the β- and β’-subunits. Nevertheless, these present some changes in the domains involved in the binding with the σ and with other RNAP core subunits (S2 and S3 Figs).

Fig 3. Amino acids sequence conservation in the β subunit.

a) The upper figure represents the structural domains of the β subunit in E. coli. It shows the β flap-tip helix domain (grey region) and the regions of interaction with the two α subunits and the β’ subunit (orange and pea-green). b) The amino acid alignment shows that most Hodgkinia strains have lost the β flap-tip helix domain (red box). This loss is also observed in Candidatus Zinderia cicadicola (blue box). c) Crystallographic structure of the E. coli β flap-tip helix and its interaction with σ4 (4YLP, grey and blue). d) The predicted 3-D model for the β and σ subunits of Hodgkinia Dsem shows the interaction between the β flap-tip helix (grey) and the σ4 subdomain (blue). e) The predicted model for the β and σ subunits of Hodgkinia TETUND2 shows that, as in the rest of the Hodgkinia strains, the β flap-tip helix (grey) is not present. As a result, the interaction between the β subunit and the σ4 subdomain (blue) might be deficient.

https://doi.org/10.1371/journal.pone.0239350.g003

Fig 4. Amino acids sequence conservation in the β’ subunit.

a) The upper figure represents the position of functional domains in the β’ subunit, such as the β’ coiled-coil (brown), zinc fingers (orange), the catalytic site (yellow), the G-loop (red), and the DNA-binding site (olive). The figure also shows the β’ interaction interface with the β subunit (beige). b) The alignment of β’ subunits shows substitutions in the essential amino acids ECO: R275, E295, and A302 in the β’ coiled-coil domain. Darker backgrounds mark the functional residues that differ from the reference E. coli. c) Crystallographic structure of the β’ coiled-coil domain of E. coli (brown), showing the amino acid residues (in yellow) involved in the interaction with the σ2.2 domain (magenta). d) The predicted 3-D model for the β’ subunit in Candidatus Hodgkinia Dsem shows that the β’ coiled-coil domain could form (brown). The amino acids A287, G312, and S319 (yellow) are involved in the interaction with the σ2.2 domain (magenta). e) The predicted 3-D model for the β’ subunit in Hodgkinia TETUND2 shows that the β’ coiled-coil domain could also form (brown region). The amino acids K268 and Q292 (yellow) could interact with the σ2.2 domain (magenta).

https://doi.org/10.1371/journal.pone.0239350.g004

The β flap-tip helix domain is incomplete in most Hodgkinia strains (Fig 3B, red box). This loss is evident when the models of Candidatus Hodgkinia Dsem and TETUND2 are compared with the 3-D crystal structure of the β-σ subunit complex in E. coli (Fig 3C–3E). Unlike Candidatus Hodgkinia Dsem, TETUND2 does not present a complete β flap-tip helix domain (Fig 3D and 3E). These changes in the β flap-tip helix suggest that the RNAP core in these Hodgkinia strains cannot bind to the σ4 domain. Consequently, the σ factor would not bind properly to the -35-box promoter element. In E. coli mutants lacking the β flap-tip helix, the σ subunit is unable to attach to the -35-box promoter element, whereas the binding of the RNAP core to DNA is not affected. Furthermore, these mutants adequately recognize the -10-box and the -10-box extended promoter elements [19]. The absence of the β flap-tip helix domain, although not observed in bacteria, is common in archaea [70].

Candidatus Tremblaya phenacola PAVE and Carsonella ruddii CE conserve the β’ coiled-coil domain in the β’ subunit. In contrast, the rest of the endosymbionts display substitutions in at least one of the three necessary amino acids in the β’ coiled-coil (Fig 4B in brown, and S3 Fig). In vitro studies show that the single amino acid substitutions ECO: R275Q, E295K, and A302D in the β’ coiled-coil domain result in deficient holoenzyme formation and a subsequent lack of promoter specificity [26]. Unlike Hodgkinia Dsem, the rest of the Hodgkinia strains show the same substitutions at the two positions ECO: 275 and 302, and a deletion of the residue ECO: 295. In the re-created 3-D structures, these substitutions do not appear to affect the formation of the β’ coiled-coil domain. However, the exposed residues and their orientation for the interaction with the σ2 domain are different from those in the E. coli β’ coiled-coil domain (Fig 4C–4E).

The σ subunit shows the most erosive evolution in these endosymbionts

σ is the subunit whose conservation varies most among endosymbionts with extreme genome reduction (Fig 5A and 5B). Candidatus Tremblaya phenacola PAVE conserves the whole σ2 and σ4 domains. On the other hand, Candidatus Hodgkinia, Nasuia deltocephalinicola, and Carsonella ruddii conserve the σ4 and, to a lesser extent, the σ2 domain (Fig 5B). The main variations occur inside the σ2 domain, whose amino acids interact with the -10-box promoter element and define the promoter specificity (Fig 5B, magenta and orange amino acids).

Fig 5. Domain conservation in the σ subunit and their variations in DNA-promoter recognition.

a) The upper part shows the distribution of functional domains in the σ subunit: σ2 (red), σ2.1 (pink), σ2.2 (magenta), σ2.3 (violet), σ2.4 (orange), σ3 (green), σ4 (light blue), σ4.1 (cyan), and σ4.2 (dark blue). b) The amino acid alignment of σ subunits shows variations in the domains σ2 and σ4. Darker backgrounds show the functional residues that differ from those in the reference organism (E. coli). c) Crystallographic structure of the E. coli σ subunit bound to DNA (4YLP). The σ2.2, σ2.3, and σ2.4 subdomains and the σ3 domain are shown in magenta, violet, orange, and green, respectively. d) Predicted 3-D model of the σ subunit of Hodgkinia Dsem showing the subdomains and amino acids involved in recognizing and binding to DNA. e) Predicted 3-D model of the σ subunit of Hodgkinia TETUND2 showing the subdomains and amino acids involved in recognizing and binding to DNA. The colors in d) and e) are the same as those of the corresponding homologous domains in c) for E. coli.

https://doi.org/10.1371/journal.pone.0239350.g005

For the σ2.2 subdomain, the Hodgkinia strains exhibit substitutions in all four amino acids involved in the interaction with the β’ coiled-coil domain. The seventeen strains have the same substitution, ECO: E407A, and 14 of them have ECO: N409R. At the same time, each presents different amino acids at the positions ECO: 403 and ECO: 406 (Fig 5B, magenta). Mutagenesis of these three amino acids in σ2.2 has shown that the substitutions only weaken the binding to the β’ coiled-coil domain [21]. Besides, thermal denaturation experiments indicate that these mutants fold differently from the E. coli wild type, suggesting that the principal effect of these mutations is on the allosteric regulation of the subdomains σ2.3 and σ2.4, which participate in DNA melting and recognition of the -10-box promoter element [21]. On the other hand, all the endosymbionts have distinct variations in the σ2.3 subdomain. However, all of them conserve the essential aromatic residues necessary for DNA melting and the correct folding of the σ2 domain (Fig 5B, violet).

The σ2.4 subdomain presents substitutions in the amino acids involved in recognizing the nucleotide at position -12 of the -10-box promoter element. Hodgkinia strains show a different amino acid at position ECO: 437, histidine instead of glutamine. Likewise, Candidatus Carsonella ruddii PV, PC, HT, HC, DC, YCCR, and BC have replaced ECO: T440 with leucine or isoleucine. The Tremblaya strains, including Candidatus Tremblaya phenacola PAVE, conserve these two amino acids (ECO: Q437 and T440) (Fig 5B, orange). Previous works studied point mutations in these regions of σ, more precisely in the σ2.4 subdomains of E. coli σ70 and SigA from Bacillus subtilis (homologous to σ70). Changes in amino acids at these positions affect the specificity for their respective promoters [25]. In E. coli, σ70 carrying the substitutions ECO: Q437H and T440I retains the capacity to recognize the nucleotide at the -12 position. However, promoters with cytosine at this position were recognized with significantly better specificity than those with another nucleotide [21]. Compared with E. coli, the substitutions observed in the subdomains of the σ2 domain seem not to affect the conformation of this domain. Nevertheless, the amino acids responsible for contacting the promoter can differ from those in E. coli (Fig 5C–5E).

The σ3 domain is well preserved in Tremblaya phenacola PAVE. However, it presents amino acid substitutions in other endosymbionts and is absent in Carsonella ruddii. In Hodgkinia, this domain is partially present in the Dsem strain, although more conserved in the rest (Fig 5D and 5E). The partial or total loss of the σ3 domain might suggest a deficient or null binding of σ at the -10-box extended promoter element.

Most of the endosymbionts show strong purifying selection on the RNAP subunits core

We performed two selection pressure analyses to investigate the effects of the amino acid substitutions observed in the genes encoding the RNAP subunits. First, considering that bacterial endosymbionts are subject to an accelerated rate of molecular evolution [71], we estimated the ratio of non-synonymous to synonymous substitutions (DN/DS) using phylogenetic codon-substitution models (S2 Table).
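The conventional qualitative reading of a DN/DS value (also written ω) that is used in the following subsections can be sketched in a few lines of Python. This is only an illustration and not the pipeline used in this study: the function name `classify_selection`, the tolerance around 1, and the 0.2 cut-off for "strong" purifying selection are assumptions chosen to mirror the thresholds quoted in the text, and the example values are hypothetical.

```python
def classify_selection(omega, neutral_tol=0.05, strong_cutoff=0.2):
    """Label a DN/DS (omega) estimate with its usual qualitative reading.

    neutral_tol and strong_cutoff are illustrative assumptions, not
    values prescribed by any codon-substitution software.
    """
    if omega < 0:
        raise ValueError("omega must be non-negative")
    if abs(omega - 1.0) <= neutral_tol:
        return "neutral"
    if omega > 1.0:
        # A ratio above 1 is only a candidate; it still needs a test.
        return "candidate positive selection (requires a significance test)"
    if omega < strong_cutoff:
        return "strong purifying selection"
    return "purifying selection"


if __name__ == "__main__":
    # Hypothetical omega values, for illustration only.
    for gene, omega in [("geneA", 0.15), ("geneB", 0.45), ("geneC", 1.0)]:
        print(gene, omega, classify_selection(omega))
```

In this sketch a ratio above 1 is reported only as a candidate, because such a value on its own does not establish positive selection.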

Candidatus Hodgkinia cicadicola.

In Candidatus Hodgkinia cicadicola, the rpoB (β subunit) and rpoC (β’ subunit) genes were subject to purifying selection. With low DN/DS values (0.3), most of the nucleotide substitutions in these genes were synonymous. On the other hand, the rpoA (α subunit) genes were under stronger purifying selection in most Hodgkinia strains (DN/DS<0.2). In contrast, the TETUND2, TETLON, and TETMLI1 strains present neutral selection (DN/DS = 1). Finally, the rpoD gene (σ factor) shows an increased DN/DS value (DN/DS<0.5), which means that the purifying selection is less rigorous in some Hodgkinia strains. This non-uniform selection pressure could result in greater diversity in this subunit (Fig 6).

Fig 6. Selective pressure on the RNAP subunits by branch analysis.

The figure shows the DN/DS ratio for each subunit. Most genes are under negative selection, and only in a few cases do they display values greater than 1 (represented as 1.2). However, this does not mean that they are under positive selection (level of significance greater than 0.05); in fact, the RNAP subunits show relaxed selection in all the cases.

https://doi.org/10.1371/journal.pone.0239350.g006

Candidatus Tremblaya.

In the PAVE strain, the rpoA, rpoB, rpoC, and rpoD genes had ω (DN/DS) values less than 0.02. Conversely, PCVAL shows neutral selection for all the subunits (DN/DS = 1). So, a generalized relaxation of selective pressure, to different extents, is present in these strains (Fig 6).

Candidatus Carsonella ruddii.

The rpoB and rpoC genes in Candidatus Carsonella ruddii strains are more conserved than the genes of the other RNAP subunits. They had DN/DS values less than 0.2 (except for the DC and YCCR strains). The rpoA gene was under neutral selection in five strains (CE, CS, DC, HT, and YCCR) and under strong purifying selection in the rest of the Candidatus Carsonella strains (DN/DS<0.2). On the contrary, the rpoD gene was under neutral selection except in the BC strain (DN/DS <0.01). This neutral selection might explain why the σ factor contains more amino acid variations than the other RNAP subunits (Fig 6).
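As noted in the legend of Fig 6, a DN/DS value greater than 1 is not evidence of positive selection unless it is statistically supported. One common way to assess this with codon-substitution models is a likelihood-ratio test between two nested models. The sketch below assumes, purely for illustration, that the two models differ by one free parameter, so that the test statistic is compared against a chi-square distribution with one degree of freedom; the function name and the log-likelihood values are hypothetical and are not taken from this study.

```python
import math


def lrt_pvalue_df1(lnl_null, lnl_alt):
    """Likelihood-ratio test for nested models differing by one parameter.

    Returns (statistic, p_value). For one degree of freedom, the
    chi-square survival function equals erfc(sqrt(x / 2)).
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    if stat <= 0:
        # The alternative model does not fit better than the null model.
        return 0.0, 1.0
    return stat, math.erfc(math.sqrt(stat / 2.0))


if __name__ == "__main__":
    # Hypothetical log-likelihoods: null model vs. alternative model.
    stat, p = lrt_pvalue_df1(-1000.0, -998.0)
    print(f"2*delta(lnL) = {stat:.2f}, p = {p:.4f}")
    print("significant at 0.05" if p < 0.05 else "not significant at 0.05")
```

With these hypothetical inputs the statistic is 4.0, which corresponds to a p-value slightly below 0.05; a smaller improvement in log-likelihood would leave the branch classified as not significantly different from the null model.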

Positively selected amino acids are present in the α-NTD of the α subunit in the Hodgkinia strains

Positively selected amino acids are present in the α-NTD of Candidatus Hodgkinia Dsem and TETUND2 (S3 Table). We mapped the selected amino acids with a high level of support (BEB p>0.95) in the structural models of the α subunit of Candidatus Hodgkinia TETUND2 and Dsem (Fig 7C, red amino acids, and S4 Fig, respectively). In Candidatus Hodgkinia Dsem, the selected amino acids V55, Q95, and H115 (S4 Fig) would be necessary for the correct folding of the α-NTD. In Candidatus Hodgkinia TETUND2, the amino acid residues V68, S69, and E70 would allow the subunit to adopt a structure similar to that in E. coli and so maintain the α subunit interactions with the β and β’ subunits (Fig 7A and 7C).

Fig 7. Structural comparison between the E. coli and the Candidatus Hodgkinia TETUND2 α-subunits.

a) 3-D structure of the monomer and b) of the homodimer of the α-NTD in E. coli (4YLP). c) 3-D model of the α-subunit monomer in Candidatus Hodgkinia TETUND2 with the amino acids under positive selection in red and H116 in green. d), e), and f) show predictions of the α-subunit homodimer formation in Candidatus Hodgkinia TETUND2. The amino acids under positive selection are in red and yellow in each monomer, and H116 is in green. White regions in the a) and b) structures are not conserved in the α-subunit of Candidatus Hodgkinia TETUND2. The amino acids in blue, dark, and turquoise are necessary for the RNAP core formation. In the b) structure, the dark green amino acids are the same as the blue ones in a). S1 and S2 indicate subdomains 1 and 2 in the α-NTD. The letters A and B correspond to each of the monomers that form the homodimer.

https://doi.org/10.1371/journal.pone.0239350.g007

As previously mentioned, the Candidatus Hodgkinia TETUND2 α subunit has lost part of the α-NTD involved in homodimer formation. Therefore, we evaluated in silico whether the Candidatus Hodgkinia TETUND2 α subunit can still form the homodimer, which is essential for RNAP core formation. We obtained three models with different clusters of amino acids involved in homodimer formation (Fig 7D–7F). First, we mapped the sites under positive selection (Fig 7D–7F, amino acids in red and yellow). Then, we performed in silico amino acid substitutions at the sites under positive selection, replacing them with the amino acids present at the same positions in other Candidatus Hodgkinia strains and in E. coli (Fig 7B). In the three models, the substitution of histidine 116 by proline made homodimer formation energetically unfavorable (S4 Table). Although H116 does not seem to be under positive selection, it would have an essential role in stabilizing homodimer formation in this strain.
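The reasoning behind the in silico substitutions can be illustrated with a short sketch of how a change in binding free energy is usually read. It assumes the common sign convention ΔΔG = ΔG(mutant) − ΔG(wild type), under which a positive value indicates that the substitution makes complex formation less favorable. The function names and the energy values are hypothetical and do not reproduce the values reported in the supporting tables.

```python
def binding_ddg(dg_wild_type, dg_mutant):
    """Change in binding free energy (kcal/mol) caused by a substitution.

    Assumed convention: ddG = dG(mutant) - dG(wild type).
    """
    return dg_mutant - dg_wild_type


def is_destabilizing(ddg, threshold=0.0):
    """True when the substitution makes complex formation less favorable."""
    return ddg > threshold


if __name__ == "__main__":
    # Hypothetical binding energies for a homodimer interface (kcal/mol).
    ddg = binding_ddg(dg_wild_type=-10.0, dg_mutant=-7.5)
    print(f"ddG = {ddg:+.1f} kcal/mol, destabilizing: {is_destabilizing(ddg)}")
```

Under this convention, a substitution such as the one described for H116 would appear as a positive ΔΔG for the dimer interface.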

Variations in the conservation of the subunits involved in promoter recognition are independent of the GC content

Endosymbionts with reduced genomes have variable proportions of GC in their genomes. For example, while Candidatus Carsonella and Candidatus Nasuia contain less than 18% GC, Candidatus Hodgkinia and Candidatus Tremblaya contain above 40%. We wanted to know whether such variations in genomic GC content relate to changes in the σ factors. We therefore carried out a comparative analysis that involved a phylogenetic tree (S5 Fig). We included the 37 strict endosymbiont bacteria of this study, together with 13 homologs of σ70 from six endosymbionts and seven free-living bacteria. These other bacteria have a lesser extent of genome reduction but GC contents similar to, lower than, or higher than those of the endosymbionts studied (S5 Table).

Except for C. Zinderia cicadicola, the rest of the bacteria preserve the β flap-tip helix and the β’ coiled-coil domains. These also conserve the σ2.2 and σ2.2 subdomains involved in the promoter recognition in E. coli (Figs 3A, 4A and 5A). Thus, this analysis may suggest no relationship between the GC content and changes in the σ factors.
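For reference, the GC content compared in this section is simply the percentage of G and C bases in a sequence. A minimal sketch follows; the function name `gc_content` and the decision to ignore ambiguous bases such as N are assumptions made for illustration, and the example sequence is hypothetical.

```python
def gc_content(sequence):
    """Return the GC percentage of a nucleotide sequence.

    Characters other than A, C, G, and T (for example N) are ignored,
    which is an illustrative choice rather than a fixed convention.
    """
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


if __name__ == "__main__":
    # Hypothetical AT-rich fragment, for illustration only.
    print(f"{gc_content('ATTATAAGCTTATAAATTAT'):.1f}% GC")
```

Applied to whole genomes, a calculation of this kind yields the percentages that are compared against σ factor conservation above.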

Discussions

Here we approach the study of the evolution of RNAP subunits in bacteria with vastly reduced genomes. We found that the β and β’ subunits are the most conserved in all the studied endosymbionts. These show only a few differences in the regions involved in the interactions with the σ factor, possibly because of significant changes in σ. On the other hand, the α subunit is more conserved in Candidatus Carsonella ruddii and Nasuia. In contrast, in the other endosymbionts, the α subunit has lost its α-CTD.

Furthermore, studies report the absence of a recognized gene encoding the α subunit in some Hodgkinia strains [37]. It is unknown how an RNAP could assemble and function without the α subunit, if transcription does occur in these Hodgkinia strains. It might mean that the α subunit can follow ω as a dispensable RNAP subunit. Hodgkinia strains inhabit their host as consortia with other bacteria. It is therefore an attractive hypothesis that the members of the consortium contribute to each other's cellular activities [72]. Still, it is difficult to know whether these complementary activities include gene transcription.

Furthermore, the Candidatus Hodgkinia Dsem, CHOCRA, and TETULN strains preserve an α subunit and are not known to share their host with other Hodgkinia strains [37]. We also observed several amino acids under putative positive selection in the α subunit of Candidatus Hodgkinia TETUND2. These substitutions might compensate for the loss of regions that are critical for homodimer formation in this subunit.

Given the importance of σ in promoter recognition for transcription initiation, the substitutions observed in the σ2.4 subdomain might correspond to variations in the specificity of σ for promoters. Previous studies indicate that the substitution of some amino acids does not compromise their affinity to DNA. These include lysine, asparagine, serine, methionine, and phenylalanine [73]. However, these changes might exert mild effects on σ affecting its specificity for promoters. The differences observed in the σ2.4 subdomain of Candidatus Hodgkinia and Candidatus Nasuia correspond to mutations already experimentally found in the E. coli σ70 and Bacillus SigA [25,26,52].

So far, we can suggest the functionality of the RNAP of the seventeen Hodgkinia strains that conserve an α subunit, with the exception of the Dsem strain. These can recognize only the -10-box and the -10-box extended promoter elements. They lack the fragment required to form the β flap-tip helix domain, which is needed for recognition of the -35-box promoter element, and they do not recognize the UP element either (Figs 3D and 4D). Candidatus Tremblaya phenacola PAVE, Princeps PCIT, and PCVAL are the endosymbionts with a σ nearest to E. coli σ38 instead of the housekeeping σ70 (S4 Fig). σ38 belongs to the σ70 family, but in E. coli, it transcribes stationary phase genes [9]. The σ4.1 subdomain present in σ38 has some amino acid changes compared with that of σ70. These changes appear to give σ38 more affinity for the β flap-tip helix domain and increased performance in binding to -35-box promoter elements. This higher affinity for -35-box promoter elements, together with the stationary phase transcription factors, could displace the main transcription activity from σ70 to σ38 in the stationary phase in E. coli.

In endosymbiotic bacteria with exceedingly reduced genomes, it has not been possible to locate canonical σ70 promoters, not even for the most conserved genes, such as the ribosomal genes [74]. This inability to find promoters might be partly due to the high A+T percentage and the lack of intergenic regions in these genomes. However, this may not be the whole explanation, since Hodgkinia and Tremblaya have relatively high G+C percentages. More than 90% of their genomes comprise coding sequences, and neither presents a recognized promoter [7]. Hence, these results suggest that the RNAP in endosymbionts can conserve some sequence recognition capacity, but this should differ from the σ70 consensus promoters in E. coli. Thus, it seems that some promoter elements are unnecessary in endosymbionts, and shorter promoter sequences might be sufficient for gene transcription. This could explain the difficulty of recovering consensus promoter sequences like those known in free-living bacteria. Besides, transcription factors that assist in gene regulation are also absent in these bacteria with highly reduced genomes, with the nucleoid-associated proteins being the last to be lost [36]. Therefore, it makes sense that regions for gene activation, such as the UP-promoter element and the contact region for these activators in the α subunit, are absent. Variations in the promoter recognition regions might not be the only ones in these bacteria. For example, previous reports indicate significant changes in the 16S ribosomal 3’ tail and its binding sequence, with the corresponding changes in the Shine-Dalgarno element localized upstream of the protein-encoding genes [75]. The observations of this study may therefore reflect a more generalized phenomenon in these bacteria.

Conclusions

DNA sequences encoding each of the RNAP subunits exhibit a reduction of 16% on average compared to those in E. coli. The gene reductions present in the RNAP subunits are independent of the GC content (18–40% GC in these genomes). Most endosymbionts experience strong purifying selection on the RNAP subunit genes, particularly on those encoding the β and β’ subunits. In the case of σ, the type of selection was less uniform among the endosymbionts.

A closer inspection of the α subunit reveals that the α-NTD is more conserved than the α-CTD. Additionally, some amino acid changes involved in homodimer assembly are under positive selection in the Candidatus Hodgkinia TETUND2 and Dsem strains. On the other hand, the β and β’ subunits are more conserved in strict endosymbionts, except for the β flap-tip helix domain in Hodgkinia strains. Furthermore, the σ subunit presents the most varied erosion in these endosymbionts. These unequal losses might result in differential recognition of promoter elements.

To better illustrate our inferences, we present a functional conclusion based on the conservation of the RNAP subunits. We offer schematic models of the inferred promoter regions that the RNAP of each endosymbiont should recognize (Fig 8). We can deduce that the RNAP of Nasuia conserves the greatest similarity to the E. coli RNAP and its σ70. Accordingly, the Nasuia RNAP should recognize the two main promoter elements (-10-box and -35-box) and the two accessory promoter elements (-10-box extended and even the UP element). In contrast, the Carsonella RNAP seems to maintain recognition of the promoter elements except for the -10-box extended element. This limited recognition can result in promoters with shorter regions between the -10-box and the -35-box promoter elements. In another case, the σ of Tremblaya more closely resembles σ38 than the canonical σ70. In Tremblaya and Hodgkinia, owing to the absence of the α-CTD, the RNAP might not recognize the UP element, and in the case of the Candidatus Hodgkinia Dsem strain, it might not recognize the -10 extended promoter element either. Additional studies, ideally experimental ones, should generate new knowledge about the functioning of shorter proteins in this fascinating field of highly reduced genomes.

Fig 8. Proposed functional RNAP models of bacterial endosymbionts with reduced genomes.

The conserved domains in each group of bacteria are present in each illustration. Figures correspond to E. coli RNAP (Eco), Nasuia strains RNAP, and Carsonella strains RNAP. In Hodgkinia strains in orange, the TETULN, TETUND1, and TETUND1 strains use the -10 extended region. In purple, Dsem strains recognize the -35 element. Candidatus Tremblaya phenacola PAVE, Princeps PCIT, and PCVAL RNAP model (Based on [10, 76]).

https://doi.org/10.1371/journal.pone.0239350.g008

Supporting information

S1 Fig. Interaction domains in α subunit alignment.

https://doi.org/10.1371/journal.pone.0239350.s001

(TIF)

S2 Fig. Interaction domains in the β subunit alignment.

https://doi.org/10.1371/journal.pone.0239350.s002

(TIF)

S3 Fig. Functional domains and amino acids in β’ subunit alignment.

https://doi.org/10.1371/journal.pone.0239350.s003

(TIF)

S4 Fig. Selected sites mapped in Hodgkinia Dsem α subunit.

https://doi.org/10.1371/journal.pone.0239350.s004

(TIF)

S5 Fig. Phylogenetic tree of sigma 70 proteins in endosymbionts with extreme genome reduction and other symbionts and free-living bacteria.

https://doi.org/10.1371/journal.pone.0239350.s005

(TIF)

S1 Table. Complete list of studied endosymbionts.

https://doi.org/10.1371/journal.pone.0239350.s006

(PDF)

S2 Table. Bacteria with similar %GC to endosymbiotic bacteria with reduced genomes.

https://doi.org/10.1371/journal.pone.0239350.s007

(PDF)

S3 Table. Results obtained by selective pressure analysis by branch model.

https://doi.org/10.1371/journal.pone.0239350.s008

(PDF)

S4 Table. Results of selective pressure obtained by the branch-site model.

https://doi.org/10.1371/journal.pone.0239350.s009

(PDF)

S5 Table. Changes of free energy by in silico mutations in selected sites of α subunit homodimer predicted for Hodgkinia TETUND2.

https://doi.org/10.1371/journal.pone.0239350.s010

(PDF)

Acknowledgments

CPR-C has a Ph.D. fellowship (380338) from CONACYT, México. We thank Rafael Montiel for guiding the selective pressure analyses and Diego Andrés López Castro and Paola Isabel Angulo-Bejarano for reading the manuscript.

References

  1. 1. McCutcheon J. P., & Moran N. A. Extreme genome reduction in symbiotic bacteria. Nature Reviews Microbiology. 2012; 10(1), 13–26.
  2. 2. Moran N. A., Dunbar H. E., & Wilcox J. L. Regulation of transcription in a reduced bacterial genome: nutrient-provisioning genes of the obligate symbiont Buchnera aphidicola. Journal of Bacteriology. 2005; 187(12), 4229–4237. pmid:15937185
  3. 3. Huerta A. M., Francino M. P., Morett E., & Collado-Vides J. Selection for unequal densities of σ70 promoter-like signals in different regions of large bacterial genomes. PLoS Genetics. 2006; 2(11), 1740–1751.
  4. 4. Ptashne M. Regulation of transcription: from lambda to eukaryotes. Trends Biochem Sci. 2005; 30 (6): 275–279. pmid:15950866
  5. 5. Ebright RH. RNA polymerase: Structural similarities between bacterial RNA polymerase and eukaryotic RNA polymerase II. J Mol Biol. 2000; 304: 687–698. pmid:11124018
  6. 6. Gentry D, Burgess RR. rpoZ, encoding the omega subunit of Escherichia coli RNA polymerase, is in the same operon as spoT. J Bacteriol. 1989; 171(3): 1271–1277. pmid:2646273
  7. 7. Moran NA, Bennett GM. The tiniest tiny genomes. Ann Rev Microbiol. 2014; 68: 195–215. pmid:24995872
  8. 8. Sweetser D, Nonet M, Young RA. Prokaryotic and eukaryotic RNA polymerases have homologous core subunits. Proc Natl Acad Sci USA. 1987; 84: 1192–1196. pmid:3547406
  9. 9. Ishihama A. Functional modulation of Escherichia coli RNA polymerase. Annu Rev Microbiol. 2000; 54: 499–518. pmid:11018136
  10. 10. Rangel-Chavez C., Galan-Vasquez E., & Martinez-Antonio A. Consensus architecture of promoters and transcription units in Escherichia coli: design principles for synthetic biology. Molecular bioSystems. 2017; 13(4), 665–676. pmid:28256660
  11. 11. Blatter EE, Ross W, Tang H, Gourse RL, Ebright RH. Domain organization of RNA polymerase α subunit: C-terminal 85 amino acids constitute a domain capable of dimerization and DNA binding. Cell. 1994; 78(5): 889–896. pmid:8087855
  12. 12. Ross W, Ernst A, Gourse RL. Fine structure of E. coli RNA polymerase-promoter interactions: α subunit binding to the UP element minor groove. Genes Dev. 2001; 15(5): 491–506. pmid:11238372
  13. 13. Busby S, Ebright RH. Promoter structure, promoter recognition and transcription activation in prokaryotes. Cell. 1994; 79(5): 743–746. pmid:8001112
  14. 14. Murakami K., Owens J. T., Belyaeva T. A., Meares C. F., Busby S. J., & Ishihama A. Positioning of two alpha subunit carboxy-terminal domains of RNA polymerase at promoters by two transcription factors. Proceedings of the National Academy of Sciences. 1997; 94(21), 11274–11278. pmid:9326599
  15. 15. Belyaeva T. A., Rhodius V. A., Webster C. L., & Busby S. J. Transcription activation at promoters carrying tandem DNA sites for the Escherichia coli cyclic AMP receptor protein: organisation of the RNA polymerase α subunits. Journal of molecular biology. 1998; 277(4), 789–804. pmid:9545373
  16. 16. Bertoni G., Fujita N., Ishihama A., & de Lorenzo V. Active recruitment of σ54‐RNA polymerase to the Pu promoter of Pseudomonas putida: role of IHF and αCTD. The EMBO journal. 1998; 17(17), 5120–5128. pmid:9724648
  17. 17. McLeod S. M., Aiyar S. E., Gourse R. L., & Johnson R. C. The C-terminal domains of the RNA polymerase α subunits: contact site with Fis and localization during co-activation with CRP at the Escherichia coli proP P2 promoter. Journal of molecular biology. 2002; 316(3), 517–529. pmid:11866515
  18. 18. Cramer P., Bushnell D. A., & Kornberg R. D. Structural basis of transcription: RNA polymerase II at 2.8 Ångstrom resolution. Science. 2001; 292(5523): 1863–1876. pmid:11313498
  19. 19. Geszvain K., Gruber T. M., Mooney R. A., Gross C. A., & Landick R. A hydrophobic patch on the flap-tip helix of E. coli RNA polymerase mediates σ70 region 4 function. Journal of molecular biology. 2004; 343(3): 569–587. pmid:15465046
  20. 20. Vassylyev D. G., Sekine S. I., Laptenko O., Lee J., Vassylyeva M. N., Borukhov S., et al. Crystal structure of a bacterial RNA polymerase holoenzyme at 2.6 Å resolution. Nature. 2002; 417(6890): 712–719. pmid:12000971
  21. 21. Young B. A., Anthony L. C., Gruber T. M., Arthur T. M., Heyduk E., Lu C. Z., et al. A coiled-coil from the RNA polymerase β′ subunit allosterically induces selective nontemplate strand binding by σ70. Cell. 2001; 105(7): 935–944. pmid:11439189
  22. 22. Bae B., Davis E., Brown D., Campbell E. A., Wigneshweraraj S., & Darst S. A. Phage T7 Gp2 inhibition of Escherichia coli RNA polymerase involves misappropriation of σ70 domain 1.1. Proceedings of the National Academy of Sciences. 2013; 110(49): 19772–19777. pmid:24218560
  23. 23. Camarero J. A., Shekhtman A., Campbell E. A., Chlenov M., Gruber T. M., Bryant D. A., et al. Autoregulation of a bacterial σ factor explored by using segmental isotopic labeling and NMR. Proceedings of the National Academy of Sciences. 2002; 99(13): 8536–8541. pmid:12084914
  24. 24. Chan CL, Lonetto MA, Gross CA. Sigma domain structure: one down, one to go. Structure. 1996; 4(11): 1235–1238.
  25. 25. Waldburger C., Gardella T., Wong R., & Susskind M. M. Changes in conserved region 2 of Escherichia coli σ70 affecting promoter recognition. Journal of molecular biology. 1990; 215(2): 267–276. pmid:2213883
  26. 26. Siegel D. A., Hu J. C., Walter W. A., & Gross C. A. Altered promoter recognition by mutant forms of the σ70 subunit of Escherichia coli RNA polymerase. Journal of molecular biology. 1989; 206(4): 591–603. pmid:2661828
  27. 27. Lesley S. A., & Burgess R. R. Characterization of the Escherichia coli transcription factor σ70: localization of a region involved in the interaction with core RNA polymerase. Biochemistry. 1989; 28(19): 7728–7734. pmid:2692703
  28. 28. Malhotra A., Severinova E., & Darst S. A. Crystal structure of a σ70 subunit fragment from E. coli RNA polymerase. Cell. 1996; 87(1): 127–136. pmid:8858155
  29. 29. Zuo Y., & Steitz T. A. Crystal structures of the E. coli transcription initiation complexes with a complete bubble. Molecular cell. 2015; 58(3), 534–540. pmid:25866247
  30. 30. Kuznedelov K., Minakhin L., Niedziela-Majka A., Dove S. L., Rogulja D., Nickels B. E., et al. A role for interaction of the RNA polymerase flap domain with the σ subunit in promoter recognition. Science. 2002; 295(5556): 855–857. pmid:11823642
  31. 31. Eleftherios I., Atri J., Accetta J., & Castillo J. C. Endosymbiotic bacteria in insects: guardians of the immune system? Frontiers in physiology. 2013; 4, 46. pmid:23508299
  32. 32. McCutcheon JP, Moran NA. Extreme genome reduction in symbiotic bacteria. Nat Rev Microbiol. 2012; 10 (1): 13–26.
  33. 33. Blattner F. R., Plunkett G., Bloch C. A., Perna N. T., Burland V., Riley M., et al. The complete genome sequence of Escherichia coli K-12. Science. 1997; 277(5331), 1453–1462. pmid:9278503
  34. 34. Sloan D. B., & Moran N. A. Genome reduction and co-evolution between the primary and secondary bacterial symbionts of psyllids. Molecular biology and evolution. 2012; 29(12), 3781–3792. pmid:22821013
  35. 35. Tamames J, Gil R, Latorre A, Peretó J, Silva FJ, Moya A. The frontier between cell and organelle: genome analysis of Candidatus Carsonella ruddii. BMC Evol Biol. 2007; 7: 181. pmid:17908294
  36. 36. Galán-Vásquez E, Sánchez-Osorio I, Martínez-Antonio A. Transcription Factors Exhibit Differential Conservation in Bacteria with Reduced Genomes. PLOS ONE. 2016; 11(1): e0146901. pmid:26766575
  37. 37. Lukasik P., Nazario K., Van Leuven J. T., Campbell M. A., Meyer M., Michalik A., et al. Multiple origins of interdependent endosymbiotic complexes in a genus of cicadas. Proceedings of the National Academy of Sciences. 2018. 115(2): E226–E235.
  38. 38. Notredame C., Higgins D.G., Heringa J. T-Coffee: A novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 2000; 302(1): 205–217. pmid:10964570
  39. 39. Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 2016; 44(D1): D279–85. pmid:26673716
  40. 40. Gough J, Karplus K, Hughey R, Chothia C. Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure. J Mol Biol. 2001; 313(4): 903–19. pmid:11697912
  41. 41. Marchler-Bauer A., Bao Y., Han L., He J., Lanczycki C. J., Lu S., et al. CDD/SPARKLE: functional classification of proteins via subfamily domain architectures. Nucleic acids research. 2016; 45(D1): D200–D203. pmid:27899674
  42. 42. Murakami K. S., Masuda S., Campbell E. A., Muzzin O., & Darst S. A. Structural basis of transcription initiation: an RNA polymerase holoenzyme-DNA complex. Science. 2002; 296(5571): 1285–1290. pmid:12016307
  43. 43. Ebright RH, Busby S. The Escherichia coli RNA polymerase α subunit: structure and function. Curr Opin Genet Dev. 1995; 5(2): 197–203. pmid:7613089
  44. 44. Zhang G, Darst SA. Structure of the Escherichia coli RNA polymerase α subunit amino-terminal domain. Science. 1998; 281(5374): 262–266. pmid:9657722
  45. Kenney T. J., York K., Youngman P., & Moran C. P. Genetic evidence that RNA polymerase associated with sigma A factor uses a sporulation-specific promoter in Bacillus subtilis. Proceedings of the National Academy of Sciences. 1989; 86(23): 9109–9113. pmid:2512576
  46. Zhang Y. I-TASSER server for protein 3D structure prediction. BMC bioinformatics. 2008; 9(1): 40. pmid:18215316
  47. Zuo Y., & Steitz T. A. Crystal structures of the E. coli transcription initiation complexes with a complete bubble. Molecular cell. 2015; 58(3): 534–540. pmid:25866247
  48. The PyMOL Molecular Graphics System, Version 1.2r3pre, Schrödinger, LLC.
  49. Kozakov D., Hall D. R., Xia B., Porter K. A., Padhorny D., Yueh C., et al. The ClusPro web server for protein-protein docking. Nature protocols. 2017; 12(2): 255. pmid:28079879
  50. Vajda S, Yueh C, Beglov D, Bohnuud T, Mottarella SE, Xia B, et al. New additions to the ClusPro server motivated by CAPRI. Proteins: Structure, Function, and Bioinformatics. 2017; 85(3): 435–444.
  51. Kozakov D, Beglov D, Bohnuud T, Mottarella S, Xia B, Hall DR, et al. How good is automated protein docking? Proteins: Structure, Function, and Bioinformatics. 2013; 81(12): 2159–66.
  52. Yueh C, Hall DR, Xia B, Padhorny D, Kozakov D, Vajda S. ClusPro-DC: Dimer Classification by the Cluspro Server for Protein-Protein Docking. Journal of Molecular Biology. 2017; 429(3): 372–381. pmid:27771482
  53. Lensink M. F., Méndez R., & Wodak S. J. Docking and scoring protein complexes: CAPRI 3rd Edition. Proteins: Structure, Function, and Bioinformatics. 2007; 69(4): 704–718.
  54. Xiong P., Zhang C., Zheng W., & Zhang Y. BindProfX: assessing mutation-induced binding affinity change by protein interface profiles with pseudo-counts. Journal of molecular biology. 2017; 429(3): 426–434. pmid:27899282
  55. Yang Z., & Nielsen R. Estimating Synonymous and Non-synonymous Substitution Rates Under Realistic Evolutionary Models. Molecular Biology and Evolution. 2000; 17(1): 32–43. pmid:10666704
  56. Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Molecular biology and evolution. 2007; 24(8): 1586–1591. pmid:17483113
  57. Suyama M., Torrents D., & Bork P. PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic acids research. 2006; 34(Web Server issue): W609–W612. pmid:16845082
  58. Guindon S., Dufayard J. F., Lefort V., Anisimova M., Hordijk W., & Gascuel O. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Systematic biology. 2010; 59(3): 307–321. pmid:20525638
  59. Notredame C., Higgins D.G., Heringa J. T-Coffee: A novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 2000; 302(1): 205–217. pmid:10964570
  60. Castresana J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Molecular Biology and Evolution. 2000; 17: 540–552. pmid:10742046
  61. Yang Z., & Nielsen R. Synonymous and non-synonymous rate variation in nuclear genes of mammals. Journal of molecular evolution. 1998; 46(4): 409–418. pmid:9541535
  62. Zhang J., Nielsen R., & Yang Z. Evaluation of an improved branch-site likelihood method for detecting positive selection at the molecular level. Molecular biology and evolution. 2005; 22(12): 2472–2479. pmid:16107592
  63. Yang Z., Wong W. S., & Nielsen R. Bayes empirical Bayes inference of amino acid sites under positive selection. Molecular biology and evolution. 2005; 22(4): 1107–1118. pmid:15689528
  64. Kimura M., Fujita N., & Ishihama A. Functional map of the α subunit of Escherichia coli RNA polymerase: deletion analysis of the amino-terminal assembly domain. Journal of molecular biology. 1994; 242(2): 107–115. pmid:8089834
  65. Igarashi K., & Ishihama A. Bipartite functional map of the E. coli RNA polymerase α subunit: involvement of the C-terminal region in transcription activation by cAMP-CRP. Cell. 1991; 65(6): 1015–1022. pmid:1646077
  66. Hayward R. S., Igarashi K., & Ishihama A. Functional specialization within the α-subunit of Escherichia coli RNA polymerase. Journal of molecular biology. 1991; 221(1): 23–29. pmid:1920407
  67. Ross W, Gosink KK, Salomon J, Igarashi K, Zou C, Ishihama A, et al. A third recognition element in bacterial promoters: DNA binding by the α subunit of RNA polymerase. Science. 1993; 262(5138): 1407–1413. pmid:8248780
  68. Nelson W. C., & Stegen J. C. The reduced genomes of Parcubacteria (OD1) contain signatures of a symbiotic lifestyle. Frontiers in microbiology. 2015; 6: 713. pmid:26257709
  69. Sheveleva E. V., Giordani N. V., & Hallick R. B. Identification and comparative analysis of the chloroplast α-subunit gene of DNA-dependent RNA polymerase from seven Euglena species. Nucleic acids research. 2002; 30(5): 1247–1254. pmid:11861918
  70. Hirata A., & Murakami K. S. Archaeal RNA polymerase. Current opinion in structural biology. 2009; 19(6): 724–731. pmid:19880312
  71. Dufresne A., Garczarek L. & Partensky F. Accelerated evolution associated with genome reduction in a free-living prokaryote. Genome Biol. 2005; 6(2): R14. pmid:15693943
  72. Husnik F., Nikoh N., Koga R., Ross L., Duncan R. P., Fujie M., et al. Horizontal gene transfer from diverse bacteria to an insect genome enables a tripartite nested mealybug symbiosis. Cell. 2013; 153(7): 1567–1578. pmid:23791183
  73. Luscombe N. M., Laskowski R. A., & Thornton J. M. Amino acid-base interactions: a three-dimensional analysis of protein–DNA interactions at an atomic level. Nucleic acids research. 2001; 29(13): 2860–2874. pmid:11433033
  74. Clark M. A., Baumann L., Thao M. L., Moran N. A., & Baumann P. Degenerative minimalism in the genome of a psyllid endosymbiont. Journal of Bacteriology. 2001; 183(6): 1853–1861. pmid:11222582
  75. Lim K., Furuta Y., & Kobayashi I. Large variations in bacterial ribosomal RNA genes. Molecular biology and evolution. 2012; 29(10): 2937–2948. pmid:22446745
  76. Browning DF, Busby SJ. The regulation of bacterial transcription initiation. Nat Rev Microbiol. 2004; 2(1): 57–65. pmid:15035009