Using multi-objective computational design to extend protein promiscuity
Introduction
Enzymes are natural protein catalysts that can enhance reaction rates by up to 23 orders of magnitude [1] with impressive affinity and/or specificity. Their potential applications are vast, and enzyme design promises a significant impact in areas such as medicine (e.g. treatment of neurodegenerative diseases [2], design of therapeutic antibodies that bind tumor-associated antigenic determinants while maintaining low immunogenicity, development of peptide-based vaccines), biotechnology (biosensors [3], [4], biocatalysts with activity in non-natural environments) and bioremediation (design of enzymes that would reduce waste by-products and toxicity [5]).
In the last few years, studies on enzyme evolvability have given convincing evidence on the mechanism of evolution of protein functions from cross-reactive proteins [6], [7], [8], [9], [10], starting from the observation that most proteins possess, besides their native function, additional promiscuous functions with specificities in the range of kcat/Km ∼ 10⁻²–10⁶ [10]. In the proposed mechanism, a weak promiscuous function arises due to neutral evolution, protein robustness (the ability of proteins to tolerate mutations without compromising fitness) and plasticity (the ability of gaining new functions by a reduced number of mutations). Under the right selection pressure, natural selection can improve the new function once it has arisen until, at some point, the protein may become specialized for the new function.
In the classic view, selection constraints on the native function were believed to be decisive in the process (if a given function appeared by natural selection, there must be a large penalty associated with the loss of that function), reducing the allowed evolution scenario to gene duplication at an early stage of the specialization. However, modern studies have shown that in most cases the coupling of the nascent and the original functions is smaller than expected (see [10] and references therein). This allows for several generations of “generalist” proteins [11] that are able to perform both functions, and suggests that gene duplication acts after the new function has appeared [8] and not before.
Proteins with several activities (multipurpose enzymes) may have a wide range of biotechnological applications related (but not limited) to industrial organic synthesis and metabolic engineering [12], [13], [14], [15], [16], [17]. However, natural protein promiscuity (as described in the two preceding paragraphs) may perhaps be of limited use in the development of these multipurpose catalysts. First of all, natural promiscuous activities are often related to the evolved ones, sharing the same active sites and even the basic chemical mechanisms and, in general, bearing a significant resemblance to the original function [10]. Secondly, development of the promiscuous activity upon suitable mutations usually brings about a decrease in the evolved activity (see Table 2 in [10]). We explore in this work the use of computational design to overcome these limitations.
We have selected thioredoxin from E. coli (PDB code 2trx, 1.68 Å resolution [18]) as the scaffold for our studies. It is a small (108 residues) general disulphide oxidoreductase found in all the kingdoms of living organisms; it is a common model for protein design studies because of its high stability and good expression properties [19]. We aim at introducing an esterase activity [the nucleophilic hydrolysis of p-nitrophenyl acetate (PNPA) into p-nitrophenol (PNP) and acetate] in E. coli thioredoxin, following an approach similar to that used in ref. [20]. Unlike that previous study [20], however, we intend to preserve the natural thioredoxin activity (see below for details).
Natural promiscuous activities appear to be shaped by residues at the “wall and perimeter” of the native active site [10]. These residues show high plasticity, likely because they do not belong to the protein scaffold or the native catalytic machinery [10], and they provide a suitable target for the introduction of new, non-natural promiscuous activities [21], [22]. However, designing a new active site implies introducing “unsatisfied” destabilizing interactions (which will hopefully be satisfied upon ligand and/or transition-state binding). Designing a new active site in close proximity to the native one poses the additional problem that the introduced destabilizing interactions may disrupt the native active site and affect the original activity. Our computational design approach, therefore, is based on a multi-objective optimization. A measure of protein stability and a measure of de novo catalytic activity are optimized simultaneously by using two competing score functions, one for the folding free energy and one for the binding free energy of the protein-ligand (i.e. transition-state-model) complex, yielding the Pareto Set [23] of optimal stability/promiscuous-function solutions. The goal of this procedure is the development of the new activity with the lowest possible stability cost, which has two advantages. First of all, the need to introduce additional stabilizing mutations (to compensate for the destabilizing effect of the new active-site mutations) is minimized. This is an important point, since the original (native) active site would be one obvious target for stabilizing mutations (active-site residues are optimized for function, not for stability) and, in this case, we aim at preserving the original activity. Indeed, in a previous work [20], the catalytic D26 residue was mutated to isoleucine, a change which enhances stability but which will impair the natural thioredoxin oxidoreductase activity.
Secondly, and most importantly, since our designed active site is spatially close to the original one, the low-stability–cost strategy guarantees the minimal perturbation required for maintaining the original activity.
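The notion of a Pareto Set of stability/function solutions can be illustrated with a minimal Python sketch. It is not the DESIGNER procedure: the `Design` record, the candidate names and the numerical scores are hypothetical, and both scores are assumed to be "lower is better" (a folding score where lower means more stable, and a transition-state binding score where lower means tighter binding). A candidate belongs to the Pareto Set if no other candidate is at least as good in both objectives and strictly better in one.

```python
from typing import NamedTuple


class Design(NamedTuple):
    name: str             # hypothetical label for a candidate sequence
    folding_score: float  # stability objective; lower = more stable (assumed convention)
    binding_score: float  # transition-state binding objective; lower = better (assumed)


def dominates(a: Design, b: Design) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    no_worse = (a.folding_score <= b.folding_score
                and a.binding_score <= b.binding_score)
    strictly_better = (a.folding_score < b.folding_score
                       or a.binding_score < b.binding_score)
    return no_worse and strictly_better


def pareto_set(designs: list[Design]) -> list[Design]:
    """Return the non-dominated designs, preserving input order."""
    return [d for d in designs
            if not any(dominates(other, d) for other in designs)]


# Illustrative, made-up scores (arbitrary energy units).
designs = [
    Design("A", -10.0, -2.0),
    Design("B", -8.0, -5.0),
    Design("C", -7.0, -4.0),  # dominated by B
    Design("D", -3.0, -9.0),
    Design("E", -9.0, -1.0),  # dominated by A
]

front = pareto_set(designs)
print([d.name for d in front])  # ['A', 'B', 'D']
```

In this toy set, A is the most stable design, D binds the transition-state model best, and B is a compromise; choosing among them is the trade-off that a low-stability-cost strategy resolves in favor of stability.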
Our computational design approach is based upon the DESIGNER software [24], [25], which optimizes protein sequence for a given target structure. This procedure uses atomic models and rotamer libraries to represent side-chain conformations. The free energies of the different models are calculated with the CHARMM force-field [26] and a free-energy solvation term proportional to the surface area [27]. Calculations on a reference state (taken to represent the unfolded state) allow a quantity akin to the unfolding free energy to be computed for each model.
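The stability-like score described above can be summarized schematically as follows. The notation is an assumption introduced here for illustration and is not taken from the DESIGNER papers: $E_{\mathrm{FF}}$ stands for the CHARMM force-field energy of a model, $A$ for its surface area, $\sigma$ for a solvation proportionality constant (which in practice may depend on atom type), and $G_{\mathrm{ref}}$ for the same quantity evaluated on the reference state taken to represent the unfolded protein.

```latex
\begin{align}
  G_{\mathrm{model}} &\approx E_{\mathrm{FF}} + \sigma A \\
  \Delta G_{\mathrm{unfold}} &\approx G_{\mathrm{ref}} - G_{\mathrm{model}}
\end{align}
```

With this sign convention, a larger $\Delta G_{\mathrm{unfold}}$ corresponds to a more stable designed sequence.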
The DESIGNER program was originally developed to address the inverse folding problem. However, as we show in this work, it can also be used to design new active-sites. This requires that the design is approached as an optimization in sequence-space of a protein structure which includes a model of the transition state of the chemical reaction (the tetrahedral PNPA intermediate in the case of interest here) with reference to the structure without the transition-state model bound, thus yielding a quantity akin to the free energy barrier of the reaction. Furthermore, since DESIGNER also leads to a quantity akin to the unfolding free-energy, the multi-objective stability-function optimization is indeed feasible.
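In schematic form, and with symbols that are assumptions made here rather than the authors' notation, the two quantities can be written for a sequence $s$ as below. $G_{\mathrm{P\cdot TS}}(s)$ denotes the model free energy of the structure with the transition-state model included, $G_{\mathrm{P}}(s)$ the same structure without it, and $G_{\mathrm{ref}}(s)$ the reference (unfolded-like) state.

```latex
\begin{align}
  \Delta G^{\ddagger}_{\mathrm{like}}(s) &\approx G_{\mathrm{P\cdot TS}}(s) - G_{\mathrm{P}}(s) \\
  \Delta G_{\mathrm{unfold}}(s) &\approx G_{\mathrm{ref}}(s) - G_{\mathrm{P}}(s)
\end{align}
```

Under these assumed conventions, a stability-function optimization looks for sequences that keep $\Delta G_{\mathrm{unfold}}(s)$ high while making $\Delta G^{\ddagger}_{\mathrm{like}}(s)$ as low as possible; because the two goals compete, the result is a set of trade-off solutions rather than a single optimum.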
General approach
We aim at introducing a promiscuous esterase activity in E. coli thioredoxin: the nucleophilic hydrolysis of p-nitrophenyl acetate (PNPA) into p-nitrophenol (PNP) and acetate. The general approach we use is similar to that described in ref. [20]; i.e., we choose a histidine residue as the nucleophile for the reaction and we model the tetrahedral transition state for the reaction as a PNPA-histidine structure constructed as a generalized rotamer of the histidine. An initial exploration of the
Multi-objective optimization and construction of the Pareto Set
We approach enzyme design as a two-objective problem. We consider as one objective the effect on stability of the designed region, which is coupled to the primary function of thioredoxin due to spatial proximity. It is estimated from an approximation to the folding free energy as defined previously [24] and does not involve the presence of the PNPA tetrahedral intermediate. The other objective is a quantity akin to the activation free energy of the reaction which is estimated from calculations
Acknowledgments
P.T. acknowledges an EMBO long-term fellowship. A.J. acknowledges support from HPC-EUROPA2 (project 228398) and the use of the BSC and IDRIS supercomputer facilities to perform the calculations reported here. This research was supported by Feder Funds and Grant BIO2006-07332 (Spanish Ministry of Education and Science) to J.M.S.-R., Grant CVI-1668 (Junta de Andalucía) to B.I.-M., and Grants BioModularH2 (FP6-NEST-043340), ATIGE (Genopole) and TARPOL (FP7-KBBE-212894) to A.J.
References (42)
- et al. Catalytic promiscuity and the evolution of new enzymatic activities. Chem. Biol. (1999)
- Enzymes with extra talents: moonlighting functions and catalytic promiscuity. Curr. Opin. Chem. Biol. (2003)
- et al. Enzyme promiscuity: evolutionary and mechanistic aspects. Curr. Opin. Chem. Biol. (2006)
- et al. How does an enzyme evolved in vitro compare to naturally occurring homologs possessing the targeted function? Tyrosine aminotransferase from aspartate aminotransferase. J. Mol. Biol. (2003)
- Enhancing catalytic promiscuity for biocatalysis. Curr. Opin. Chem. Biol. (2005)
- et al. Mechanistic studies on the alkyltransferase activity of serotonin N-acetyl transferase. Chem. Biol. (2001)
- et al. Crystal structure of thioredoxin from Escherichia coli at 1.68 Å resolution. J. Mol. Biol. (1990)
- et al. Automatic protein design with all atom force-fields by exact and heuristic optimization. J. Mol. Biol. (2000)
- et al. Backbone-dependent rotamer library for proteins: application to side-chain prediction. J. Mol. Biol. (1993)
- et al. The efficiency of different salts to screen charge interactions in proteins: a Hofmeister effect? Biophys. J. (2004)
- Thioredoxin catalyzes the reduction of insulin disulfides by dithiothreitol and dihydrolipoamide. J. Biol. Chem.
- Some factors in the interpretation of protein denaturation. Adv. Protein Chem.
- Challenges in enzyme mechanism and energetics. Annu. Rev. Biochem.
- The rational design of allosteric interactions in a monomeric protein and its applications to the construction of biosensors. Proc. Natl. Acad. Sci. U. S. A.
- Computational design of receptor and sensor proteins with novel functions. Nature
- Recent advances in the bioremediation of persistent organic pollutants via biomolecular engineering. Enzyme Microb. Technol.
- Enzyme recruitment in evolution of new function. Annu. Rev. Microbiol.
- The ‘evolvability’ of promiscuous protein functions. Nat. Genet.
- Directed evolution of a (βα)8-barrel enzyme to catalyze related reactions in two different metabolic pathways. Proc. Natl. Acad. Sci. U. S. A.
- Directed evolution of toluene ortho-monooxygenase for enhanced 1-naphthol synthesis and chlorinated ethene degradation. J. Bacteriol.
1 Present address: Department of Chemistry and Biochemistry and Center for Biomolecular Structure and Organization, University of Maryland, College Park, Maryland 20742, USA.