HaloTag7: A genetically engineered tag that enhances bacterial expression of soluble proteins and improves protein purification
Introduction
Recombinant DNA technologies have greatly expanded our access to the broad diversity of proteins represented in living organisms, and potentially to an even broader range of mutant proteins not found in nature. Accordingly, the expression and purification of recombinant proteins have become fundamental to many aspects of life science research. Yet, owing to complexities in protein structures and interactions within the host organism, success with these techniques often remains frustratingly elusive. Successful purification of functional proteins generally requires efficient expression of these proteins in soluble form followed by their separation from the highly complex crude lysate of the host. The most frequently used host for protein expression is Escherichia coli due to its ease of use, rapid cell growth, low cost of culturing and well-documented protocols [1], [2]. However, over-expression of heterologous proteins in E. coli, particularly human proteins, often yields inadequate levels of soluble protein [1], [2], [3], [4].
One approach for overcoming this limitation is to optimize expression conditions such as temperature, growth media, induction parameters, promoters and E. coli expression strain [1], [5]. Systematic screening of such variables can be simplified by using reporter fusion tags such as GFP [6], [7] or S-tag [8]. Another common strategy is to use solubility fusion tags for boosting expression of soluble protein, presumably by promoting proper folding of the fusion partner and suppressing proteolysis [1], [9]. A variety of different solubility tags are available, yet not all are equally efficient as solubility enhancers. The most commonly used include GST [10], [11], TRX [12], MBP [13], [14] and NusA [15], [16].
Once adequate expression of soluble protein is achieved, the next step is to purify the target protein from the biological mixture. Affinity tags are widely used to simplify the purification process and to provide a generic method that is straightforward and adaptable to all target proteins. Many affinity tags have been developed, ranging in size from a few amino acids to entire proteins, that are capable of selective interaction with their corresponding ligands coupled to a chromatography matrix [17]. His6Tag [18], [19] is most widely used due to its small size and ability to frequently provide sufficient protein yield and purity for many applications. Other common affinity tags include GST and MBP, which are often favored for their additional ability to enhance protein solubility [3], [4], [20].
Although appending a fusion tag onto a protein of interest can improve expression and purification, the tag can also interfere with protein structure or function [13], [21], [22]. Consequently, it is commonly recommended to remove the tag after purification [18], [21], [22]. Tag removal can be performed by proteolytic cleavage at a defined sequence in the interconnecting polypeptide, i.e., the linker separating the tag and the target protein [18], [23], [24], [25]. This approach can be problematic due to non-specific or inefficient cleavage or loss of protein stability and solubility following tag removal [25], [26], [27]. Furthermore, this step often requires additional effort to separate the free target protein from the affinity tag and the protease.
While a variety of fusion tags is available to facilitate aspects of protein expression, solubility, detection, or purification, most tags are lacking or inefficient in some of these features. Many proteins are poorly expressed with available solubility tags, or are difficult to purify with existing affinity technologies due to low binding onto the purification matrix [25], [26], [28], [29]. These shortcomings are addressed by a new tag, HaloTag7, designed to support efficient expression of soluble protein and bind rapidly and covalently to a unique synthetic ligand. HaloTag7 is a catalytically inactive derivative of DhaA, a bacterial haloalkane dehalogenase from Rhodococcus [30] present only among selected microbial groups. This 34 kDa monomeric protein was engineered through rational design and molecular evolution to rapidly form a covalent attachment to synthetic chloroalkane ligands [31], [32], and to provide enhanced expression and solubility when fused to a protein partner.
The synthetic ligands comprise a chloroalkane linker attached to a variety of functional groups including fluorophores, affinity handles and solid supports. These features enable both fluorescent labeling of fusion proteins in cell lysate for expression screening and irreversible capture of fusion proteins onto a purification matrix. The rapid, specific, and covalent capture offered by HaloTag7 overcomes the inherent limitation of affinity tags by effectively eliminating protein loss associated with equilibrium-based binding. This feature is especially important for purification of poorly expressed proteins. Following immobilization on the purification matrix, the target protein can be released by cleavage at an optimized TEV protease recognition site contained within the interconnecting polypeptide separating HaloTag7 and the fusion partner. The HaloTag7-based protein purification method yields highly pure proteins in solution while the fusion tag remains covalently attached to the matrix, eliminating contamination by free tag or un-cleaved fusion protein.
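The contrast between equilibrium-based and covalent capture can be illustrated with a minimal numerical sketch. It is not taken from this study: it assumes simple 1:1 reversible binding for a conventional affinity tag, pseudo-first-order irreversible capture with ligand in excess for a covalent tag, and washes that each re-equilibrate in the same volume. All function names and parameter values (concentrations, dissociation constant, rate constant) are hypothetical and chosen only for illustration.

```python
import math


def equilibrium_fraction_bound(p_total, l_total, kd):
    """Fraction of protein bound at equilibrium for 1:1 reversible binding.

    Uses the exact quadratic solution for the complex concentration.
    All concentrations and kd are in the same units (e.g. molar).
    """
    s = p_total + l_total + kd
    complex_conc = (s - math.sqrt(s * s - 4.0 * p_total * l_total)) / 2.0
    return complex_conc / p_total


def retained_after_washes(p_total, l_total, kd, n_washes):
    """Fraction of the input protein still bound after binding plus n washes.

    Assumption (illustrative only): every wash reaches a new equilibrium in
    the same volume and the unbound protein is discarded each time.
    """
    remaining = p_total
    for _ in range(n_washes + 1):  # initial binding step, then each wash
        remaining *= equilibrium_fraction_bound(remaining, l_total, kd)
    return remaining / p_total


def covalent_fraction_captured(k_app, l_total, t):
    """Fraction captured by irreversible binding with ligand in excess.

    Pseudo-first-order kinetics: k_app in 1/(concentration * time),
    l_total in concentration units, t in time units.
    """
    return 1.0 - math.exp(-k_app * l_total * t)


if __name__ == "__main__":
    # Hypothetical values: dilute protein, ligand in large excess.
    protein, ligand, kd = 1e-9, 1e-6, 1e-6
    print("equilibrium, after binding:", equilibrium_fraction_bound(protein, ligand, kd))
    print("equilibrium, after 2 washes:", retained_after_washes(protein, ligand, kd, 2))
    print("covalent, after 10 time units:", covalent_fraction_captured(1e6, ligand, 10.0))
```

Under these assumptions, the reversibly bound fraction is capped by the ratio of ligand concentration to dissociation constant and shrinks with every wash, whereas the covalently captured fraction approaches completion with time and is not lost on washing.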
To demonstrate the efficacy of HaloTag7 for protein expression and purification with E. coli, we compared its performance to the commonly used affinity tags, GST, MBP, and His6Tag (see Table 1 for tag characteristics). We chose GST and MBP as they are used in a manner similar to HaloTag7; both promote expression of soluble protein in E. coli and both provide a means for protein purification. Although His6Tag does not assist in protein expression or solubilization, this tag was also chosen because of its widespread use for protein purification. The relative performance of these tags was evaluated using a panel of cDNA clones encoding 23 human proteins that are difficult to express in E. coli [33]. The set of proteins ranges broadly in both size (∼9–155 kDa) and function (e.g. kinases, membrane proteins and transcription factors). Our results showed that HaloTag7 delivered superior performance for protein expression, solubility, purification yield and purity. Furthermore, using two additional model proteins, we found that HaloTag7 produced proteins with higher specific activity.
Bacterial strain and materials
Single Step E. coli KRX ([F′, traD36, ΔompP, proA+B+, lacIq, Δ(lacZ)M15] ΔompT, endA1, recA1, gyrA96 (Nalr), thi-1, hsdR17 (rk−, mk+), e14− (McrA−), relA1, supE44, Δ(lac-proAB), Δ(rhaBAD)::T7 RNA polymerase) [34] (Promega, Madison, WI) was used for both cloning and expression. Precision Plus protein MW markers were from Bio-Rad (Hercules, CA). All enzymes and other reagents were from Promega unless otherwise noted.
Expression vectors
Bacterial T7 promoter-based Flexi vectors pFN18K and pFN2K expressing HaloTag7 and
Assembly of test constructs
The test panel for the comparison of HaloTag7, GST, MBP and His6Tag contained full-length cDNAs encoding 23 human proteins that were previously shown to express poorly in E. coli in the absence of a tag [33]. The test panel sequences, summarized in Table 2, represent proteins of varying size (∼9–155 kDa) and function (e.g. kinases, membrane proteins, and transcription factors). These 23 coding regions, previously available as Flexi vector clones [33], were transferred to four different Flexi
Discussion
Purification of functional recombinant proteins requires the ability to express adequate levels of soluble protein in an appropriate host and then efficiently isolate the protein to homogeneity. Although many protein fusion tags are available to assist in this process, none is ideal when applied to the diversity of proteins routinely studied. Moreover, available tags are often better suited to specific aspects of the overall process, such as expression, solubilization, protein capture,
Acknowledgments
We thank Nidhi Nath and Jim Hartnett for help with experiments and Robin Hurst for helpful discussion.
References (50)
Recombinant protein expression in Escherichia coli. Curr. Opin. Biotechnol. (1999)
et al. Systematic optimization of active protein expression using GFP as a folding reporter. Protein Expr. Purif. (2004)
et al. Gene fusion expression systems in Escherichia coli. Curr. Opin. Biotechnol. (1995)
Generating fusions to glutathione S-transferase for protein studies. Methods Enzymol. (2000)
et al. Single-step purification of polypeptides expressed in Escherichia coli as fusions with glutathione S-transferase. Gene (1988)
et al. Thioredoxin as a fusion partner for production of soluble recombinant proteins in Escherichia coli. Methods Enzymol. (2000)
et al. Fusions to maltose-binding protein: control of folding and solubility in protein purification. Methods Enzymol. (2000)
et al. The solubility and stability of recombinant proteins are increased by their fusion to NusA. Biochem. Biophys. Res. Commun. (2004)
et al. Affinity fusion strategies for detection, purification, and immobilization of recombinant proteins. Protein Expr. Purif. (1997)
et al. A novel purification method for histidine-tagged proteins containing a thrombin cleavage site. Anal. Biochem. (2001)
Making the most of affinity tags. Trends Biotechnol.
Enhancement of soluble protein expression through the use of fusion tags. Curr. Opin. Biotechnol.
Solubility-enhancing proteins MBP and NusA play a passive role in the folding of their fusion partners. Protein Expr. Purif.
From gene to protein: a review of new and enabling technologies for multi-parallel protein expression. Protein Expr. Purif.
High-level expression of soluble protein in Escherichia coli using a His6-tag and maltose-binding-protein double-affinity fusion system. Protein Expr. Purif.
Evolving haloalkane dehalogenases. Curr. Opin. Chem. Biol.
High-throughput proteomics: protein expression and purification in the postgenomic world. Protein Expr. Purif.
The protein Id: a negative regulator of helix-loop-helix DNA binding proteins. Cell
One-step purification of recombinant proteins using a nanomolar-affinity streptavidin-binding peptide, the SBP-Tag. Protein Expr. Purif.
Investigating the effects of mutations on protein aggregation in the cell. J. Biol. Chem.
Expression and purification of SARS coronavirus proteins using SUMO-fusions. Protein Expr. Purif.
Protein production: feeding the crystallographers and NMR spectroscopists. Nat. Struct. Mol. Biol.
Proteome-scale purification of human proteins from bacteria. Proc. Natl. Acad. Sci. USA
Rapid screening for improved solubility of small human proteins produced as fusion proteins in Escherichia coli. Protein Sci.
Production of recombinant proteins in Escherichia coli. Genet. Mol. Biol.