Copyright © 2004 Elsevier Ltd. All rights reserved.
Estimating the Prevalence of Protein Sequences Adopting Functional Enzyme Folds
Received 28 November 2003;
Proteins employ a wide variety of folds to perform their biological functions. How are these folds first acquired? An important step toward answering this is to obtain an estimate of the overall prevalence of sequences adopting functional folds. Since tertiary structure is needed for a typical enzyme active site to form, one way to obtain this estimate is to measure the prevalence of sequences supporting a working active site. Although the immense number of sequence combinations makes wholly random sampling unfeasible, two key simplifications may provide a solution. First, given the importance of hydrophobic interactions to protein folding, it seems likely that the sample space can be restricted to sequences carrying the hydropathic signature of a known fold. Second, because folds are stabilized by the cooperative action of many local interactions distributed throughout the structure, the overall problem of fold stabilization may be viewed reasonably as a collection of coupled local problems. This enables the difficulty of the whole problem to be assessed by assessing the difficulty of several smaller problems. Using these simplifications, the difficulty of specifying a working β-lactamase domain is assessed here. An alignment of homologous domain sequences is used to deduce the pattern of hydropathic constraints along chains that form the domain fold. Starting with a weakly functional sequence carrying this signature, clusters of ten side-chains within the fold are replaced randomly, within the boundaries of the signature, and tested for function. The prevalence of low-level function in four such experiments indicates that roughly one in 10^64 signature-consistent sequences forms a working domain. Combined with the estimated prevalence of plausible hydropathic patterns (for any fold) and of relevant folds for particular functions, this implies the overall prevalence of sequences performing a specific function by any domain-sized fold may be as low as 1 in 10^77, adding to the body of evidence that functional folds require highly extraordinary sequences.
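As a rough illustration of how the two headline figures relate, the sketch below works in log10 units and reads them as 1 in 10^64 and 1 in 10^77. It assumes the contributing prevalences simply multiply, which the abstract does not state explicitly. The function and constant names are hypothetical, and the individual values of the two additional factors are not given here, so only their implied joint size is computed.

```python
def combine_log10(*log10_prevalences):
    """Multiply prevalences by summing their log10 values."""
    return sum(log10_prevalences)


# Figures quoted in the abstract, expressed as log10 prevalences.
LOG10_SIGNATURE_TO_FUNCTION = -64  # ~1 in 10^64 signature-consistent sequences work
LOG10_OVERALL = -77                # overall prevalence may be as low as ~1 in 10^77

# Assumption: the factors are independent and multiply, so the remaining
# factors (hydropathic-pattern prevalence and fold relevance) jointly
# account for the gap between the two figures.
log10_remaining = LOG10_OVERALL - LOG10_SIGNATURE_TO_FUNCTION

print(log10_remaining)  # -13
print(combine_log10(LOG10_SIGNATURE_TO_FUNCTION, log10_remaining))  # -77
```

Under that assumption, the two additional factors would together contribute about 13 orders of magnitude.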
Keywords: functional constraints; sequence-function relationship; sequence-structure relationship; function landscape; sequence space
Abbreviations: MIC, minimum inhibitory concentration; indels, insertions and deletions
Article Outline
- 1. Introduction
- 2. Experimental Approach
- 3. Results and Discussion
- 3.1. Identification of lower-bound selection threshold
- 3.2. Homologous sequence alignment
- 3.3. Finding a reference sequence
- 3.4. The hydropathy signature as a plausible fold-specific pattern
- 3.5. Local side-chain randomization
- 3.6. Implications
- 4. Materials and Methods
- 4.1. Large-domain sequence alignment
- 4.2. Obtaining the hydropathy signature
- 4.3. Estimating the proportion of sequences carrying the signature
- 4.4. Plasmids and strains
- 4.5. Quantitative ampicillin selection protocol
- 4.6. Insertion mutagenesis
- 4.7. Production of the reference large-domain sequence
- 4.8. Local side-chain randomization
- 4.9. Isolation of functional set 3 variants
- Acknowledgements
- Appendix. Supplementary data
- References












