Journal of Molecular Biology
Volume 341, Issue 5, 27 August 2004, Pages 1295-1315
 
doi:10.1016/j.jmb.2004.06.058
Copyright © 2004 Elsevier Ltd All rights reserved.

Estimating the Prevalence of Protein Sequences Adopting Functional Enzyme Folds

Douglas D. Axe

The Babraham Institute, Structural Biology Unit, Babraham Research Campus, Cambridge CB2 4AT, UK

Received 28 November 2003; 
revised 2 May 2004; 
accepted 18 June 2004. 
Edited by J. Thornton. 
Available online 14 July 2004.


Proteins employ a wide variety of folds to perform their biological functions. How are these folds first acquired? An important step toward answering this is to obtain an estimate of the overall prevalence of sequences adopting functional folds. Since tertiary structure is needed for a typical enzyme active site to form, one way to obtain this estimate is to measure the prevalence of sequences supporting a working active site. Although the immense number of sequence combinations makes wholly random sampling unfeasible, two key simplifications may provide a solution. First, given the importance of hydrophobic interactions to protein folding, it seems likely that the sample space can be restricted to sequences carrying the hydropathic signature of a known fold. Second, because folds are stabilized by the cooperative action of many local interactions distributed throughout the structure, the overall problem of fold stabilization may be viewed reasonably as a collection of coupled local problems. This enables the difficulty of the whole problem to be assessed by assessing the difficulty of several smaller problems. Using these simplifications, the difficulty of specifying a working β-lactamase domain is assessed here. An alignment of homologous domain sequences is used to deduce the pattern of hydropathic constraints along chains that form the domain fold. Starting with a weakly functional sequence carrying this signature, clusters of ten side-chains within the fold are replaced randomly, within the boundaries of the signature, and tested for function. The prevalence of low-level function in four such experiments indicates that roughly one in 10^64 signature-consistent sequences forms a working domain. Combined with the estimated prevalence of plausible hydropathic patterns (for any fold) and of relevant folds for particular functions, this implies the overall prevalence of sequences performing a specific function by any domain-sized fold may be as low as 1 in 10^77, adding to the body of evidence that functional folds require highly extraordinary sequences.
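
The arithmetic behind this kind of estimate can be illustrated with a minimal sketch. It is not the paper's actual procedure: the function names, the cluster fractions, and the domain length used below are hypothetical placeholders, and the extrapolation assumes that each ten-residue cluster is a representative and independent local problem, which is a simplification of the "coupled local problems" described above. All quantities are kept as log10 values because the probabilities involved are far too small to compare comfortably otherwise.

```python
import math


def log10_whole_fold_prevalence(cluster_fractions, cluster_size, domain_length):
    """Extrapolate per-cluster functional fractions to a whole domain (log10).

    Illustrative assumption: every position contributes the same average
    log-prevalence, and positions combine multiplicatively (independence).
    """
    # Convert each cluster's functional fraction to a per-position log10 value.
    per_position = [math.log10(f) / cluster_size for f in cluster_fractions]
    # Average across clusters, then scale up to the full domain length.
    mean_per_position = sum(per_position) / len(per_position)
    return mean_per_position * domain_length


def implied_extra_log10(log10_signature_level, log10_overall):
    """Combined log10 factor contributed by the remaining estimates."""
    return log10_overall - log10_signature_level


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
example = log10_whole_fold_prevalence([1e-4, 1e-5, 1e-3, 1e-4], 10, 150)
print(f"extrapolated log10 prevalence: {example:.1f}")

# Using only the two figures quoted in the abstract (10^-64 and 10^-77).
print("implied combined factor: 10^%d" % implied_extra_log10(-64, -77))
```

With the two figures quoted in the abstract, the second function shows that the estimates for plausible hydropathic patterns and relevant folds together account for a further factor of about 1 in 10^13.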

Keywords: functional constraints; sequence-function relationship; sequence-structure relationship; function landscape; sequence space

Abbreviations: MIC, minimum inhibitory concentration; indels, insertions and deletions

Article Outline

1. Introduction
2. Experimental Approach
3. Results and Discussion
3.1. Identification of lower-bound selection threshold
3.2. Homologous sequence alignment
3.3. Finding a reference sequence
3.4. The hydropathy signature as a plausible fold-specific pattern
3.5. Local side-chain randomization
3.6. Implications
4. Materials and Methods
4.1. Large-domain sequence alignment
4.2. Obtaining the hydropathy signature
4.3. Estimating the proportion of sequences carrying the signature
4.4. Plasmids and strains
4.5. Quantitative ampicillin selection protocol
4.6. Insertion mutagenesis
4.7. Production of the reference large-domain sequence
4.8. Local side-chain randomization
4.9. Isolation of functional set 3 variants
Acknowledgements
Appendix. Supplementary data
References










