Journal of Molecular Biology
Volume 378, Issue 3, 2 May 2008, Pages 749-758
 
doi:10.1016/j.jmb.2008.02.043
Copyright © 2008 Elsevier Ltd All rights reserved.

Assessing Side-Chain Perturbations of the Protein Backbone: A Knowledge-Based Classification of Residue Ramachandran Space

David B. Dahl (1), Zach Bohannan (2), Qianxing Mo (3), Marina Vannucci (4) and Jerry Tsai (5, corresponding author)

(1) Department of Statistics, Texas A&M University, College Station, TX 77843, USA
(2) Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA 94720, USA
(3) Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center, New York, NY 10021, USA
(4) Department of Statistics, Rice University, Houston, TX 77251, USA
(5) Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX 77843, USA

Received 8 November 2007; 
revised 20 February 2008; 
accepted 21 February 2008. 
Edited by M. Sternberg. 
Available online 29 February 2008.


Abstract

Grouping the 20 residues is a classic strategy to discover ordered patterns and insights about the fundamental nature of proteins, their structure, and how they fold. Usually, this categorization is based on the biophysical and/or structural properties of a residue's side-chain group. We extend this approach to understand the effects of side chains on backbone conformation and to perform a knowledge-based classification of amino acids by comparing their backbone φ,ψ distributions in different types of secondary structure. At this finer, more specific resolution, torsion angle data are often sparse and discontinuous (especially for nonhelical classes) even though a comprehensive set of protein structures is used. To ensure the precision of Ramachandran plot comparisons, we applied a rigorous Bayesian density estimation method that produces continuous estimates of the backbone φ,ψ distributions. Based on this statistical modeling, a robust hierarchical clustering was performed using a divergence score to measure the similarity between plots. There were seven general groups based on the clusters from the complete Ramachandran data: nonpolar/β-branched (Ile and Val), AsX (Asn and Asp), long (Met, Gln, Arg, Glu, Lys, and Leu), aromatic (Phe, Tyr, His, and Cys), small (Ala and Ser), bulky (Thr and Trp), and, lastly, the singletons of Gly and Pro. At the level of secondary structure (helix, sheet, turn, and coil), these groups remain somewhat consistent, although there are a few significant variations. Besides the expected uniqueness of the Gly and Pro distributions, the nonpolar/β-branched and AsX clusters were very consistent across all types of secondary structure. Effectively, this consistency across the secondary structure classes implies that side-chain steric effects strongly influence a residue's backbone torsion angle conformation. These results help to explain the plasticity of amino acid substitutions on protein structure and should help in protein design and structure evaluation.
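
The pipeline summarized in the abstract (continuous φ,ψ density estimates, a divergence score between plots, then hierarchical clustering) can be illustrated with a minimal Python sketch. It is not the authors' method: the Bayesian density estimation is swapped for a simple periodic kernel estimate, the divergence score is assumed to be the Jensen-Shannon divergence, and average linkage is an assumed clustering rule. The function names, the grid size, the `kappa` bandwidth, and the four labels A to D with their randomly generated angles are all hypothetical and do not correspond to real residue data.

```python
import math
import random

GRID = 24  # cells per axis; each cell spans 360 / GRID degrees (assumed)


def density(angles, kappa=8.0):
    """Periodic kernel estimate of a phi,psi density on a GRID x GRID torus.

    Stand-in for the paper's Bayesian estimator (an assumption of this sketch).
    Each observation adds a product of two von Mises-shaped kernels, so the
    estimate wraps at +/-180 degrees. Returns flat cell probabilities.
    """
    centers = [math.radians(-180 + (i + 0.5) * 360 / GRID) for i in range(GRID)]
    cells = [0.0] * (GRID * GRID)
    for phi, psi in angles:
        p, s = math.radians(phi), math.radians(psi)
        row = [math.exp(kappa * (math.cos(c - p) - 1)) for c in centers]
        col = [math.exp(kappa * (math.cos(c - s) - 1)) for c in centers]
        for i, r in enumerate(row):
            for j, c in enumerate(col):
                cells[i * GRID + j] += r * c
    total = sum(cells)
    return [v / total for v in cells]


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log): symmetric, between 0 and ln 2."""
    def kl(a, b):
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def average_linkage(names, dist):
    """Agglomerative clustering; returns merges as (group_a, group_b, distance)."""
    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = sum(dist[a][b] for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


def sample(rng, center, n=60, spread=15.0):
    """Draw synthetic (phi, psi) pairs around a center; illustration only."""
    return [(rng.gauss(center[0], spread), rng.gauss(center[1], spread))
            for _ in range(n)]


rng = random.Random(0)
synthetic = {  # hypothetical labels, not real residues
    "A": sample(rng, (-60, -45)),
    "B": sample(rng, (-65, -40)),
    "C": sample(rng, (-120, 130)),
    "D": sample(rng, (-115, 135)),
}
dens = {k: density(v) for k, v in synthetic.items()}
dist = {a: {b: js_divergence(dens[a], dens[b]) for b in dens} for a in dens}
merges = average_linkage(list(dens), dist)
for left, right, d in merges:
    print(left, right, round(d, 3))
```

Smoothing before comparison matters here for the reason the abstract gives: with sparse data, raw bin counts leave many empty cells, and a divergence between two such histograms is dominated by where the gaps happen to fall rather than by the shape of the distributions.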

Keywords: Ramachandran plot; Torsion angles; Bayesian density estimation; Clustering; Residue backbone similarity

Abbreviations: H, helix; E, sheet; T, turn; C, coil; PDB, Protein Data Bank

Article Outline

Introduction
Results and Discussion
  Binning versus density estimation
  Density-estimated distributions
  H class
  T class
  E class
  C class
  Clustering within classes
  Clustering between classes
Conclusion
Materials and Methods
  Data set
  Structural analysis
  Data binning
  Nonparametric Bayesian density estimation
  Parameter settings
  Predictive inference
  Distance of divergence
  Clustering
Acknowledgements
References




 