Copyright © 2008 Elsevier Ltd All rights reserved.
Assessing Side-Chain Perturbations of the Protein Backbone: A Knowledge-Based Classification of Residue Ramachandran Space
Received 8 November 2007;
Abstract
Grouping the 20 residues is a classic strategy to discover ordered patterns and insights about the fundamental nature of proteins, their structure, and how they fold. Usually, this categorization is based on the biophysical and/or structural properties of a residue's side-chain group. We extend this approach to understand the effects of side chains on backbone conformation and to perform a knowledge-based classification of amino acids by comparing their backbone φ,ψ distributions in different types of secondary structure. At this finer, more specific resolution, torsion angle data are often sparse and discontinuous (especially for nonhelical classes) even though a comprehensive set of protein structures is used. To ensure the precision of Ramachandran plot comparisons, we applied a rigorous Bayesian density estimation method that produces continuous estimates of the backbone φ,ψ distributions. Based on this statistical modeling, a robust hierarchical clustering was performed using a divergence score to measure the similarity between plots. There were seven general groups based on the clusters from the complete Ramachandran data: nonpolar/β-branched (Ile and Val), AsX (Asn and Asp), long (Met, Gln, Arg, Glu, Lys, and Leu), aromatic (Phe, Tyr, His, and Cys), small (Ala and Ser), bulky (Thr and Trp), and, lastly, the singletons of Gly and Pro. At the level of secondary structure (helix, sheet, turn, and coil), these groups remain somewhat consistent, although there are a few significant variations. Besides the expected uniqueness of the Gly and Pro distributions, the nonpolar/β-branched and AsX clusters were very consistent across all types of secondary structure. Effectively, this consistency across the secondary structure classes implies that side-chain steric effects strongly influence a residue's backbone torsion angle conformation. These results help to explain the plasticity of amino acid substitutions on protein structure and should help in protein design and structure evaluation.
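The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the divergence score or the linkage rule, so the Jensen-Shannon divergence and average linkage used here are assumptions, and the five toy distributions over four grid cells are invented placeholders rather than estimated φ,ψ densities.

```python
import math
from itertools import combinations


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def js_divergence(p, q):
    """Jensen-Shannon divergence (bits) between two discrete distributions.

    Assumption: the abstract only mentions "a divergence score"; JS is used
    here because it is symmetric and finite.
    """
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def average_linkage(dists, names, n_clusters):
    """Merge the two closest clusters (mean pairwise distance) until
    only n_clusters remain. Average linkage is an assumption."""
    clusters = [frozenset([name]) for name in names]

    def mean_dist(a, b):
        total = sum(dists[frozenset((x, y))] for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2), key=lambda pair: mean_dist(*pair))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


# Hypothetical toy densities on a 4-cell grid (NOT data from the article).
toy = {
    "Ile": normalize([8, 1, 1, 0.1]),
    "Val": normalize([7, 2, 1, 0.1]),
    "Asn": normalize([2, 2, 5, 3]),
    "Asp": normalize([2, 3, 5, 2]),
    "Gly": normalize([1, 1, 1, 8]),
}

dists = {
    frozenset(pair): js_divergence(toy[pair[0]], toy[pair[1]])
    for pair in combinations(toy, 2)
}
clusters = average_linkage(dists, list(toy), n_clusters=3)
```

In practice each distribution would be a density estimate evaluated on a fine φ,ψ grid, and a full dendrogram would be inspected rather than a fixed cluster count.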
Keywords: Ramachandran plot; Torsion angles; Bayesian density estimation; Clustering; Residue backbone similarity
Abbreviations: H, helix; E, sheet; T, turn; C, coil; PDB, Protein Data Bank
Article Outline
- Introduction
- Results and Discussion
- Binning versus density estimation
- Density-estimated distributions
- Clustering within classes
- Clustering between classes
- Conclusion
- Materials and Methods
- Data set
- Structural analysis
- Data binning
- Nonparametric Bayesian density estimation
- Parameter settings
- Predictive inference
- Distance of divergence
- Clustering
- Acknowledgements
- References







2.0 Å), non-homologous, protein crystal structures. The backbone dihedral angles (φ, ψ) of the terminating residue (T) were found to cluster either in the left-handed helical region (αL: φ = 20° to 125° and ψ = −45° to 90°; 469 helices (44%)) or in the extended region (E: φ = −180° to −30° and ψ = 60° to 180° and −180° to −150°; 459 helices (43%)) of the Ramachandran map. These two broad categories of helix stop signals, αL- and E-terminated helices, were further examined for sequence preferences. Gly residues were found to have an overwhelming preference to occur as the “αL-terminator (T)”, resulting in the classical Schellman motif, with a strong preference for hydrophobic residues at positions T − 4 and T + 1. In the case of E-terminated helices, His, Asn, Leu and Phe were found to occur with high propensity at position T. Quite remarkably, Pro residues, with a single exception, were absent at position T, but had the highest propensity at position T + 1. Examination of the frequencies of hydrophobic (h) and polar (p) residues at positions flanking Gly/Pro permitted delineation of exclusive patterns and predictive rules for Gly-terminated helices and Pro-terminated helices. The analysis reveals that Pro residues flanked by polar amino acids have a very strong tendency to terminate helices. Examination of a segment ranging from T − 4 to T + 3 appeared to be necessary to determine whether helix termination or continuation occurs at Gly residues. The two types of helix termination signals (αL and E) also differed dramatically in their solvent accessibility. Gly and Pro residues at helix termini appeared to be strongly conserved in homologous sequences.
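The two terminator regions quoted above can be written as a small classifier. This is only a sketch: the function name `classify_terminator` is hypothetical, and treating the quoted range limits as inclusive is an assumption, since the text does not say how boundary values were handled.

```python
def classify_terminator(phi, psi):
    """Assign backbone angles (degrees) of a helix-terminating residue
    to the alphaL region, the extended (E) region, or neither.

    Ranges are taken from the text; inclusive bounds are an assumption.
    """
    # Left-handed helical region: phi 20..125, psi -45..90
    if 20 <= phi <= 125 and -45 <= psi <= 90:
        return "alphaL"
    # Extended region: phi -180..-30, psi 60..180 or -180..-150
    # (the psi range wraps around the +/-180 boundary)
    if -180 <= phi <= -30 and (60 <= psi <= 180 or -180 <= psi <= -150):
        return "E"
    return "other"


examples = [(60, 40), (-120, 130), (-120, -170), (-60, -45)]
labels = [classify_terminator(phi, psi) for phi, psi in examples]
```

The last example pair, typical of a right-handed helix, falls in neither region, which matches the text's observation that the two categories together cover most but not all terminating residues.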




