Abstract
Proteins are large organic compounds made of amino acids arranged in a linear chain (primary structure). Most proteins fold into unique three-dimensional (3D) structures, interchangeably called tertiary, folded, or native structures. Discovering the tertiary structure of a protein (the protein folding problem) can provide important clues about how the protein performs its function, and it is one of the most important problems in bioinformatics. A contact map of a given protein P is a binary matrix M such that $M_{i,j} = 1$ if and only if the physical distance between amino acids i and j in the native structure is less than or equal to a pre-assigned threshold t. The contact map of each protein is a distinctive signature of its folded structure. Predicting the tertiary structure of a protein directly from its primary structure is a very complex and still unsolved problem. An alternative and probably more feasible approach is to predict the contact map of a protein from its primary structure and then to compute the tertiary structure starting from the predicted contact map. This last problem has recently been proven to be NP-hard [6]. In this paper we give a heuristic method that is able to reconstruct in a few seconds a 3D model that exactly matches the target contact map. We wish to emphasize that our method computes an exact model for the protein independently of the contact map threshold. To our knowledge, our method outperforms all other techniques in the literature [5,10,17,19] both in the quality of the provided solutions and in running time. Our experimental results are obtained on a non-redundant data set consisting of 1760 proteins, which is by far the largest benchmark set used so far. Average running times range from 3 to 15 seconds depending on the contact map threshold and on the size of the protein.
Repeated applications of our method (starting from randomly chosen distinct initial solutions) show that the same contact map may admit (depending on the threshold) quite different 3D models. Extensive experimental results show that contact map thresholds ranging from 10 to 18 Ångström allow the reconstruction of 3D models that are very similar to the protein's native structure. Our heuristic is freely available for testing on the web at the following URL: http://vassura.web.cs.unibo.it/cmap23d/
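The contact map definition above can be illustrated with a minimal sketch. It assumes each amino acid is represented by a single 3D point (the abstract does not say which atom is used), and the function names `contact_map` and `mismatches` are hypothetical, chosen for illustration only. This is not the reconstruction heuristic described in the paper; it only shows how a map is derived from coordinates and how a candidate model can be checked against a target map.

```python
import math

def contact_map(coords, t):
    """Return the binary contact map of a list of 3D points for threshold t."""
    n = len(coords)
    m = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Contact iff the Euclidean distance is at most t.
            if math.dist(coords[i], coords[j]) <= t:
                m[i][j] = m[j][i] = 1
    return m

def mismatches(model_coords, target_map, t):
    """Count pairs (i < j) where the model's contact map differs from the target."""
    model_map = contact_map(model_coords, t)
    n = len(target_map)
    return sum(
        model_map[i][j] != target_map[i][j]
        for i in range(n)
        for j in range(i + 1, n)
    )

# Toy coordinates (not real protein data): four points on a line, 4 units apart.
native = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
target = contact_map(native, t=8.0)

# A mirror image has the same pairwise distances, hence the same contact map.
mirrored = [(-x, y, z) for (x, y, z) in native]

# A bent chain brings the two ends into contact, so it violates the target map.
bent = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (4.0, 4.0, 0.0), (0.0, 4.0, 0.0)]

print(mismatches(mirrored, target, 8.0))
print(mismatches(bent, target, 8.0))
```

A model "exactly matches" the target map in this sketch when `mismatches` returns zero. The mirrored model shows, in the simplest possible case, why one contact map can be consistent with more than one 3D model.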
References
Altschul, S.F., et al.: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25(17), 3389–3402 (1997)
Andreeva, A., et al.: SCOP database in 2004: refinements integrate structure and sequence family data. Nucleic Acids Res. 32(Database issue), D226–D229 (2004)
Bartoli, L., et al.: The pros and cons of predicting protein contact maps.
Blumenthal, L.M.: Theory and applications of distance geometry. Chelsea, New York (1970)
Bohr, J., et al.: Protein structures from distance inequalities. J. Mol. Biol. 231, 861–869 (1993)
Breu, H., Kirkpatrick, D.G.: Unit disk graph recognition is NP-hard. Computational Geometry 9, 3–24 (1998)
Cormen, T., Leiserson, C.E., Rivest, R.L.: Introduction to algorithms, 2nd edn. MIT Press, Cambridge (2001)
Crippen, G.M., Havel, T.F.: Distance geometry and molecular conformation. John Wiley & Sons, Chichester (1988)
Fariselli, P., et al.: Progress in predicting inter-residue contacts of proteins with neural networks and correlated mutations. Proteins 45(Suppl. 5), 157–162 (2001)
Galaktionov, S.G., Marshall, G.R.: Properties of intraglobular contacts in proteins: an approach to prediction of tertiary structure. In: System Sciences, 1994, Vol.V: Proceedings of the Twenty-Seventh Hawaii International Conference on Biotechnology Computing, vol. 5, 4-7 Jan. 1994, pp. 326–335 (1994)
de Groot, B.L., et al.: Prediction of protein conformational freedom from distance constraints. Proteins 29, 240–251 (1997)
Havel, T.F.: Distance Geometry: Theory, Algorithms, and Chemical Applications. In: The Encyclopedia of Computational Chemistry (1998)
Lesk, A.: Introduction to Bioinformatics. Oxford University Press, Oxford (2006)
Margara, L., et al.: Reconstruction of the Protein Structures from Contact Maps. Technical report UBLCS-2006-24, University of Bologna, Department of Computer Science (October 2006)
Moré, J., Wu, Z.: ε-Optimal solutions to distance geometry problems via global continuation. In: Pardalos, P.M., Shalloway, D., Xue, G. (eds.) Global Minimization of Nonconvex Energy Functions: Molecular Conformation and Protein Folding, pp. 151–168. American Mathematical Society (1995)
Moré, J., Wu, Z.: Distance geometry optimization for protein structures. Journal of Global Optimization 15, 219–234 (1999)
Pollastri, G., et al.: Modular DAG-RNN Architectures for Assembling Coarse Protein Structures. J. Comp. Biol. 13(3), 631–650 (2006)
Saxe, J.B.: Embeddability of weighted graphs in k-space is strongly NP-hard. In: Proc. 17th Allerton Conf. Commun. Control Comput., pp. 480–489 (1979)
Vendruscolo, M., Kussell, E., Domany, E.: Recovery of protein structure from contact maps. Folding and Design 2(5), 295–306 (1997)
Vendruscolo, M., Domany, E.: Protein folding using contact maps. Vitam. Horm. 58, 171–212 (2000)
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vassura, M., Margara, L., Medri, F., di Lena, P., Fariselli, P., Casadio, R. (2007). Reconstruction of 3D Structures from Protein Contact Maps. In: Măndoiu, I., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2007. Lecture Notes in Computer Science, vol 4463. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72031-7_53
DOI: https://doi.org/10.1007/978-3-540-72031-7_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72030-0
Online ISBN: 978-3-540-72031-7
eBook Packages: Computer Science, Computer Science (R0)