Abstract
Protein structure alignment algorithms play an important role in the study of protein structure and function. In this paper, we present a novel approach to structure alignment. Core regions in two protein structures are first aligned by identifying connected components in a network of neighboring, geometrically compatible aligned fragment pairs. The initial alignments are then refined by a multi-objective optimization method. The algorithm can produce both sequential and non-sequential alignments. Computational experiments on several benchmark datasets, with comparisons against well-known structure alignment algorithms such as DALI, CE and MATT, show the superior performance of the proposed algorithm. The proposed method obtains accurate and biologically significant alignments in the presence of internal repeats or indels, identifies circular permutations, and reveals conserved functional sites. We also present a ranking criterion for fold similarity, which in the numerical experiments is comparable or superior to the Z-score of CE in most cases. The software and supplementary data of computational results are available at http://zhangroup.aporc.org/bioinfo/SANA.
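The first step described in the abstract, grouping aligned fragment pairs (AFPs) into connected components of a compatibility network, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `afp_components`, the representation of an AFP as a pair of fragment start indices, the distance-difference compatibility test, and the thresholds `max_gap` and `tol` are all assumptions made for illustration only.

```python
from itertools import combinations
from math import dist


def afp_components(afps, coords_a, coords_b, max_gap=8, tol=1.5):
    """Group AFPs into connected components of a compatibility network.

    afps: list of (i, j) pairs; fragment start i in protein A, j in protein B.
    coords_a, coords_b: lists of (x, y, z) C-alpha coordinates.
    max_gap (residues) and tol (angstroms) are hypothetical thresholds,
    not values taken from the paper.
    """
    parent = list(range(len(afps)))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for p, q in combinations(range(len(afps)), 2):
        (i1, j1), (i2, j2) = afps[p], afps[q]
        # Assumed meaning of "neighboring": starts are close in both chains.
        if abs(i1 - i2) > max_gap or abs(j1 - j2) > max_gap:
            continue
        # Assumed meaning of "geometrically compatible": the distance between
        # the two fragment starts is nearly the same in both structures.
        d_a = dist(coords_a[i1], coords_a[i2])
        d_b = dist(coords_b[j1], coords_b[j2])
        if abs(d_a - d_b) <= tol:
            parent[find(p)] = find(q)  # add an edge: merge the two components

    groups = {}
    for k, afp in enumerate(afps):
        groups.setdefault(find(k), []).append(afp)
    # Largest component first, as a candidate core alignment.
    return sorted(groups.values(), key=len, reverse=True)
```

Because the sketch places no ordering constraint on the start indices within a component, a component may pair fragments in a different order along the two chains, which is the kind of grouping a non-sequential alignment needs. The refinement by multi-objective optimization is not covered here.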
Abbreviations
- PDB: Protein data bank
- RMSD: Root mean square distance
- AFP: Aligned fragment pair
- SANA: Protein structure alignment based on sequence neighborhood alignment
References
Aung Z, Tan KL (2006) MatAlign: precise protein structure comparison by matrix alignment. J Bioinform Comput Biol 4:1197–1216
Bachar O, Fischer D, Nussinov R, Wolfson H (1993) A computer vision based technique for 3-D sequence independent structural comparison of proteins. Protein Eng 6:279–288
Betancourt MR, Skolnick J (2001) Universal similarity measure for comparing protein structures. Biopolymers 59:305–309
Bhattacharya S, Bhattacharyya C, Chandra NR (2007) Comparison of protein structures by growing neighborhood alignments. BMC Bioinformatics 8:77
Birzele F, Gewehr JE, Csaba G, Zimmer R (2007) Vorolign-fast structural alignment using Voronoi contacts. Bioinformatics 23:e205–e211
Chen L, Wu LY, Wang Y, Zhang SH, Zhang XS (2006) Revealing divergent evolution, identifying circular permutations and detecting active sites by protein structure comparison. BMC Struct Biol 6:18
Fischer D, Elofsson A, Rice DW, Eisenberg D (1996) Assessing the performance of fold recognition methods by means of a comprehensive benchmark. In: Proceedings of 1996 Pacific Symposium on Biocomputing, pp 300–318
Gazit H (1991) An optimal randomized parallel algorithm for finding connected components in a graph. SIAM J Comput 20:1046–1067
Holm L, Sander C (1993) Protein structure comparison by alignment of distance matrices. J Mol Biol 233:123–128
Krissinel E, Henrick K (2004) Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions. Acta Crystallogr Sect D 60:2256–2268
Lo WC, Lyu PC (2008) CPSARST: an efficient circular permutation search tool applied to the detection of novel protein structural relationships. Genome Biol 9:R11
Mayr G, Domingues FS, Lackner P (2007) Comparative analysis of protein structure alignments. BMC Struct Biol 7:50
Menke M, Berger B, Cowen L (2008) Matt: local flexibility aids protein multiple structure alignment. PLoS Comput Biol 4:1–12
Novotny M, Madsen D, Kleywegt GJ (2004) Evaluation of protein fold comparison servers. Proteins Struct Funct Bioinform 54:260–270
Shatsky M, Nussinov R, Wolfson H (2004) A method for simultaneous alignment of multiple protein structures. Proteins Struct Funct Bioinform 56:143–156
Shindyalov IN, Bourne PE (1998) Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. Protein Eng 11:739–747
Van Walle I, Lasters I, Wyns L (2005) SABmark—a benchmark for sequence alignment that covers the entire known fold space. Bioinformatics 21:1267–1268
Ye Y, Godzik A (2003) Flexible structure alignment by chaining aligned fragment pairs allowing twists. Bioinformatics 19:e246–e255
Acknowledgments
We are grateful to the anonymous referees for many helpful comments that greatly improved the paper. This work is partly supported by the National Natural Science Foundation of China (NSFC) under Key Research Grant No. 10631070, Research Grant No. 60503004, and the JSPS-NSFC collaborative project (No. 10711140116). This work is also partially supported by the Chief Scientist Program of Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, under Grant No. 2009CSP002.
About this article
Cite this article
Wang, L., Wu, LY., Wang, Y. et al. SANA: an algorithm for sequential and non-sequential protein structure alignment. Amino Acids 39, 417–425 (2010). https://doi.org/10.1007/s00726-009-0457-y
DOI: https://doi.org/10.1007/s00726-009-0457-y