PDB2Graph: A toolbox for identifying critical amino acids map in proteins based on graph theory☆
Introduction
The network paradigm has been gaining popularity in the study of biological systems such as metabolic networks, gene networks, and protein–protein interaction networks, owing to its simplicity of representation and the variety of methods available for quantitatively describing a system’s behavior and architecture.
Proteins, as complex systems, have also benefited significantly from the network notion and from the methods available in graph theory. Given the integrative nature of proteins and the cooperative behavior of a large fraction of residues in function at some level, it is relevant to consider the whole structure as a completely connected network [1]. A Protein Structure Network (PSN) reduces the 3D structure of a protein to a two-dimensional graph, a model in which nodes denote structural units (atoms, amino acids, or secondary structural elements) and edges denote the contacts between them. By combining topological information with global connectivity, the network perspective offers an opportunity to better understand how various topological and global parameters relate to the generation of folds, dynamics and function in proteins [2], [3]. PSNs have contributed substantially to the study of several challenging problems in structural biology, such as protein stability, intra- and inter-molecular communication, identification of key residues involved in catalytic processes and protein folding, mapping of allosteric pathways, protein folding kinetics, and protein dynamics [1], [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15].
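As a minimal sketch of the residue-level PSN idea described above, the following Python snippet links two residues whenever their Cα atoms lie within a distance cutoff. The function name `build_contact_graph`, the 7.0 Å cutoff and the toy coordinates are illustrative assumptions, not parameters taken from PDB2Graph.

```python
import math

def build_contact_graph(ca_coords, cutoff=7.0):
    """Build an undirected residue contact graph.

    ca_coords: dict mapping a residue label to its C-alpha (x, y, z) position.
    cutoff: distance threshold in angstroms (an assumed value, not PDB2Graph's).
    Returns a dict mapping each residue label to the set of its neighbours.
    """
    labels = list(ca_coords)
    graph = {label: set() for label in labels}
    for i, a in enumerate(labels):
        for b in labels[i + 1:]:
            # Residues are in contact when their C-alpha atoms are close enough.
            if math.dist(ca_coords[a], ca_coords[b]) <= cutoff:
                graph[a].add(b)
                graph[b].add(a)
    return graph

# Hypothetical coordinates for four residues (not from a real PDB entry).
toy = {
    "ALA1": (0.0, 0.0, 0.0),
    "GLY2": (3.8, 0.0, 0.0),
    "SER3": (7.6, 0.0, 0.0),
    "LEU4": (3.8, 3.8, 0.0),
}
graph = build_contact_graph(toy)
```

In this toy example ALA1 and SER3 are 7.6 Å apart, so they are the only pair left unconnected. A real workflow would read the coordinates from a PDB file and might define contacts on other atoms or with a different cutoff.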
Identifying critical nodes in the network helps to clarify how a node’s topological position affects its cross-talk in the network and what direct consequences this has for the functional features of the protein. Hubs (nodes with the highest number of links in the network) and distance- or position-based central nodes are located in regions of the PSN that correlate strongly with functionally and structurally critical points of the protein. The small-world organization and unusually high assortativity values of PSNs are therefore important determinants that delineate the key role of topology in the efficiency, accuracy and rapidity of information transfer in proteins [16].
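The two notions of critical node mentioned above, hubs and distance-based central nodes, can be illustrated with a short Python sketch. It computes degree and closeness centrality on a hypothetical four-node star graph. The function names are illustrative, and closeness is computed here over the nodes reachable from each source, which is one of several common conventions and not necessarily the one PDB2Graph uses.

```python
from collections import deque

def degree_centrality(graph):
    """Number of links per node; hubs are the nodes with the highest values."""
    return {node: len(neighbours) for node, neighbours in graph.items()}

def closeness_centrality(graph):
    """Closeness = (reachable nodes) / (sum of shortest-path lengths to them)."""
    scores = {}
    for source in graph:
        # Breadth-first search gives shortest path lengths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores

# Hypothetical star-shaped graph: node "B" touches every other node.
toy_graph = {"A": {"B"}, "B": {"A", "C", "D"}, "C": {"B"}, "D": {"B"}}
degree = degree_centrality(toy_graph)
closeness = closeness_centrality(toy_graph)
hub = max(degree, key=degree.get)
```

In this toy graph node "B" is both the hub (degree 3) and the most central node by closeness, since it reaches every other node in one step.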
Several tools and web interfaces are available for biological network investigation on different platforms. Depending on their design goals, these tools apply different methods for generating graphs and different formats for exporting them, represent graphs in various layouts, and calculate numerous network features [17], [18], [19]. While each of these tools provides useful information about protein structure and topology, its output format may not be supported by other software for further analysis. Additionally, exploring and learning different software packages is time-consuming for non-expert users.
Although there are a number of useful tools in Matlab for exploring and topologically analyzing metabolic networks [20], genomic networks [21], [22] and sequences [23], there is no comprehensive tool exclusively devoted to protein structure network studies. Our objective is to offer a stand-alone, complementary graphical toolbox that can construct different graph types, convert between standard formats, perform topological analyses, compute statistical features of graphs, and produce high-quality graph visualizations, while remaining easy to use for users without programming expertise. To meet this need, PDB2Graph was designed as an interactive, user-friendly GUI for PSN construction, visualization, and analysis. Moreover, because it converts between different graph formats, it enables further analysis with other packages. Based on the calculated centrality indices of the network, the toolbox also offers a practical guideline for selecting suitable amino acids for mutagenesis-based methods in protein engineering.
Program overview
Matlab is a high-level language intended for interactive programming, visualization and technical computing. Its other advantages, such as applied functions and toolboxes, simple syntax, powerful graphics and a sophisticated plotting library, make it a convenient and popular environment for handling and manipulating biological data. PDB2Graph is a graphical interface written in Matlab. It uses some of the built-in functions of Matlab Bioinformatics Toolbox (The
Results and discussion
Several papers have demonstrated the remarkable capability of topological and global features to identify critical nodes in a network, as well as the close correspondence between a node's high centrality and its functionality in biological networks [12], [30], [32], [33], [34], [35].
To verify the essentiality and importance of central nodes in the PSN and their contribution to the robustness and functionality of protein structure, four model proteins with comprehensive experimental data were
Conclusion
We introduced PDB2Graph as a user-friendly toolbox that not only supports the construction and visualization of different kinds of PSN but can also be used to uncover some structural and functional properties of proteins. Another application of this toolbox is the identification of amino acids that are important for maintaining the structural and functional integrity of the protein, based on network centrality indices. Several studies have demonstrated that central amino acids are associated with key amino
Author disclosure statement
The authors of this manuscript declare that no competing financial interests exist.
Conflict of interest
The authors declare that they do not have any financial, professional or personal conflict of interest.
Summary
In this paper, we introduce PDB2Graph, a comprehensive, user-friendly toolbox for protein structure network (PSN) construction, visualization and analysis. In this interactive interface, a PDB file serves as input. The output is an undirected, coarse-grained, distance-based graph that can be exported in the standard graph formats of popular packages such as Cytoscape (.sif), Pajek (.net) and UCINET (.dl). These different formats enable users to take advantage of other packages for complementary analyses.
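To illustrate what exporting an undirected residue graph to two of these formats might look like, here is a minimal Python sketch that serializes an edge list as Cytoscape SIF text and as Pajek .net text. The function names, the `contact` interaction label and the example edges are assumptions made for illustration. This is not PDB2Graph's exporter, and the UCINET .dl format is left out.

```python
def to_sif(edges, interaction="contact"):
    """Cytoscape SIF: one 'source interaction target' line per edge."""
    return "\n".join(f"{a} {interaction} {b}" for a, b in edges)

def to_pajek(edges):
    """Pajek .net: a numbered, labelled vertex list followed by an edge list."""
    nodes = sorted({node for edge in edges for node in edge})
    # Pajek vertices are referenced by 1-based integer index.
    index = {node: i for i, node in enumerate(nodes, start=1)}
    lines = [f"*Vertices {len(nodes)}"]
    lines += [f'{index[node]} "{node}"' for node in nodes]
    lines.append("*Edges")  # '*Edges' marks undirected links
    lines += [f"{index[a]} {index[b]}" for a, b in edges]
    return "\n".join(lines)

# Hypothetical residue contacts.
edges = [("ALA1", "GLY2"), ("GLY2", "SER3")]
sif_text = to_sif(edges)
pajek_text = to_pajek(edges)
```

Writing `sif_text` to a file with a .sif extension or `pajek_text` to a file with a .net extension should give input that the respective packages can load, although isolated nodes and edge weights would need extra handling.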
References (52)
- et al. The protein folding network. J. Mol. Biol. (2004)
- et al. Network analysis of protein dynamics. FEBS Lett. (2007)
- et al. Computational approaches to mapping allosteric pathways. Curr. Opin. Struct. Biol. (2014)
- et al. Analyzing and visualizing residue networks of protein structures. Trends Biochem. Sci. (2011)
- A human protein-protein interaction network: a resource for annotating the proteome. Cell (2005)
- et al. Network analysis of protein structures identifies functional residues. J. Mol. Biol. (2004)
- Creative elements: network-based predictions of active centres in proteins and cellular and social networks. Trends Biochem. Sci. (2008)
- et al. A network representation of protein structures: implications for protein stability. Biophys. J. (2005)
- et al. Intra-and interdomain effects due to mutation of calcium-binding sites in calmodulin. J. Biol. Chem. (2010)
- et al. A coupled equilibrium shift mechanism in calmodulin-mediated signal transduction. Structure (2008)
- Systematic mutation of bacteriophage T4 lysozyme. J. Mol. Biol.
- Similar hydrophobic replacements of Leu99 and Phe153 within the core of T4 lysozyme have different structural and thermodynamic consequences. J. Mol. Biol.
- Crystal structures of ribonuclease HI active site mutants from Escherichia coli. J. Biol. Chem.
- Protein folding and stability: the pathway of folding of barnase. FEBS Lett.
- The construction of an amino acid network for understanding protein structure and function. Amino Acids
- Proteins as networks: usefulness of graph theory in protein science. Curr. Protein Pept. Sci.
- Intra and Inter-molecular communications through protein structure network. Curr. Protein Pept. Sci.
- Novel feature for catalytic protein residues reflecting interactions with other residues. PLoS One
- Topological determinants of protein folding. Proc. Natl. Acad. Sci. USA
- Efficient identification of critical residues based only on protein structure by network analysis. PLoS One
- Protein flexibility predictions using graph theory. Proteins: Struct. Funct. Bioinform.
- Low-frequency normal modes that describe allosteric transitions in biological nanomachines are robust to sequence variations. Proc. Natl. Acad. Sci. USA
- Residues crucial for maintaining short paths in network communication mediate signaling in proteins. Mol. Syst. Biol.
- Detection of allosteric signal transmission by information-theoretic analysis of protein dynamics. FASEB J.
- Assortative mixing in protein contact networks and protein folding kinetics. Bioinformatics
☆ Availability: The source code and user guide of PDB2Graph are available free for academic users at: http://bioinf.modares.ac.ir/software/pdb2graph