PDB2Graph: A toolbox for identifying critical amino acids map in proteins based on graph theory

https://doi.org/10.1016/j.compbiomed.2016.03.012

Abstract

The integrative and cooperative nature of protein structure calls for the assessment of both topological and global features of its constituent parts. The network concept takes full advantage of both properties in a single analysis. High compatibility with structural concepts and physicochemical properties, together with a remarkable simplification of the system, has made the network an ideal tool for exploring biological systems. There are numerous examples in which protein structural and functional characteristics have been clarified by the network approach. Here, we present an interactive and user-friendly Matlab-based toolbox, PDB2Graph, devoted to protein structure network construction, visualization, and analysis. Moreover, PDB2Graph is an appropriate tool for identifying critical nodes involved in protein structural robustness and function based on centrality indices. It maps critical amino acids in protein networks and can greatly aid structural biologists in selecting suitable amino acid candidates for manipulating protein structures in a more rational manner.

To illustrate the capability and efficiency of PDB2Graph in detail, the structural modification of calmodulin through allosteric binding of Ca2+ is considered. In addition, a mutational analysis of three well-characterized model proteins, phage T4 lysozyme, barnase and ribonuclease HI, was performed to inspect the influence of mutating important central residues on protein activity.

Introduction

The network paradigm has been gaining popularity in the study of biological systems such as metabolic networks, gene networks and protein–protein interaction networks, owing to its simplicity of representation and the variety of methods available to describe a system's behavior and architecture quantitatively.

Proteins, as complex systems, have also benefited significantly from the network notion and the methods available in graph theory. Given the integrative nature of proteins and the cooperative behavior of a large fraction of residues in function, it is reasonable to treat the whole structure as a completely connected network [1]. A Protein Structure Network (PSN) reduces the 3D structure of the protein to a two-dimensional graph in which nodes denote structural units (atoms, amino acids, or secondary structural elements) and edges denote contacts between them. By combining topological information with global connectivity, the network perspective makes it possible to better understand how various topological and global parameters relate to the generation of folds, dynamics and function in proteins [2], [3]. PSNs have contributed substantially to the study of several challenging problems in structural biology, such as protein stability, intra- and inter-molecular communication, identification of key residues involved in catalysis and protein folding, mapping of allosteric pathways, protein folding kinetics and protein dynamics [1], [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15].
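The construction of a residue-level PSN can be illustrated with a minimal Python sketch. It is not the PDB2Graph implementation (which is written in Matlab); it assumes that each amino acid is represented by its Cα atom and that two residues are in contact when their Cα atoms lie within a cutoff distance. The 7.0 Å cutoff, the function names and the `atom_line` helper used to build demo input are illustrative assumptions, not values taken from the paper.

```python
import math


def parse_ca_atoms(pdb_text):
    """Return [(label, (x, y, z)), ...] for every C-alpha ATOM record.

    Uses the fixed-column layout of PDB ATOM records.
    """
    residues = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            label = f"{line[17:20].strip()}{line[22:26].strip()}:{line[21]}"
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            residues.append((label, xyz))
    return residues


def contact_graph(residues, cutoff=7.0):
    """Build an undirected graph as {label: set(neighbour labels)}.

    cutoff is in angstroms; 7.0 is an assumed, illustrative value.
    """
    adjacency = {label: set() for label, _ in residues}
    for i, (a, xyz_a) in enumerate(residues):
        for b, xyz_b in residues[i + 1:]:
            if math.dist(xyz_a, xyz_b) <= cutoff:
                adjacency[a].add(b)
                adjacency[b].add(a)
    return adjacency


def atom_line(serial, res_name, chain, res_seq, x, y, z):
    """Hypothetical helper: format one C-alpha ATOM record for the demo."""
    return (f"ATOM  {serial:>5}  CA  {res_name:>3} {chain}{res_seq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Demo input: four residues spaced 3.8 angstroms apart along the x axis.
demo_pdb = "\n".join([
    atom_line(1, "ALA", "A", 1, 0.0, 0.0, 0.0),
    atom_line(2, "GLY", "A", 2, 3.8, 0.0, 0.0),
    atom_line(3, "SER", "A", 3, 7.6, 0.0, 0.0),
    atom_line(4, "LYS", "A", 4, 11.4, 0.0, 0.0),
])
graph = contact_graph(parse_ca_atoms(demo_pdb))
```

With this toy input only sequence neighbours fall within the cutoff, so the graph is a simple chain. In a folded protein, residues that are distant in sequence but close in space would also be linked, which is what gives the PSN its informative topology.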

Identifying critical nodes in the network helps to clarify how a node's topological position affects its cross-talk in the network and what the direct consequences are for the functional features of the protein. Hubs (nodes with the most links in the network) and distance- or position-based central nodes lie in regions of the PSN that correlate strongly with functionally and structurally critical points of the protein. Therefore, the small-world organization and unusually high assortativity values of PSNs are important determinants that delineate the key role of topology in the efficiency, accuracy and speed of information transfer in proteins [16].
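As a rough illustration of how hubs and distance-based central nodes can be ranked, the following Python sketch computes three standard centrality indices on an unweighted, undirected graph: degree, closeness and betweenness (the latter with Brandes' algorithm). This excerpt does not list which indices PDB2Graph computes, so the choice of indices, the function names and the toy graph are assumptions made for illustration only.

```python
from collections import deque


def degree_centrality(adj):
    """Number of links per node; the highest-degree nodes are the hubs."""
    return {v: len(nbrs) for v, nbrs in adj.items()}


def closeness_centrality(adj):
    """Reachable-node count divided by the sum of shortest-path lengths."""
    result = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search from s
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[s] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness_centrality(adj):
    """Unnormalised shortest-path betweenness (Brandes' algorithm)."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        order = []                       # nodes in order of discovery
        preds = {v: [] for v in adj}     # shortest-path predecessors
        sigma = dict.fromkeys(adj, 0)    # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):        # accumulate dependencies backwards
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: value / 2 for v, value in bc.items()}


# Toy graph: a four-node chain A - B - C - D.
toy = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
degree = degree_centrality(toy)
closeness = closeness_centrality(toy)
betweenness = betweenness_centrality(toy)
```

In the toy chain the two inner nodes score highest on all three indices, mirroring the idea that residues occupying central positions in the PSN are the natural candidates to inspect first.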

Several tools and web interfaces are available on different platforms for investigating biological networks. Depending on their design goals, these tools apply different methods for generating graphs and different formats for exporting them, represent the graph in various layouts, and calculate numerous features of the network [17], [18], [19]. While each of these tools provides useful information about protein structure and topology, its output format may not be supported by other software for further analyses. Additionally, exploring and practicing with different software is time-consuming for non-expert users.

Although there are a number of useful Matlab tools for the exploration and topological analysis of metabolic networks [20], genomic networks [21], [22] and sequences [23], there is no comprehensive tool devoted exclusively to protein structure network studies. Our objective is to offer a stand-alone, complementary graphical toolbox that can construct different graph types, convert standard formats into one another, perform topological analyses, compute statistical features of graphs and produce high-quality graph visualizations, while remaining easy enough to use for users without programming expertise. To meet this need, PDB2Graph was designed as an interactive and user-friendly GUI for PSN construction, visualization, and analysis. Moreover, because it converts between different graph formats, it enables further analysis with other packages. Based on the centrality indices of the network, the toolbox also offers a practical guideline for selecting suitable amino acids for mutagenesis-based methods in protein engineering.

Program overview

Matlab is a high-level language intended for interactive programming, visualization and technical computing. Further advantages, such as its applied functions and toolboxes, simple syntax, powerful graphics and sophisticated plotting library, make Matlab a convenient and popular environment for handling and manipulating biological data. PDB2Graph is a graphical interface written in Matlab. It uses some of the built-in functions of Matlab Bioinformatics Toolbox (The

Results and discussion

Several papers have demonstrated the remarkable capability of topological and global features to identify critical nodes in a network, as well as the close correspondence between the high centrality of a node and its functionality in biological networks [12], [30], [32], [33], [34], [35].

To verify the essentiality and importance of central nodes in the PSN and their contribution to protein structure robustness and functionality, four model proteins with comprehensive, experimental data were

Conclusion

We introduced PDB2Graph as a user-friendly toolbox which not only supports construction and visualization of different kinds of PSN but also can be used to uncover some structural and functional properties of the proteins. The other application of this toolbox is identification of important amino acids for maintaining structural and functional integrity of the protein based on network centrality indices. Several studies have demonstrated that central amino acids are associated with key amino

Author disclosure statement

The authors of this manuscript declare that no competing financial interests exist.

Conflict of interest

The authors declare that they do not have any financial, professional or personal conflict of interest.

Summary

In this paper, we introduce PDB2Graph, a comprehensive, user-friendly toolbox for protein structure network (PSN) construction, visualization and analysis. In this interactive interface, a PDB file serves as input. The output is an undirected, coarse-grained, distance-based graph which can be exported in the standard graph formats of popular packages such as Cytoscape (.sif), Pajek (.net) and UCINET (.dl). Supporting several formats enables users to take advantage of other packages for complementary analyses.
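To make the export step concrete, the sketch below serializes an undirected graph to two of the formats named above: Cytoscape's simple interaction format (.sif) and a basic Pajek (.net) file. It is an illustrative Python sketch, not the toolbox's own exporter: the function names, the `contact` interaction label and the residue labels are assumptions, and the exact files PDB2Graph writes may differ. The UCINET (.dl) format is omitted.

```python
def undirected_edges(adjacency):
    """Return each undirected edge once, as a sorted list of (a, b) pairs."""
    edges = set()
    for a, neighbours in adjacency.items():
        for b in neighbours:
            edges.add(tuple(sorted((a, b))))
    return sorted(edges)


def to_sif(adjacency, interaction="contact"):
    """Serialize to SIF: one 'source <tab> type <tab> target' line per edge.

    The interaction label is an assumed placeholder. Nodes without any
    edge are written on a line of their own.
    """
    lines = [f"{a}\t{interaction}\t{b}" for a, b in undirected_edges(adjacency)]
    lines += [node for node in sorted(adjacency) if not adjacency[node]]
    return "\n".join(lines) + "\n"


def to_pajek(adjacency):
    """Serialize to a basic Pajek .net file with 1-based vertex numbers."""
    nodes = sorted(adjacency)
    index = {node: i for i, node in enumerate(nodes, start=1)}
    lines = [f"*Vertices {len(nodes)}"]
    lines += [f'{index[node]} "{node}"' for node in nodes]
    lines.append("*Edges")
    lines += [f"{index[a]} {index[b]}" for a, b in undirected_edges(adjacency)]
    return "\n".join(lines) + "\n"


# Toy graph with hypothetical residue labels; LYS4 has no contacts.
example = {
    "ALA1": {"GLY2"},
    "GLY2": {"ALA1", "SER3"},
    "SER3": {"GLY2"},
    "LYS4": set(),
}
sif_text = to_sif(example)
pajek_text = to_pajek(example)
```

Writing `sif_text` to a file with a `.sif` extension and `pajek_text` to one with a `.net` extension should give files that the respective packages can load, which is the kind of interoperability the summary describes.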

References (52)

  • D. Rennell et al.

    Systematic mutation of bacteriophage T4 lysozyme

    J. Mol. Biol.

    (1991)
  • A. Eriksson et al.

    Similar hydrophobic replacements of Leu99 and Phe153 within the core of T4 lysozyme have different structural and thermodynamic consequences

    J. Mol. Biol.

    (1993)
  • K. Katayanagi et al.

    Crystal structures of ribonuclease HI active site mutants from Escherichia coli

    J. Biol. Chem.

    (1993)
  • A.R. Fersht

    Protein folding and stability: the pathway of folding of barnase

    FEBS Lett.

    (1993)
  • S. Vishveshvara et al.

    A network representation of protein structures: implications for protein stability

    Biophys. J.

    (2005)
  • W. Yan et al.

    The construction of an amino acid network for understanding protein structure and function

    Amino Acids

    (2014)
  • A. Giuliani et al.

    Proteins as networks: usefulness of graph theory in protein science

    Curr. Protein Pept. Sci.

    (2008)
  • S. Vishveshwara et al.

    Intra and Inter-molecular communications through protein structure network

    Curr. Protein Pept. Sci.

    (2009)
  • Y. Li et al.

    Novel feature for catalytic protein residues reflecting interactions with other residues

    Plos One

    (2011)
  • N.V. Dokholyan et al.

    Topological determinants of protein folding

    Proc. Natl. Acad. Sci. USA

    (2002)
  • M.P. Cusack et al.

    Efficient identification of critical residues based only on protein structure by network analysis

    Plos One

    (2007)
  • D.J. Jacobs et al.

    Protein flexibility predictions using graph theory

    Proteins: Struct. Funct. Bioinform.

    (2001)
  • W. Zheng et al.

    Low-frequency normal modes that describe allosteric transitions in biological nanomachines are robust to sequence variations

    Proc. Natl. Acad. Sci. USA

    (2006)
  • A. del Sol et al.

    Residues crucial for maintaining short paths in network communication mediate signaling in proteins

    Mol. Syst. Biol.

    (2006)
  • A. Pandini et al.

    Detection of allosteric signal transmission by information-theoretic analysis of protein dynamics

    FASEB J.

    (2012)
  • G. Bagler et al.

    Assortative mixing in protein contact networks and protein folding kinetics

    Bioinformatics

    (2007)

    Availability: The source code and user guide of PDB2Graph are available free for academic users at: http://bioinf.modares.ac.ir/software/pdb2graph
