Prediction of G-protein-coupled receptor classes based on the concept of Chou’s pseudo amino acid composition: An approach from discrete wavelet transform

https://doi.org/10.1016/j.ab.2009.04.009

Abstract

As the largest family of cell surface receptors, G-protein-coupled receptors (GPCRs) are among the most frequent drug targets. The functions of many GPCRs are unknown, and determining their ligands and signaling pathways experimentally is both time-consuming and expensive. It is therefore of great practical significance to develop an automated and reliable method for classifying GPCRs. In this study, a novel method based on the concept of Chou’s pseudo amino acid composition has been developed for predicting and recognizing GPCRs. The discrete wavelet transform was used to extract feature vectors from the hydrophobicity scales of amino acids; these vectors were used to construct the pseudo amino acid (PseAA) composition on which a support vector machine was trained. The prediction accuracies of the current method for the major families of GPCRs, the subfamilies of class A, and the types of amine receptors were 99.72%, 97.64%, and 99.20%, respectively, a 9.4% to 18.0% improvement over other existing methods, indicating that the proposed method is a useful automated tool for identifying GPCRs.
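The feature construction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the Kyte-Doolittle hydropathy index, the Haar wavelet, the three decomposition levels, the four summary statistics per band, and the names `haar_step`, `summarize`, and `pseaa_features` are all assumptions made for the example. The point is only that a sequence of any length is mapped to a fixed-length vector that a support vector machine could be trained on.

```python
import math

# Kyte-Doolittle hydropathy index. Which hydrophobicity scale the paper
# actually used is not stated in this excerpt; this choice is an assumption.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def haar_step(signal):
    """One level of the Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2:  # pad odd-length signals by repeating the last value
        signal = signal + [signal[-1]]
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def summarize(coeffs):
    """Max, min, mean, and standard deviation of one coefficient band."""
    mean = sum(coeffs) / len(coeffs)
    var = sum((c - mean) ** 2 for c in coeffs) / len(coeffs)
    return [max(coeffs), min(coeffs), mean, math.sqrt(var)]


def pseaa_features(sequence, levels=3):
    """20 composition terms followed by 4 statistics per wavelet band."""
    seq = [aa for aa in sequence.upper() if aa in KD]
    composition = [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]
    approx = [KD[aa] for aa in seq]  # sequence -> hydrophobicity signal
    wavelet_part = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        wavelet_part += summarize(detail)
    wavelet_part += summarize(approx)  # coarsest approximation band
    return composition + wavelet_part


# Hypothetical input strings of different lengths, used only for illustration.
example = "MNGTEGPNFYVPFSNKTGVVRSPFEAPQYY"
features = pseaa_features(example)
other = pseaa_features("ACDEFGHIKLMNPQRSTVWYACDEFGHIKLMNPQRSTVWYAC")
print(len(features), len(other))
```

With three levels the vector has 20 + 4 × 3 + 4 = 36 entries regardless of sequence length, which is what allows sequences of different lengths to share one classifier input space.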

Section snippets

Data sets

Three data sets were used in this work. The first data set contains 1238 GPCR sequences that can be classified into three major families: 1103 class A–rhodopsin like, 84 class B–secretin like, and 51 class C–metabotropic/glutamate/pheromone [13]. The average sequence identity percentages for classes A, B, and C are 18.05%, 22.67%, and 26.94%, respectively [13]. The second data set used to recognize the subfamilies of class A–rhodopsin like was generated by Strope and Moriyama [42]. It contains

Selecting wavelet functions

Wavelets fall into different families according to their basis functions; each family has properties suited to different kinds of signals and therefore yields different results. Because the characteristics of the analyzing wavelet control the performance of the wavelet transform (WT), the better the analyzing wavelet matches the underlying structure in the signal, the more concise and sparse the WT representation. It has been clearly stated that the amount of signal compression and the reconstruction quality are highly dependent on the
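The link between wavelet choice and sparsity can be illustrated with a toy comparison. The sketch below is an assumption-laden example, not the comparison carried out in the paper: it uses the Haar and Daubechies db2 filters, a single decomposition level, periodic boundary extension, and two synthetic signals (a step and a ramp), and the helper names `detail_coefficients` and `count_significant` are hypothetical. It counts how many detail coefficients remain significantly nonzero for each wavelet and signal pair.

```python
import math

SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)

# Low-pass (scaling) filters; the matching high-pass filter is derived below.
HAAR = [1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]
DB2 = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM,
       (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]


def detail_coefficients(signal, lowpass):
    """One-level detail band, using periodic extension at the boundary."""
    n = len(signal)
    taps = len(lowpass)
    # Quadrature mirror relation: g[k] = (-1)^k * h[taps - 1 - k]
    highpass = [(-1) ** k * lowpass[taps - 1 - k] for k in range(taps)]
    return [
        sum(highpass[k] * signal[(2 * i + k) % n] for k in range(taps))
        for i in range(n // 2)
    ]


def count_significant(coeffs, tol=1e-9):
    """Number of coefficients whose magnitude exceeds the tolerance."""
    return sum(1 for c in coeffs if abs(c) > tol)


# Synthetic signals chosen for illustration only.
step = [0.0] * 8 + [1.0] * 8           # piecewise constant
ramp = [float(i) for i in range(16)]   # linear

counts = {
    ("step", "haar"): count_significant(detail_coefficients(step, HAAR)),
    ("step", "db2"): count_significant(detail_coefficients(step, DB2)),
    ("ramp", "haar"): count_significant(detail_coefficients(ramp, HAAR)),
    ("ramp", "db2"): count_significant(detail_coefficients(ramp, DB2)),
}
for key, value in counts.items():
    print(key, value)
```

In this toy setting the Haar wavelet leaves no significant detail coefficients on the piecewise-constant signal, whereas db2, which has two vanishing moments, leaves only the single boundary coefficient on the ramp. Each wavelet gives the sparser representation on the signal whose structure it matches.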

Conclusion

In this work, a novel method has been proposed for the prediction of GPCRs by coupling a support vector machine (SVM) with the discrete wavelet transform (DWT). The results demonstrate that the WT can reduce the dimension of the input vector, improve computational efficiency, and effectively extract information important for classification. Compared with previously published methods, the predictive performance was significantly enhanced, indicating that the current method is an effective tool for the prediction of GPCRs. The establishment of such a

Acknowledgments

This work was supported by grants from the National Natural Science Foundation of China (20605010, 20865003, and 20805023), the Jiangxi Province Natural Science Foundation (2007JZH2644), and the Opening Foundation of State Key Laboratory of Chem/Biosensing and Chemometrics of Hunan University (2006022).

References (71)

  • K.C. Chou

    Low-frequency motions in protein molecules: β-Sheet and β-barrel

    Biophys. J.

    (1985)
  • A. Kandaswamy et al.

    Neural classification of lung sounds using wavelet coefficients

    Comput. Biol. Med.

    (2004)
  • K.C. Chou et al.

    Recent progress in protein subcellular location prediction

    Anal. Biochem.

    (2007)
  • B. Rost et al.

    Prediction of protein secondary structure at better than 70% accuracy

    J. Mol. Biol.

    (1993)
  • V.I. Lim

    Algorithms for prediction of α-helical, β-structural regions in globular proteins

    J. Mol. Biol.

    (1974)
  • P.K. Ponnuswamy et al.

    Hydrophobic packing, spatial arrangement of amino acid residues in globular proteins

    Biochim. Biophys. Acta

    (1980)
  • R.M. Sweet et al.

    Correlation of sequence hydrophobicities measures similarity in three-dimensional protein structure

    J. Mol. Biol.

    (1983)
  • J. Kyte et al.

    A simple method for displaying the hydropathic character of a protein

    J. Mol. Biol.

    (1982)
  • K.T. Attwood et al.

    Deriving structural and functional insights from a ligand-based hierarchical classification of G protein-coupled receptors

    Protein Eng.

    (2002)
  • F. Horn et al.

    GPCRDB: an information system for G protein-coupled receptors

    Nucleic Acids Res.

    (1998)
  • D.C. Teller et al.

    Advances in determination of a high-resolution three-dimensional structure of rhodopsin, a model of G-protein-coupled receptors

    Biochemistry

    (2001)
  • N. Vaidehi et al.

    Prediction of structure and function of G protein-coupled receptors

    Proc. Natl. Acad. Sci. USA

    (2002)
  • S.F. Altschul et al.

    Gapped BLAST and PSI–BLAST: a new generation of protein database search programs

    Nucleic Acids Res.

    (1997)
  • L.P. Miller et al.

    Parallel computation and FASTA: confronting the problem of parallel database search for a fast sequence comparison algorithm

    Bioinformatics

    (1991)
  • M. Lapinsh et al.

    Classification of G-protein coupled receptors by alignment-independent extraction of principal chemical properties of primary amino acid sequences

    Protein Sci.

    (2002)
  • M.I. Sadowski et al.

    Automated generation and refinement of protein signatures: case study with G-protein coupled receptors

    Bioinformatics

    (2003)
  • Y. Yabuki et al.

    GRIFFIN: a system for predicting GPCR–G-protein coupling selectivity using a support vector machine and a hidden Markov model

    Nucleic Acids Res.

    (2005)
  • K.C. Chou et al.

    Bioinformatical analysis of G-protein-coupled receptors

    J. Proteome Res.

    (2002)
  • D.W. Elrod et al.

    A study on the correlation of G-protein coupled receptor types with amino acid composition

    Protein Eng.

    (2002)
  • K.C. Chou

    Prediction of G-protein-coupled receptor classes

    J. Proteome Res.

    (2005)
  • R. Karchin et al.

    Classifying G-protein coupled receptors with support vector machines

    Bioinformatics

    (2002)
  • M. Bhasin et al.

    GPCRsclass: a web tool for the classification of amine type of G protein-coupled receptors

    Nucleic Acids Res.

    (2005)
  • P.K. Papasaikas et al.

    PRED–GPCR: GPCR recognition and family classification server

    Nucleic Acids Res.

    (2004)
  • A. Zien et al.

    Engineering support vector machine kernels that recognize translation initiation sites

    Bioinformatics

    (2000)
  • Y.D. Cai et al.

    Support vector machine for predicting membrane protein types by incorporating quasi-sequence-order effect

    Internet Electron. J. Mol. Des.

    (2002)