Abstract
This article identifies protein structural classes with a support vector machine (SVM) ensemble classifier, an approach that can substantially enhance prediction performance. First, auto covariance (AC) and pseudo-amino acid (PseAA) composition were used to represent the proteins: AC focuses on the effects of neighbouring residues, and PseAA composition takes sequence-order patterns into account. Second, individual SVMs were trained on the datasets represented by the different descriptors. Finally, an ensemble classifier, constructed from the individual classifiers through a voting strategy, gave the final prediction results. A very promising prediction accuracy of 93.14% was obtained in the jackknife test. The experimental results showed that the ensemble system can greatly improve prediction performance and yield more stable and reliable predictors. The current method, which fuses the protein primary sequence information captured by AC with that described by PseAA composition, may play an important complementary role in other related applications.
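The two ideas the abstract relies on, auto covariance along a sequence and a voting ensemble, can be illustrated with a minimal sketch. It assumes one common definition of AC (the mean product of mean-centred property values at positions separated by a given lag) and a simple majority vote. The function names, the toy property profile and the class labels are illustrative assumptions, not the descriptors, parameters or voting rule actually used in the article.

```python
from collections import Counter


def auto_covariance(values, max_lag):
    """Auto covariance of one numeric property profile for lags 1..max_lag.

    Assumed definition: AC(lag) = sum_i (x_i - mean)(x_{i+lag} - mean) / (n - lag).
    """
    n = len(values)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the sequence length")
    mean = sum(values) / n
    centered = [v - mean for v in values]
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / (n - lag)
        for lag in range(1, max_lag + 1)
    ]


def majority_vote(predictions):
    """Return the most frequent label; ties go to the earliest-listed label."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Toy property values along a four-residue sequence (not a real amino acid scale).
profile = [1.0, -1.0, 1.0, -1.0]
print(auto_covariance(profile, 2))  # [-1.0, 1.0]

# Hypothetical outputs of three individual classifiers for one protein.
print(majority_vote(["all-alpha", "alpha/beta", "all-alpha"]))  # all-alpha
```

In a full pipeline each individual classifier would be an SVM trained on one descriptor set, and the vote would be taken over their predicted classes for each test protein.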
Abbreviations
- AC: Auto covariance
- SVM: Support vector machine
- PseAA: Pseudo-amino acid
Acknowledgments
This work was supported by the National Natural Science Foundation of China (No. 20775052).
About this article
Cite this article
Wu, J., Li, ML., Yu, LZ. et al. An Ensemble Classifier of Support Vector Machines Used to Predict Protein Structural Classes by Fusing Auto Covariance and Pseudo-Amino Acid Composition. Protein J 29, 62–67 (2010). https://doi.org/10.1007/s10930-009-9222-z
DOI: https://doi.org/10.1007/s10930-009-9222-z