TargetFreeze: Identifying Antifreeze Proteins via a Combination of Weights using Sequence Evolutionary Information and Pseudo Amino Acid Composition

The Journal of Membrane Biology

Abstract

Antifreeze proteins (AFPs) are indispensable for organisms that must survive in extremely cold environments, and they have a variety of potential biotechnological applications. Accurate prediction of AFPs has therefore become an important and pressing problem. Although considerable progress has been made, AFP prediction remains challenging because of the diversity of species. In this study, we propose a new sequence-based AFP predictor called TargetFreeze. TargetFreeze uses an enhanced feature representation method that combines multiple protein features with weights and takes a support vector machine as the prediction engine. Computational experiments on benchmark datasets demonstrate the superiority of TargetFreeze over most recently released AFP predictors. We also implemented a user-friendly web server, which is openly accessible for academic use at http://csbio.njust.edu.cn/bioinf/TargetFreeze. TargetFreeze supplements existing AFP predictors and has potential applications in AFP-related biotechnology fields.
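
The abstract does not give the exact weighting scheme, so the following is only a minimal sketch of what a weighted combination of two sequence-derived feature vectors could look like. The function names, the weight value, the example sequence, and the placeholder evolutionary vector are all assumptions made for illustration and are not taken from the paper; plain amino acid composition stands in for the pseudo amino acid composition, and a constant vector stands in for features derived from an evolutionary profile.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return the relative frequency of each standard amino acid."""
    counts = Counter(sequence.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def weighted_combine(features_a, features_b, weight):
    """Concatenate two feature vectors, scaled by weight and 1 - weight.

    This serial weighted concatenation is an assumed scheme, chosen
    only to illustrate the idea of combining features with weights.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    scaled_a = [weight * x for x in features_a]
    scaled_b = [(1.0 - weight) * x for x in features_b]
    return scaled_a + scaled_b


# Hypothetical inputs: a made-up sequence and a placeholder vector that
# stands in for features derived from an evolutionary profile.
sequence = "MKAAVLTLAVLFLTGSQA"
composition = amino_acid_composition(sequence)
evolutionary = [0.5] * 20
combined = weighted_combine(composition, evolutionary, weight=0.3)
```

In a full pipeline, a vector like `combined` would be computed for every protein in the benchmark dataset and passed to a support vector machine, with the weight treated as a parameter to tune on held-out data.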

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Nos. 61373062, 61222306, 61202134, and 61233011), the Natural Science Foundation of Jiangsu (No. BK20141403), the Jiangsu Postdoctoral Science Foundation (No. 1201027C), the Jiangsu University Graduate Research and Innovation Project (No. KYZZ_0123), the China Postdoctoral Science Foundation (Nos. 2013M530260 and 2014T70526), “The Six Top Talents” of Jiangsu Province (No. 2013-XXRJ-022), and the Fundamental Research Funds for the Central Universities (No. 30920130111010).

Author information

Corresponding author

Correspondence to Dong-Jun Yu.

About this article

Cite this article

He, X., Han, K., Hu, J. et al. TargetFreeze: Identifying Antifreeze Proteins via a Combination of Weights using Sequence Evolutionary Information and Pseudo Amino Acid Composition. J Membrane Biol 248, 1005–1014 (2015). https://doi.org/10.1007/s00232-015-9811-z

  • DOI: https://doi.org/10.1007/s00232-015-9811-z
