Abstract
Antifreeze proteins (AFPs) are indispensable for living organisms to survive in an extremely cold environment and have a variety of potential biotechnological applications. The accurate prediction of antifreeze proteins has become an important issue and is urgently needed. Although considerable progress has been made, AFP prediction is still a challenging problem due to the diversity of species. In this study, we proposed a new sequence-based AFP predictor, called TargetFreeze. TargetFreeze utilizes an enhanced feature representation method that weightedly combines multiple protein features and takes the powerful support vector machine as the prediction engine. Computer experiments on benchmark datasets demonstrate the superiority of the proposed TargetFreeze over most recently released AFP predictors. We also implemented a user-friendly web server, which is openly accessible for academic use and is available at http://csbio.njust.edu.cn/bioinf/TargetFreeze. TargetFreeze supplements existing AFP predictors and will have potential applications in AFP-related biotechnology fields.


Similar content being viewed by others
Explore related subjects
Discover the latest articles and news from researchers in related subjects, suggested using machine learning.References
Ahmad S, Gromiha MM et al (2004) Analysis and prediction of DNA-binding proteins and their binding residues based on composition, sequence and structural information. Bioinformatics 20:477–486
Block RJ, Bolling D (1951) The amino acid composition of proteins and foods. Analytical methods and results. Charles C Thomas, Springfield
Breiman L (2001) Random forests. Mach Learn 45:5–32
Breton G, Danyluk J et al (2000) Biotechnological applications of plant freezing associated proteins. Biotechnol Annu Rev 6:59–101
Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 27:1–27
Chen W, Feng PM et al (2013) iRSpot-PseDNC: identify recombination spots with pseudo dinucleotide composition. Nucleic Acids Res 41:e68
Chen W, Feng P-M et al (2014) iTIS-PseTNC: a sequence-based predictor for identifying translation initiation site in human genes using pseudo trinucleotide composition. Anal Biochem 462:76–83
Chou K-C (1992) Energy-optimized structure of antifreeze protein and its binding mechanism. J Mol Biol 223:509–517
Chou K (2001a) Using subsite coupling to predict signal peptides. Protein Eng 14:75–79
Chou KC (2001b) Prediction of protein cellular attributes using pseudo-amino acid composition. Proteins 43:246–255
Chou K-C (2011) Some remarks on protein attribute prediction and pseudo amino acid composition. J Theor Biol 273:236–247
Chou KC (2013) Some remarks on predicting multi-label attributes in molecular biosystems. Mol Biosyst 9:1092–1100
Davies PL, Hew CL (1990) Biochemistry of fish antifreeze proteins. FASEB J 4:2460–2468
Dehzangi A, Heffernan R et al (2015) Gram-positive and Gram-negative protein subcellular localization by incorporating evolutionary-based descriptors into Chou's general PseAAC. J Theor Biol 364:284–294
Ding H, Deng E-Z et al (2014) iCTX-type: A sequence-based predictor for identifying the types of conotoxins in targeting ion channels. BioMed Res Int. doi:10.1155/2014/286419
Fan RE, Chen PH et al (2005) Working set selection using second order information for training SVM. J Mach Learn Res 6:1889–1918
Feeney RE, Yeh Y (1998) Antifreeze proteins: current status and possible food uses. Trends Food Sci Technol 9:102–106
Fletcher GL, Hew CL et al (2001) Antifreeze proteins of teleost fishes. Annu Rev Physiol 63:359–390
Graham LA, Lougheed SC et al (2008) Lateral transfer of a lectin-like antifreeze protein gene in fishes. PLoS One 3:e2616
Griffith M, Ewart KV (1995) Antifreeze proteins and their potential use in frozen foods. Biotechnol Adv 13:375–402
Griffith M, Yaish MW (2004) Antifreeze proteins in overwintering plants: a tale of two activities. Trends Plant Sci 9:399–405
Guo SH, Deng EZ et al (2014) iNuc-PseKNC: a sequence-based predictor for predicting nucleosome positioning in genomes with pseudo k-tuple nucleotide composition. Bioinformatics 30:1522–1529
Huang C, Yuan J-Q (2013) A multilabel model based on Chou’s pseudo-amino acid composition for identifying membrane proteins with both single and multiple functional types. J Membr Biol 246:327–334
Huang W-L, Tung C-W et al (2009) Predicting protein subnuclear localization using GO-amino-acid composition features. Biosystems 98:73–79
Jahandideh S, Mahdavi A (2012) RFCRYS: sequence-based protein crystallization propensity prediction by means of random forest. J Theor Biol 306:115–119
Jia Z, Davies PL (2002) Antifreeze proteins: an unusual receptor–ligand interaction. Trends Biochem Sci 27:101–106
Kandaswamy KK, Chou K-C et al (2011) AFP-Pred: a random forest approach for predicting antifreeze proteins from sequence-derived properties. J Theor Biol 270:56–62
Kecman V (2001) Learning and soft computing: support vector machines, neural networks, and fuzzy logic models. MIT press, Cambridge
Khan ZU, Hayat M et al (2015) Discrimination of acidic and alkaline enzyme using Chou’s pseudo amino acid composition in conjunction with probabilistic neural network model. J Theor Biol 365:197–203
Kim S-K (2013) Marine proteins and peptides: biological activities and applications. Wiley, Chichester
Kohavi R, John GH (1997) Wrappers for feature subset selection. Artif Intell 97:273–324
Levitt J (1980) Responses of plants to environmental stresses, vol II., Water, radiation, salt, and other stressesAcademic Press, New York
Li W, Godzik A (2006) Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22:1658–1659
Liaw A, Wiener M (2002) Classification and regression by random forest. R News 2:18–22
Lin WZ, Fang JA et al (2013) iLoc-Animal: a multi-label learning classifier for predicting subcellular localization of animal proteins. Mol Biosyst 4:634–644
Lin H, Deng E-Z et al (2014) iPro54-PseKNC: a sequence-based predictor for identifying sigma-54 promoters in prokaryote with pseudo k-tuple nucleotide composition. Nucleic Acids Res 42:12961–12972
Liu T, Geng X et al (2012) Accurate prediction of protein structural class using auto covariance transformation of PSI-BLAST profiles. Amino Acids 42:2243–2249
Liu B, Xu J et al (2014) iDNA-Prot|dis: Identifying DNA-binding proteins by incorporating amino acid distance-pairs and reduced alphabet profile into the general pseudo amino acid composition. PLoS One 9:e106691
Liu B, Fang L et al (2015) Identification of real microRNA precursors with a pseudo structure status composition approach. PLoS One 10:e0121501
Mandal M, Mukhopadhyay A et al (2015) Prediction of protein subcellular localization by incorporating multiobjective PSO-based feature subset selection into the general form of Chou’s PseAAC. Med Biol Eng Compu 53:331–344
Mondal S, Pai PP (2014) Chou's pseudo amino acid composition improves sequence-based antifreeze protein prediction. J Theor Biol 356:30–35
Roy S, Martinez D et al (2009) Exploiting amino acid composition for predicting protein-protein interactions. PLoS One 4:e7813
Schäffer AA, Aravind L et al (2001) Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res 29:2994–3005
Sformo T, Kohl F et al (2009) Simultaneous freeze tolerance and avoidance in individual fungus gnats, Exechia nugatoria. J Comp Physiol B 179:897–902
Shen HB, Chou KC (2008) PseAAC: a flexible web server for generating various kinds of protein pseudo amino acid composition. Anal Biochem 373:386–388
Sonnhammer EL, Eddy SR et al (1997) Pfam: a comprehensive database of protein domain families based on seed alignments. Proteins 28:405–420
Vapnik VN (1998) Statistical learning theory. Wiley, New York
Wold S, Jonsson J et al (1993) DNA and peptide sequences and chemical processes multivariately modelled by principal component analysis and partial least-squares projections to latent structures. Anal Chim Acta 277:239–253
Xiao X, Wang P et al (2013) iAMP-2L: a two-level multi-label classifier for identifying antimicrobial peptides and their functional types. Anal Biochem 436:168–177
Xu Y, Shao XJ, Wu LY, Deng NY, Chou KC (2013) iSNO-AAPair: incorporating amino acid pairwise coupling into PseAAC for predicting cysteine S-nitrosylation sites in proteins. PeerJ 1:e171
Xu Y, Wen X et al (2014) iNitro-Tyr: prediction of nitrotyrosine sites in proteins with general pseudo amino acid composition. PLoS One 9:e105018
Yu C-S, Lu C-H (2011) Identification of antifreeze proteins and their functional residues by support vector machine and genetic algorithms based on n-peptide compositions. PLoS One 6:e20445
Yu D, Wu X et al (2012) Enhancing membrane protein subcellular localization prediction by parallel fusion of multi-view features. IEEE Trans Nanobioscience 11:375–385
Yu D-J, Hu J et al (2013) Learning protein multi-view features in complex space. Amino Acids 44:1365–1379
Zhao X, Ma Z et al (2012) Using support vector machine and evolutionary profiles to predict antifreeze protein sequences. Int J Mol Sci 13:2196–2207
Zou H-L (2014) A multi-label classifier for prediction membrane protein functional types in animal. J Membr Biol 247:1141–1148
Acknowledgments
This work was supported by the National Natural Science Foundation of China (No. 61373062, 61222306, 61202134, and 61233011), the Natural Science Foundation of Jiangsu (No. BK20141403), Jiangsu Postdoctoral Science Foundation (No. 1201027C), the Jiangsu University Graduate Research and Innovation Project (No. KYZZ_0123), the China Postdoctoral Science Foundation (No. 2013M530260, 2014T70526), “The Six Top Talents” of Jiangsu Province (No. 2013-XXRJ-022), the Fundamental Research Funds for the Central Universities (No. 30920130111010).
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
He, X., Han, K., Hu, J. et al. TargetFreeze: Identifying Antifreeze Proteins via a Combination of Weights using Sequence Evolutionary Information and Pseudo Amino Acid Composition. J Membrane Biol 248, 1005–1014 (2015). https://doi.org/10.1007/s00232-015-9811-z
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00232-015-9811-z