Abstract
Allergy is an overreaction of the immune system to a previously encountered, ordinarily harmless substance, typically a protein, resulting in skin rash, swelling of mucous membranes, sneezing or wheezing, or other abnormal conditions. Modified proteins are increasingly widespread: their presence in food, in commercial products such as washing powder, and in medical therapeutics and diagnostics makes predicting and identifying potential allergens a crucial societal issue. Allergen prediction has been explored widely using bioinformatics, and many tools have been developed in the last decade, a number of which are freely available online. Here, we report a set of novel models for allergen prediction that use amino acid E-descriptors, auto- and cross-covariance transformation, and several machine learning methods for classification: logistic regression (LR), decision tree (DT), naïve Bayes (NB), random forest (RF), multilayer perceptron (MLP) and k nearest neighbours (kNN). The best performing method was kNN, with 85.3% accuracy in 5-fold cross-validation. The resulting model has been implemented in a revised version of the AllerTOP server (http://www.ddg-pharmfac.net/AllerTOP).
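As an illustration of the pipeline named in the abstract, the sketch below shows one common form of the auto- and cross-covariance (ACC) transformation, which turns a protein of any length into a fixed-length vector, followed by a plain kNN vote. It is a minimal sketch under stated assumptions, not the AllerTOP implementation: the descriptor table holds random placeholder numbers rather than the published E-descriptor values, and the names `acc_transform` and `knn_predict`, the maximum lag of 5, the Euclidean distance and the value of `k` are all illustrative choices that the abstract does not specify.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_DESCRIPTORS = 5  # the E-descriptor scale has five components (E1-E5) per residue

# PLACEHOLDER values, NOT the published E-descriptors: seeded random numbers
# stand in so that the sketch runs without an external data table.
_rng = random.Random(0)
PLACEHOLDER_E = {
    aa: [_rng.uniform(-1.0, 1.0) for _ in range(N_DESCRIPTORS)]
    for aa in AMINO_ACIDS
}


def acc_transform(sequence, table=PLACEHOLDER_E, max_lag=5):
    """Return a fixed-length ACC vector for a protein sequence.

    ACC_jk(lag) = sum_i E_j(i) * E_k(i + lag) / (n - lag)
    j == k gives the auto-covariance terms, j != k the cross-covariance terms.
    """
    rows = [table[aa] for aa in sequence]
    n = len(rows)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    features = []
    for lag in range(1, max_lag + 1):
        for j in range(N_DESCRIPTORS):
            for k in range(N_DESCRIPTORS):
                total = sum(rows[i][j] * rows[i + lag][k] for i in range(n - lag))
                features.append(total / (n - lag))
    return features


def knn_predict(query, training, k=3):
    """Majority vote among the k training vectors nearest to query.

    training is a list of (feature_vector, label) pairs; distance is
    squared Euclidean, and ties go to the label of the nearer neighbour.
    """
    ranked = sorted(
        training,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(query, item[0])),
    )
    votes = [label for _, label in ranked[:k]]
    return Counter(votes).most_common(1)[0][0]
```

With five descriptors and five lags, every sequence maps to 5 × 5 × 5 = 125 features regardless of its length, which is what lets a distance-based classifier such as kNN compare proteins of different sizes.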
Acknowledgments
We acknowledge the Bulgarian Science Fund for financial support (Grants DCVNP 02-1/2009 and IO1/7).
Additional information
This paper belongs to Topical Collection MIB 2013 (Modeling Interactions in Biomolecules VI)
About this article
Cite this article
Dimitrov, I., Bangov, I., Flower, D.R. et al. AllerTOP v.2—a server for in silico prediction of allergens. J Mol Model 20, 2278 (2014). https://doi.org/10.1007/s00894-014-2278-5
DOI: https://doi.org/10.1007/s00894-014-2278-5