Abstract
Predicting nuclear proteins is one of the major challenges in genome annotation. This paper describes NcPred, a method that predicts nuclear proteins with higher accuracy by combining n-mer statistics with different classification algorithms, namely Alternating Decision (AD) Tree, Best First (BF) Tree, Random Tree and Adaptive Boosting (AdaBoost). On the BaCelLo dataset [1], NcPred improves accuracy by about 20% with Random Tree and sensitivity by about 10% with AdaBoost for animal proteins, compared to existing techniques. It also increases the accuracy of fungal protein prediction by 20% and recall by 4% with AD Tree. For human proteins, accuracy improves by about 25% and sensitivity by about 10% with BF Tree. The performance analysis of NcPred demonstrates its advantage over contemporary in-silico nuclear protein classification methods.
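The abstract does not spell out how the n-mer statistics are computed, so the following is only a minimal sketch of one common reading: the relative frequency of each overlapping n-mer in a protein sequence. The function name `nmer_frequencies`, the default `n = 2`, and the toy sequence are illustrative assumptions, not details taken from NcPred.

```python
from collections import Counter


def nmer_frequencies(sequence: str, n: int = 2) -> dict[str, float]:
    """Return the relative frequency of each overlapping n-mer.

    Hypothetical helper for illustration; the paper's actual feature
    construction may differ (e.g. in normalisation or n-mer selection).
    """
    seq = sequence.upper()
    total = len(seq) - n + 1  # number of overlapping windows of length n
    if n < 1 or total < 1:
        return {}
    counts = Counter(seq[i:i + n] for i in range(total))
    return {nmer: count / total for nmer, count in counts.items()}


# Toy amino-acid sequence (made up for the example).
freqs = nmer_frequencies("MKKLMKK", n=2)
for nmer, freq in sorted(freqs.items()):
    print(nmer, round(freq, 3))
```

A frequency vector of this kind, computed per protein, could then be passed as features to a decision-tree or boosting classifier such as those named in the abstract.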
References
Pierleoni, A., Martelli, P., Fariselli, P., Casadio, R.: BaCelLo: a balanced subcellular localization predictor. Bioinformatics 22(14), 408–416 (2006)
Kumar, M., Verma, R., Raghvan, S.: Prediction of mitochondrial proteins using support vector machine and hidden Markov model. Int. J. of Biol. Chem. 28(19), 5357–5363 (2006)
Jassem, W., Fuggle, S., Rela, M., Koo, D., Heaton, N.: The role of mitochondria in ischemia/reperfusion injury. Transplantation 73(4), 493–499 (2002)
Ganesh, A., Kenue, R., Mitra, S.: Retinoblastoma and the 13q deletion syndrome. J. of Ped. Ophth. & Strab. 38(4), 247–250 (2001)
Alberts, B., Johnson, A., Lewis, J., Raff, M., Roberts, K., Walter, P.: Molecular Biology of Cell, 4th edn. Garland Science, New York (2000)
Reinhardt, A., Hubbard, T.: Using neural networks for prediction of the subcellular location of proteins. Nuc. Acids Res. 26(9), 2230–2236 (1998)
Emanuelson, O., Nielsen, H., Brunak, S., Heijne, G.: Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J. of Mole. Bio. 330(4), 1005–1016 (2000)
Bannai, H., Tamada, Y., Maruyama, O., Nakai, K., Miyano, S.: Extensive feature detection of N-terminal protein sorting signals. Bioinformatics 18(2), 335–338 (2002)
Marcotte, E., Xenarios, I., Bliek, A., Eisenberg, D.: Localizing proteins in the cell from their phylogenetic profiles. Proc. of Nat. Aca. of Sci. 97(12), 115–120 (2000)
Bhasin, M., Raghava, G.: ESLpred: SVM based method for subcellular localization of eukaryotic proteins using dipeptide composition and PSI-BLAST. Nuc. Acids Res., 414–419 (2004)
Garg, A., Bhasin, M., Raghva, G.: Support vector machine based method for subcellular localization of human proteins using amino acid compositions, their order and similarity search. J. of Bio. Chem. 280(14), 427–433 (2005)
Xie, D., Li, A., Wang, M., Fan, Z., Feng, H.: LOCSVMPSI: a web server for subcellular localization of eukaryotic proteins using SVM and profile of PSI-BLAST. Nuc. Acids Res. 110, 105–110 (2005)
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.: The WEKA data mining software: an update. ACM SIGKDD Explorations News 11(1), 10–18 (2009)
Makhoul, J., Kubala, F., Schwartz, R., Weischedel, R.: Performance measures for information extraction. In: Proc. of DARPA Broadcast News Workshop, pp. 249–252 (1999)
Matthews, B.: Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Bio. et bioph. acta. 405(2), 442–451 (1975)
Hutchinson, G.: The prediction of vertebrate promoter regions using differential hexamer frequency analysis. Bioinformatics 12(5), 391–398 (1996)
Chan, B., Kibler, D.: Using hexamers to predict cis-regulatory motifs in Drosophila. BMC Bioinformatics 6, 262 (2005)
Kumar, M., Raghava, G.: Prediction of nuclear proteins using SVM and HMM models. BMC Bioinformatics 10(22) (2009)
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Islam, M.S., Kabir, A., Sakib, K., Hossain, M.A. (2011). NcPred for Accurate Nuclear Protein Prediction Using n-mer Statistics with Various Classification Algorithms. In: Rocha, M.P., Rodríguez, J.M.C., Fdez-Riverola, F., Valencia, A. (eds) 5th International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB 2011). Advances in Intelligent and Soft Computing, vol 93. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19914-1_38
DOI: https://doi.org/10.1007/978-3-642-19914-1_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19913-4
Online ISBN: 978-3-642-19914-1
eBook Packages: Engineering, Engineering (R0)