Abstract
Support Vector Machines (SVMs) are widely known as an efficient supervised learning model for classification problems. However, the success of an SVM classifier depends on an appropriate choice of its parameters as well as on the structure of the data. The aim of this research is therefore to optimize the parameters and the feature weights simultaneously in order to strengthen SVMs. We propose a novel hybrid model that combines genetic algorithms (GAs) and SVMs for feature weighting and parameter optimization, which we call the GA-SVM model. Our GA is designed with a special direction-based crossover operator. Experiments were conducted on several real-world datasets using the proposed model and Grid Search, a traditional method of searching for optimal parameters. The results show that the GA-SVM model achieves a significant improvement in classification performance over Grid Search on all the datasets. In terms of accuracy, our method is competitive with some state-of-the-art techniques for feature selection and feature weighting.
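To make the idea of optimizing SVM parameters and feature weights together more concrete, the following is a minimal sketch of how a single GA individual might encode both. The chromosome layout, the search ranges, and the names `decode`, `weighted_rbf`, and `random_chromosome` are illustrative assumptions, not the encoding used in the paper, and the direction-based crossover operator is not shown because the abstract does not describe it.

```python
import math
import random
from typing import List, Sequence, Tuple


def decode(chromosome: Sequence[float]) -> Tuple[float, float, List[float]]:
    """Split a real-valued chromosome into (C, gamma, feature weights).

    Assumed layout (hypothetical): [log2(C), log2(gamma), w_1, ..., w_d].
    """
    log2_c, log2_gamma, *weights = chromosome
    return 2.0 ** log2_c, 2.0 ** log2_gamma, list(weights)


def weighted_rbf(x: Sequence[float], z: Sequence[float],
                 gamma: float, weights: Sequence[float]) -> float:
    """RBF kernel computed after scaling each feature by its weight.

    A weight of 0 removes a feature entirely; a larger weight makes
    differences in that feature count more.
    """
    sq_dist = sum((w * (a - b)) ** 2 for w, a, b in zip(weights, x, z))
    return math.exp(-gamma * sq_dist)


def random_chromosome(n_features: int, rng: random.Random) -> List[float]:
    """Draw one individual: two log-scale SVM parameters plus d weights.

    The exponent ranges below are assumptions chosen for illustration.
    """
    params = [rng.uniform(-5.0, 15.0), rng.uniform(-15.0, 3.0)]
    weights = [rng.random() for _ in range(n_features)]  # each in [0, 1)
    return params + weights


# Usage: decode an individual and evaluate the kernel on two samples.
C, gamma, w = decode([0.0, -1.0, 1.0, 0.0])
similarity = weighted_rbf([1.0, 2.0], [1.0, 5.0], gamma, w)
```

In a full GA loop, the fitness of each individual would typically be the cross-validated accuracy of an SVM trained with the decoded `C`, `gamma`, and weights; that step is omitted here to keep the sketch dependency-free.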
Acknowledgments
This work was partly funded by JSPS KAKENHI grant number 3050941, by the Ministry of Training and Education (MOET), Vietnam, under Project 911, and by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.01-2015.12.
Cite this article
Phan, A.V., Nguyen, M.L. & Bui, L.T. Feature weighting and SVM parameters optimization based on genetic algorithms for classification problems. Appl Intell 46, 455–469 (2017). https://doi.org/10.1007/s10489-016-0843-6
DOI: https://doi.org/10.1007/s10489-016-0843-6