Feature weighting and SVM parameters optimization based on genetic algorithms for classification problems

Applied Intelligence

Abstract

Support Vector Machines (SVMs) are widely known as an efficient supervised learning model for classification problems. However, the success of an SVM classifier depends on a proper choice of its parameters as well as on the structure of the data. The aim of this research is therefore to optimize the parameters and the feature weights simultaneously in order to strengthen SVMs. We propose a novel hybrid model, a combination of genetic algorithms (GAs) and SVMs, for feature weighting and parameter optimization to solve classification problems efficiently. We call it the GA-SVM model. Our GA is designed with a special direction-based crossover operator. Experiments were conducted on several real-world datasets using the proposed model and Grid Search, a traditional method of searching for optimal parameters. The results show that the GA-SVM model significantly improves classification performance on all the datasets in comparison with Grid Search. In terms of accuracy, our method is competitive with some state-of-the-art techniques for feature selection and feature weighting.
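The idea of searching SVM parameters and feature weights together can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the direction-based crossover, so the operator below (a step from the fitter parent away from the weaker one) is an assumption, as are the chromosome layout (log2 C, log2 gamma, then one weight per feature), the search bounds, the elitism and mutation settings, and all function names. The fitness function is a toy stand-in; in practice it would be something like the cross-validated accuracy of an SVM trained on the weighted features.

```python
import random
from typing import Callable, List, Sequence, Tuple

Chromosome = List[float]
Bounds = Sequence[Tuple[float, float]]


def clip(genes: Chromosome, bounds: Bounds) -> Chromosome:
    """Keep every gene inside its (low, high) interval."""
    return [min(max(g, lo), hi) for g, (lo, hi) in zip(genes, bounds)]


def direction_crossover(better: Chromosome, worse: Chromosome,
                        bounds: Bounds, rng: random.Random) -> Chromosome:
    """Assumed operator: move from the fitter parent away from the weaker one."""
    step = rng.random()
    child = [b + step * (b - w) for b, w in zip(better, worse)]
    return clip(child, bounds)


def mutate(genes: Chromosome, bounds: Bounds, rate: float,
           rng: random.Random) -> Chromosome:
    """Resample each gene uniformly within its bounds with probability `rate`."""
    return [rng.uniform(lo, hi) if rng.random() < rate else g
            for g, (lo, hi) in zip(genes, bounds)]


def run_ga(fitness: Callable[[Chromosome], float], bounds: Bounds,
           pop_size: int = 30, generations: int = 40,
           mutation_rate: float = 0.1, seed: int = 0) -> Chromosome:
    """Maximize `fitness` over real-valued chromosomes; returns the best one."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        next_gen = ranked[:2]                 # elitism: keep the two best
        parents = ranked[:pop_size // 2]      # truncation selection
        while len(next_gen) < pop_size:
            a, b = rng.sample(parents, 2)
            better, worse = (a, b) if fitness(a) >= fitness(b) else (b, a)
            child = direction_crossover(better, worse, bounds, rng)
            next_gen.append(mutate(child, bounds, mutation_rate, rng))
        population = next_gen
    return max(population, key=fitness)


def toy_fitness(genes: Chromosome) -> float:
    """Stand-in for cross-validated SVM accuracy (hypothetical optimum)."""
    target = [3.0, -7.0, 1.0, 0.0, 0.5]
    return -sum((g - t) ** 2 for g, t in zip(genes, target))


# Genes: log2(C), log2(gamma), then three feature weights in [0, 1].
BOUNDS = [(-5.0, 15.0), (-15.0, 3.0)] + [(0.0, 1.0)] * 3
best = run_ga(toy_fitness, BOUNDS)
print("best chromosome:", [round(g, 3) for g in best])
```

Encoding C and gamma on a log2 scale mirrors the exponentially spaced grids commonly used for Grid Search, which lets both methods explore the same parameter range. Swapping `toy_fitness` for a real evaluation only requires a function that maps a chromosome to a score to be maximized.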

Notes

  1. http://www.cs.waikato.ac.nz/ml/weka

Acknowledgments

This work was partly funded by JSPS KAKENHI grant number 3050941; by the Ministry of Education and Training (MOET), Vietnam, under Project 911; and by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.01-2015.12.

Author information

Corresponding author

Correspondence to Minh Le Nguyen.

About this article

Cite this article

Phan, A.V., Nguyen, M.L. & Bui, L.T. Feature weighting and SVM parameters optimization based on genetic algorithms for classification problems. Appl Intell 46, 455–469 (2017). https://doi.org/10.1007/s10489-016-0843-6

  • DOI: https://doi.org/10.1007/s10489-016-0843-6
