
Filter-based feature selection in the context of evolutionary neural networks in supervised machine learning

  • Theoretical advances
  • Published in: Pattern Analysis and Applications

Abstract

This paper presents a workbench for obtaining simple neural classification models based on evolutionary product-unit networks, with prior data preparation at the attribute level by means of filter-based feature selection. As a result, the computation required to build a classifier is shorter than for a full model trained without data pre-processing. This matters because evolutionary neural models are stochastic, so several classifiers trained with different seeds are needed to obtain reliable results. Feature selection is one of the most common pre-processing techniques for any kind of learning task. Six filters have been tested to assess the proposal, using fourteen difficult (binary and multi-class) classification data sets from the University of California, Irvine (UCI) repository as the test bed. An empirical study compares the evolutionary neural network models obtained with and without feature selection. The results, contrasted with nonparametric statistical tests, show that the current proposal significantly improves the test accuracy of the previous models. Moreover, the current proposal is much more efficient than the previous methodology, with an average time reduction above 40%. Our approach has also been compared with several classifiers, both with and without feature selection, to illustrate the performance of the different filters considered. Lastly, a statistical analysis for each feature selector provides a pairwise comparison between machine learning algorithms.
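Conceptually, the workflow the abstract describes can be sketched in a few lines. The sketch below is illustrative only, not the authors' implementation: scikit-learn's MLPClassifier stands in for the evolutionary product-unit network, mutual information for one of the six filters, and the Wilcoxon test for the paper's nonparametric analysis; the data set, k=10, and the ten seeds are assumptions made for the example.

```python
# Minimal sketch (assumed, not the paper's code): fit a filter-based
# feature selector on the training data only, then train a stochastic
# classifier with and without the filter across several seeds and
# compare test accuracy and training time.
import time

import numpy as np
from scipy.stats import wilcoxon
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_breast_cancer(return_X_y=True)  # stand-in for a UCI data set
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

def run(seed, selector=None):
    """Train one stochastic model; return (test accuracy, fit time in s)."""
    Xtr, Xte = X_tr, X_te
    if selector is not None:
        Xtr = selector.fit_transform(X_tr, y_tr)  # filter sees training data only
        Xte = selector.transform(X_te)
    clf = MLPClassifier(hidden_layer_sizes=(10,), max_iter=500, random_state=seed)
    t0 = time.perf_counter()
    clf.fit(Xtr, y_tr)
    return clf.score(Xte, y_te), time.perf_counter() - t0

seeds = range(10)  # stochastic models need several seeds for reliable results
full = [run(s) for s in seeds]
filt = [run(s, SelectKBest(mutual_info_classif, k=10)) for s in seeds]

acc_full, t_full = map(np.array, zip(*full))
acc_filt, t_filt = map(np.array, zip(*filt))
print(f"mean accuracy  full={acc_full.mean():.3f}  filtered={acc_filt.mean():.3f}")
print(f"mean time reduction: {100 * (1 - t_filt.mean() / t_full.mean()):.1f}%")
# Nonparametric paired comparison over seeds, in the spirit of the paper's
# analysis (the paper itself applies Friedman-type tests across data sets).
print(wilcoxon(acc_full, acc_filt))
```

With the filter, the network trains on 10 of the 30 original attributes, which is where the fit-time saving analogous to the abstract's reported reduction comes from in this toy setting.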



Acknowledgements

This work was partially subsidised by the TIN2011-28956-C02-02 and TIN2014-55894-C2-R projects of the Spanish Inter-Ministerial Commission of Science and Technology (MICYT) and by FEDER funds.

Author information


Corresponding author

Correspondence to Antonio J. Tallón-Ballesteros.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Tallón-Ballesteros, A.J., Riquelme, J.C. & Ruiz, R. Filter-based feature selection in the context of evolutionary neural networks in supervised machine learning. Pattern Anal Applic 23, 467–491 (2020). https://doi.org/10.1007/s10044-019-00798-z
