Ant colony and particle swarm optimization for financial classification problems
Introduction
Modern finance is a broad field often involved with hard decision-making problems related to risk management. In several cases, financial decision-making problems require the assignment of the available options into predefined groups/classes. Credit risk analysis, bankruptcy prediction, and country risk assessment, among others, are typical examples (Doumpos, Zopounidis, & Pardalos, 2000). In this context, the development of reliable classification models is clearly of major importance to researchers and practitioners.
The development of financial classification models is a complicated process, involving careful data collection and pre-processing, model development, validation and implementation. Focusing on model development, several methods have been used, including statistical methods, artificial intelligence techniques and operations research methodologies. In all cases, the quality of the data is a fundamental point. This is mainly related to the adequacy of the sample data in terms of the number of observations and the relevance of the decision attributes (i.e., independent variables) used in the analysis.
The latter is related to the feature selection problem. Feature selection refers to the identification of the appropriate attributes (features) that should be introduced in the analysis in order to maximize the expected performance of the resulting model. This has significant implications for issues such as (Kira & Rendell, 1992): (1) noise reduction through the elimination of noisy features, (2) reduction of the time and cost required to implement an appropriate model, (3) simplification of the resulting models, and (4) easier use and updating of the models.
The basic feature selection problem is an optimization problem in which each subset of features is assigned a performance measure representing the expected classification performance of the resulting model. The problem is to search through the space of feature subsets in order to identify the optimal or a near-optimal one with respect to the performance measure. Unfortunately, finding the optimum feature subset has been proved to be NP-hard (Kira & Rendell, 1992). Many algorithms have thus been proposed to find suboptimal solutions in a comparatively small amount of time (Jain & Zongker, 1997). Branch and bound approaches (Narendra & Fukunaga, 1977), sequential forward/backward search (Aha and Bankert, 1996, Cantu-Paz et al., 2004) and filter approaches (Cantu-Paz, 2004) search deterministically for suboptimal solutions. One of the most important filter approaches is Kira and Rendell’s Relief algorithm (Kira & Rendell, 1992). Stochastic algorithms, including simulated annealing (Siedlecki & Sklansky, 1988), scatter search (Lopez, Torres, Batista, Perez, & Moreno-Vega, 2006), ant colony optimization (Al-Ani, 2005a, Al-Ani, 2005b, Parpinelli et al., 2002, Shelokar et al., 2004) and genetic algorithms (Cantu-Paz et al., 2004), have attracted great interest recently because they often yield high accuracy and are much faster.
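To make the optimization view concrete, the following is a minimal sketch of a wrapper-style search over feature subsets. It assumes leave-one-out 1-nearest-neighbour accuracy as the performance measure and uses exhaustive enumeration, which is only feasible for a handful of features because the number of subsets grows as 2^d. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def loo_1nn_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-NN classifier restricted to `features`."""
    if not features:
        return 0.0
    correct = 0
    for i, xi in enumerate(X):
        best_dist, best_label = math.inf, None
        for j, xj in enumerate(X):
            if i == j:
                continue  # the held-out observation cannot be its own neighbour
            dist = sum((xi[f] - xj[f]) ** 2 for f in features)
            if dist < best_dist:
                best_dist, best_label = dist, y[j]
        correct += best_label == y[i]
    return correct / len(X)


def exhaustive_search(X, y):
    """Score every non-empty feature subset and return the best one."""
    d = len(X[0])
    best_subset, best_score = (), -1.0
    for size in range(1, d + 1):
        for subset in combinations(range(d), size):
            score = loo_1nn_accuracy(X, y, subset)
            if score > best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score


# Toy data (illustrative only): feature 0 separates the classes,
# feature 1 is large-scale noise that misleads the distance computation.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 1.1], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]

subset, score = exhaustive_search(X, y)
print(subset, score)
```

On this toy sample, keeping the noisy feature makes every leave-one-out prediction wrong, while dropping it classifies all observations correctly. Metaheuristics replace the exhaustive loop with a guided search when d is too large to enumerate.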
In this paper, two algorithms for the solution of the feature selection problem, based on ant colony and particle swarm optimization, are presented. These algorithms are combined with three nearest neighbour based classifiers: the 1-nearest neighbour, the k-nearest neighbour and the weighted k-nearest neighbour classifier. The algorithms are applied to two data sets involving financial decision-making problems. The first involves credit risk assessment and the second is related to qualified audit reports. A comparison of the proposed algorithms with two other metaheuristics, namely a Tabu search metaheuristic (Glover, 1989, Glover, 1990) and a genetic algorithm (Goldberg, 1989, Reeves, 1995, Reeves, 2003), illustrates the performance of the proposed algorithms.
The rest of the paper is organized as follows: the next section provides a detailed analysis of the proposed algorithms. Section 3 describes the application context using the aforementioned financial data sets and the experimental settings, whereas Section 4 presents the obtained computational results. The last section concludes the paper and discusses some future research directions.
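As a rough illustration of how particle swarm optimization can drive feature selection, the sketch below implements a generic binary PSO in which each particle is a 0/1 mask over the features. This is the standard sigmoid-based binary variant and is an assumption: the snippet above does not specify the paper's update rules, and the parameter values (`w`, `c1`, `c2`, `v_max`) and the toy fitness function are hypothetical. In a real wrapper, `fitness` would be the classification accuracy of a nearest neighbour classifier on the selected features.

```python
import math
import random


def binary_pso(fitness, n_features, n_particles=10, n_iters=30,
               w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Generic binary PSO: maximise `fitness` over 0/1 feature masks."""
    rng = random.Random(seed)
    positions = [[rng.randint(0, 1) for _ in range(n_features)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in positions]            # personal best masks
    pbest_fit = [fitness(p) for p in positions]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]  # global best mask

    for _ in range(n_iters):
        for i in range(n_particles):
            for f in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * velocities[i][f]
                     + c1 * r1 * (pbest[i][f] - positions[i][f])
                     + c2 * r2 * (gbest[f] - positions[i][f]))
                v = max(-v_max, min(v_max, v))    # clamp the velocity
                velocities[i][f] = v
                # The sigmoid of the velocity is the probability of selecting f.
                prob = 1.0 / (1.0 + math.exp(-v))
                positions[i][f] = 1 if rng.random() < prob else 0
            fit = fitness(positions[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = positions[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = positions[i][:], fit
    return gbest, gbest_fit


def toy_fitness(mask):
    """Hypothetical stand-in for classifier accuracy: features 0 and 2 help."""
    return mask[0] + mask[2] - 0.5 * (mask[1] + mask[3])


best_mask, best_fit = binary_pso(toy_fitness, n_features=4)
print(best_mask, best_fit)
```

The returned mask is the best subset encountered during the run, so the result is a near-optimal rather than a provably optimal solution, which matches the role metaheuristics play in this problem.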
Nearest neighbour classifiers
Initially, the classic 1-nearest neighbour (1-nn) (Duda & Hart, 1973) method is used. The nearest neighbour classifier was selected because it is very easy to implement and does not need any optimization procedure, as is necessary, for example, in support vector machines and neural networks. Assume a training sample of Mtrain vectors yj = (yj1, …, yjd), j = 1, …, Mtrain, where d is the number of selected features and yjl is the description of observation j on feature l. In the 1–nn
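A nearest neighbour rule over the selected features can be sketched as follows. The sketch assumes Euclidean distance, majority voting for k-nn, and inverse-distance weights for the weighted variant; the weighting scheme used in the paper is not given in this snippet, so that part in particular is an assumption, as are the function name and the toy data.

```python
import math
from collections import defaultdict


def knn_predict(train_X, train_y, x, features, k=1, weighted=False):
    """Classify x from its k nearest training vectors on the selected features."""
    # Distance from x to every training vector, using only the selected features.
    dists = sorted(
        (math.sqrt(sum((row[f] - x[f]) ** 2 for f in features)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = defaultdict(float)
    for dist, label in dists[:k]:
        # Assumed weighting: closer neighbours get larger (inverse-distance) votes.
        votes[label] += 1.0 / (dist + 1e-12) if weighted else 1.0
    return max(votes, key=votes.get)


# Toy one-feature sample (illustrative only).
train_X = [[0.0], [0.2], [1.0], [1.1], [1.2]]
train_y = ["low", "low", "high", "high", "high"]
query = [0.3]

print(knn_predict(train_X, train_y, query, features=[0], k=1))
print(knn_predict(train_X, train_y, query, features=[0], k=5))
print(knn_predict(train_X, train_y, query, features=[0], k=5, weighted=True))
```

With k = 5 and unweighted votes the larger class wins regardless of distance, whereas the weighted vote lets the two close "low" observations outweigh the three distant "high" ones.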
Data
The two metaheuristic algorithms are applied to two financial classification problems. The first is related to credit risk assessment. The data, taken from Doumpos and Pasiouras (2005), involve 1330 firm-year observations for UK non-financial firms over the period 1999–2001. The sample observations are classified into five risk groups according to their level of likelihood of default, measured on the basis of their QuiScore, a credit rating assigned by Qui Credit Assessment Ltd. In particular,
Results
Table 2 presents the classification results for the optimal solution of each of the proposed algorithms, the ACO based metaheuristic and the PSO based metaheuristic, for both financial classification problems. The results of the algorithms used for the comparisons are also shown. When the Tabu metaheuristic was used, a number of tests were performed in order to choose the best k and, finally, a value of k equal to 5 was chosen. The statistical significance of the differences between the methods is
Conclusions and future work
An important issue in building a good classifier is the selection of a set of appropriate input feature variables. The ant colony optimization and the particle swarm optimization algorithms have been proposed in this study for solving this feature subset selection problem. Three different classifiers were used for the classification problem, based on the nearest neighbour classification rule. The performance of the proposed algorithm was tested using financial data involving credit risk
References (28)
- et al. An ant colony classifier system: Application to some process engineering problems. Computers and Chemical Engineering (2004)
- et al. Credit risk rating systems at large US banks. Journal of Banking and Finance (2000)
- et al. A comparative evaluation of sequential feature selection algorithms
- Feature subset selection using ant colony optimization. International Journal of Computational Intelligence (2005)
- Ant colony optimization for feature subset selection. Transactions on Engineering, Computing and Technology (2005)
- Cantu-Paz, E. (2004). Feature subset selection, class separability, and genetic algorithms. In Genetic and evolutionary...
- Cantu-Paz, E., Newsam, S., & Kamath, C. (2004). Feature selection in scientific application. In Proceedings of the 2004...
- et al. Ant system: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics – Part B (1996)
- et al. Ant colony optimization (2004)
- et al. Explaining qualifications in audit reports using a support vector machine methodology. Intelligent Systems in Accounting, Finance and Management (2005)
- Developing and testing models for replicating credit ratings: A multicriteria approach. Computational Economics
- Multicriteria sorting methodology: Application to financial decision problems. Parallel Algorithms and Applications
- Pattern classification and scene analysis
- Tabu search I. ORSA Journal on Computing
Cited by (81)
Comprehensive learning Harris hawks-equilibrium optimization with terminal replacement mechanism for constrained optimization problems
2022, Expert Systems with Applications
Citation Excerpt: Metaheuristic is one of the most popular optimization techniques inspired from nature, with the characteristics of simplicity, flexibility, derivation-free, black-box computing, and parallel computing, which can provide good performance in different kinds of optimization problems (Li, Liu, Zhao & Zeng, 2021; Houssein, Mahdy, Blondin et al., 2021). Therefore, metaheuristic algorithms are successful in solving real-world optimization problems (Osaba et al., 2021), such as civil engineering (Li, Jiang & Yang, 2012; Li & Hu, 2014), finance (Marinakis et al., 2009), medicine (Elaziz et al., 2020), industry (Houssein et al., 2021), reliability-based design optimization (Meng, Li, Wang et al., 2021), and so on. Metaheuristic algorithms can be classified as four categories based on the inspiration from nature (Faramarzi et al., 2020): swarm-based algorithms, evolutionary algorithm, physics or chemistry-based algorithms, and social or human-based algorithms.
Performability evaluation, validation and optimization for the steam generation system of a coal-fired thermal power plant
2022, MethodsX
Citation Excerpt: Kumar et al. [14] analyzed the availability of a system in a thermal power plant with the help of the Markov approach and suggested the maintenance schedule for various subsystems of the system concerned. Marinakis et al. [15] proposed the Ant Colony (ACO and the PSO algorithms to solve the financial classification model. They tested the proposed methods through two different financial classification problems.
Improving K-means clustering with enhanced Firefly Algorithms
2019, Applied Soft Computing Journal
The continuous-discrete PSO algorithm for shape formation problem of multiple agents in two and three dimensional space
2018, Applied Soft Computing Journal
Citation Excerpt: Thirdly, the code of PSO algorithm is relatively simple and the efficiency of searching for the suboptimal or global optimum is relatively high according to one previously reported result [20]. Because of the aforementioned advantages, PSO algorithm has been successfully and widely applied to a broad range of optimization problems, such as electric power system [21–24], electromagnetic [25], locating and tracking [26–29], intelligent control [30], neural network [31,32], fault detection [33,34], feature selection [35–37], path planning [38] and others [39,40], etc. In this paper, the shape formation problem, which can be applied to the helicopter and ship formation and the large-scale performance in the future can be roughly classified by two main cases.
An efficient hybrid clustering method based on improved cuckoo optimization and modified particle swarm optimization algorithms
2018, Applied Soft Computing Journal
Statistically aided Binary Multi-Objective Grey Wolf Optimizer: a new feature selection approach for classification
2023, Journal of Supercomputing