Predicting financial distress of contractors in the construction industry using ensemble learning
Introduction
The ability to predict the financial distress of a contractor in the early stages of a construction project is critical for project owners and other stakeholders, such as investors and banks. Financial distress results from detrimental events and precedes a financial crisis, such as bankruptcy or liquidation (Hertzel et al., 2008; Platt and Platt, 2002; Tinoco and Wilson, 2013). Whitaker (1999) and Tinoco and Wilson (2013) demonstrated that financial distress can be identified prior to bankruptcy and default. Financial distress can also impede socioeconomic development and cause damage to stakeholders, including creditors, labor unions, governmental bodies, employees, and even customers and suppliers (Wu, 2010). Therefore, financial early warning signs that inform a contractor that it is approaching a financial crisis can help it identify problems and take strategic action before the crisis occurs (Platt & Platt, 2002).
Compared with other industries, the construction industry is especially vulnerable to financial crises because of its high degree of uncertainty (Horta and Camanho, 2013; Huang et al., 2013), which is caused by the uniqueness and long duration of projects as well as the industry's sensitivity to economic cycles (Tserng, Lin, Tsai, & Chen, 2011). For example, in South Korea, the construction industry had the highest bankruptcy rate among all industries between 1998 and 2017 (NICE Information Service, 2018). In this situation, project owners and stakeholders need to predict whether a contractor will suffer a financial crisis during a construction project. Therefore, many studies have addressed the prediction of the financial crisis of contractors.
Models for predicting the financial crisis of contractors have been proposed since the late 1970s. These models are based on linear models, such as multivariate discriminant analysis (MDA) (Ng, Wong, & Zhang, 2011) and logistic regression (LR) (Tserng, Chen, Huang, Lei, & Tran, 2014), and on data-mining techniques (e.g., Chen, 2012; Horta and Camanho, 2013; Tserng et al., 2011). Previous studies predicted the financial crisis of a contractor one quarter or one year ahead of the prediction point. However, construction projects are relatively long, generally lasting more than one year. An analysis of the durations of 122 construction projects by the Public Procurement Service of South Korea revealed that the average project period was 1.4 years (Public Procurement Service, 2015; Public Procurement Service, 2016). In this context, if a project owner wants to predict whether a contractor will suffer a financial crisis before the end of the project, the models proposed in previous studies offer little support for decisions that reach beyond a one-year horizon. Therefore, considering the project duration from bid to construction, it is necessary to predict the financial distress of the contractor more than one year ahead of the prediction point.
The aim of this study is to propose models, based on ensemble learning, that predict the financial distress of a contractor two and three years ahead of the prediction point. The synthetic minority over-sampling technique (SMOTE) was used to address the imbalance between the number of normal contractors and the number of financially distressed contractors. The base classifiers of the ensemble classifier were selected as the combination with the maximum weighted count of errors and correct results (WCEC) value among six classifiers, namely the support vector machine (SVM), artificial neural network (ANN), C4.5 decision tree (C4.5), naive Bayes (NB), LR, and k-nearest neighbor (KNN). The performance of the proposed models was measured in terms of the area under the receiver operating characteristic curve (AUC), a widely used measure of prediction performance.
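The selection of base classifiers by maximum WCEC can be sketched as an exhaustive search over classifier subsets. This is a minimal illustration, not the study's implementation: the text above does not give the WCEC formula, so the pairwise weights (0.5, 1, and 5), the averaging of pairwise scores within a subset, the minimum subset size of two, and all function names and toy predictions below are assumptions.

```python
from itertools import combinations


def pairwise_wcec(pred_a, pred_b, truth, w_one=0.5, w_diff=1.0, w_same=5.0):
    """WCEC-style score for one pair of classifiers (weights are assumptions)."""
    score = 0.0
    for a, b, y in zip(pred_a, pred_b, truth):
        if a == y and b == y:
            score += 1.0      # both classifiers correct
        elif a == y or b == y:
            score += w_one    # exactly one classifier correct
        elif a != b:
            score -= w_diff   # both wrong, different errors (cannot occur with two classes)
        else:
            score -= w_same   # both wrong with the same error
    return score


def subset_wcec(subset, preds, truth):
    """Mean pairwise score over all pairs in the subset (assumed aggregation)."""
    pairs = list(combinations(subset, 2))
    return sum(pairwise_wcec(preds[a], preds[b], truth) for a, b in pairs) / len(pairs)


def best_subset(preds, truth, min_size=2):
    """Return the tuple of classifier names with the maximum score."""
    names = sorted(preds)
    candidates = [
        subset
        for size in range(min_size, len(names) + 1)
        for subset in combinations(names, size)
    ]
    return max(candidates, key=lambda s: subset_wcec(s, preds, truth))


# Hypothetical toy predictions (1 = distressed, 0 = normal), not data from the study.
truth = [1, 0, 1, 0, 1, 0]
preds = {
    "svm": [1, 0, 1, 0, 0, 0],
    "ann": [1, 0, 0, 0, 1, 0],
    "nb": [1, 0, 1, 0, 0, 1],
}
selected = best_subset(preds, truth)
print(selected)
```

On this toy input the search picks the pair whose errors fall on different contractors, since shared errors are penalized most heavily. With six candidate classifiers and a minimum subset size of two, the same search would score 57 subsets.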
Literature review
Prediction models for the financial crisis of contractors have been proposed since the late 1970s. Most of the approaches are based on linear models, such as MDA (Ng et al., 2011) and LR (Tserng et al., 2014). Recently, data-mining techniques have shown performance improvements over traditional linear models in predicting the financial crisis of contractors (Horta and Camanho, 2013; Tserng et al., 2011). Thus, financial crisis prediction models that employ data-mining techniques
Data collection and preparation
The data for this study, covering all Korean contractors classified under the Korean Standard Industrial Classification, were collected from the Korea Information Service Value. The data collection and preparation processes are shown in Fig. 1.
In practice, the Public Procurement Service uses the credit rating to evaluate the financial status of a contractor during the pre-qualification phase. The credit rating is evaluated by credit rating agencies, i.e., Korea
Methodology
This study proposes models, based on an ensemble classifier, for predicting the financial distress of a contractor two and three years ahead of the prediction point. The performance of the proposed models is compared with that of six single-classifier models over the same two- and three-year horizons. The framework for the modeling and evaluation processes is shown in Fig. 3. The processes are as follows: a SMOTE-based oversampling technique is applied to the financially
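The SMOTE-based oversampling step can be illustrated with a simplified sketch. It assumes numeric feature vectors for the minority (financially distressed) class and creates each synthetic sample by interpolating between a randomly chosen minority sample and one of its k nearest minority neighbours. The function name, the parameter defaults, and the toy ratios are hypothetical, and the random choice of the base sample is a simplification of the original per-sample procedure.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by neighbour interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample, excluding the sample itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position on the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical financial ratios for three distressed contractors (not study data).
distressed = [[0.10, 0.90], [0.20, 0.80], [0.15, 0.70]]
new_samples = smote(distressed, n_new=4, k=2)
print(len(new_samples))
```

Because every synthetic point lies on a segment between two real minority samples, the new samples stay inside the region already occupied by the distressed class rather than duplicating existing rows.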
Results
Fig. 4 shows the performance of the models in predicting the financial distress of contractors in terms of the AUC value. The results of each classifier were generated by applying 10-fold cross-validation to the dataset for each of the prediction years (2011 and 2012). Using the classification results of the six single classifiers, the WCEC was computed for every subset of the base classifiers. Based on the WCEC results, the optimal subsets with the maximum WCEC values were selected for
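For readers unfamiliar with the AUC, the following sketch computes it in its rank-based form: the probability that a randomly chosen distressed contractor receives a higher predicted score than a randomly chosen normal contractor, with ties counted as one half. The label coding (1 = distressed, 0 = normal), the function name, and the toy scores are assumptions for illustration and are not results from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    # Count positive-negative pairs ranked correctly; a tie counts as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted distress scores (not study data).
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.6]
value = auc(labels, scores)
print(round(value, 4))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise form is quadratic in the number of samples, which is acceptable for a sketch but slower than sort-based implementations on large datasets.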
Conclusion
The financial crisis of a contractor during a construction project can be a critical risk for project owners and stakeholders such as investors, banks, and contractors. Although many financial crisis prediction models have been proposed for the construction industry, most previous studies have been limited to prediction one quarter or one year ahead of the prediction point, even though project durations in the construction industry are generally over one year. In addition,
Acknowledgments
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF-2013R1A1A2A10058175).
References (87)
- et al. Using diversity of errors for selecting members of a committee classifier. Pattern Recognition (2006).
- et al. On combining classifiers using sum and product rules. Pattern Recognition Letters (2001).
- et al. On learning algorithm selection for classification. Applied Soft Computing (2006).
- et al. Gasoline classification using near infrared (NIR) spectroscopy data: Comparison of multivariate techniques. Analytica Chimica Acta (2010).
- et al. 35 years of studies on business failure: An overview of the classic statistical methodologies and their related problems. The British Accounting Review (2006).
- et al. Machine learning models and bankruptcy prediction. Expert Systems with Applications (2017).
- et al. Data mining approach to policy analysis in a health insurance domain. International Journal of Medical Informatics (2001).
- Developing SFNN models to predict financial distress of construction companies. Expert Systems with Applications (2012).
- et al. Evolutionary support vector machine inference system for construction management. Automation in Construction (2009).
- et al. A novel hybrid intelligent approach for contractor default status prediction. Knowledge-Based Systems (2014).