Elsevier

Expert Systems with Applications

Volume 110, 15 November 2018, Pages 1-10

Predicting financial distress of contractors in the construction industry using ensemble learning

https://doi.org/10.1016/j.eswa.2018.05.026

Highlights

  • This study proposes financial distress prediction models based on ensemble learning.

  • The models provide two- and three-year ahead prediction of financial distress.

  • Performance was evaluated for contractors in South Korea from 2007 to 2012.

  • The models help provide a financial early warning in the construction industry.

  • The models can help stakeholders avoid damage caused by a financial crisis during a project.

Abstract

In the bid process, predicting whether a contractor will suffer a financial crisis during the construction project is vital for project owners and other stakeholders to identify problems and take strategic action. Accordingly, models for predicting the financial crisis of contractors have been studied extensively. However, previous studies have focused on predicting a financial crisis one quarter or one year ahead of the prediction point, even though projects in the construction industry are relatively long, usually exceeding one year. Moreover, although predicting financial distress could reveal the signs of a contractor's approaching financial crisis, no attempt has been made to predict the financial distress that a contractor can suffer before reaching a financial crisis marked by highly visible legal events, such as bankruptcy, default, and delisting. As a result, there is a significant gap between those models and practical application in terms of both the prediction period and the definition of financial crisis. This study proposes voting-based ensemble models that predict the financial distress of a contractor two and three years ahead of the prediction point using a finance-based definition of financial distress. The prediction performance of the proposed models was evaluated using the financial statements of contractors in South Korea from 2007 to 2012. The proposed models achieved area under the receiver operating characteristic curve (AUC) values of 0.940 and 0.910 in predicting financial distress for each of the prediction years. By predicting the contractor's financial distress with high accuracy from the early stages of a construction project through its end, the models can help project owners and a broad range of stakeholders avoid damage caused by a financial crisis during a project.

Introduction

The ability to predict the financial distress of a contractor in the early stages of a construction project is critical for project owners and various stakeholders, such as investors and banks. This is because financial distress precedes a financial crisis, such as bankruptcy or liquidation, and is the result of detrimental events (Hertzel et al., 2008, Platt and Platt, 2002, Tinoco and Wilson, 2013). Whitaker (1999) and Tinoco and Wilson (2013) demonstrated that financial distress can be identified prior to bankruptcy and default, and that it can impede socioeconomic development and cause damage to stakeholders, including creditors, labor unions, governmental bodies, employees, and even customers and suppliers (Wu, 2010). Therefore, financial early warning signs that inform a contractor that it is approaching a financial crisis can help it identify problems and take strategic action before the crisis occurs (Platt & Platt, 2002).

Compared with other industries, the construction industry is especially vulnerable to financial crises due to the high degree of uncertainty (Horta and Camanho, 2013, Huang et al., 2013) caused by the uniqueness and long duration of its projects as well as the industry's sensitivity to economic cycles (Tserng, Lin, Tsai, & Chen, 2011). For example, in South Korea, the construction industry had the highest bankruptcy rate among all industries between 1998 and 2017 (NICE Information Service, 2018). In this situation, predicting whether a contractor will suffer a financial crisis during a construction project is necessary for project owners and stakeholders. Therefore, many studies have addressed the prediction of contractors' financial crises.

Models for predicting the financial crisis of contractors have been proposed since the late 1970s. These models are based on linear models, such as multivariate discriminant analysis (MDA) (Ng, Wong, & Zhang, 2011) and logistic regression (LR) (Tserng, Chen, Huang, Lei, & Tran, 2014), and on data-mining techniques (e.g. Chen, 2012, Horta and Camanho, 2013, Tserng et al., 2011). Previous studies predicted the financial crisis of a contractor one quarter or one year ahead of the prediction point. However, projects in the construction industry are relatively long, generally lasting more than one year. An analysis of the durations of 122 construction projects by the Public Procurement Service of South Korea revealed that the average project period was 1.4 years (Public Procurement Service 2015, Public Procurement Service 2016). In this context, if a project owner wants to use a model to predict whether a contractor will suffer a financial crisis before the end of the project, the models proposed in previous studies offer little support for decisions that extend beyond one year. Therefore, considering the project duration from bid to construction, it is necessary to predict the financial distress of the contractor more than one year ahead of the prediction point.

The aim of this study is to propose ensemble learning models that predict the financial distress of a contractor two and three years ahead of the prediction point. The synthetic minority over-sampling technique (SMOTE) was used to address the imbalance between the number of normal contractors and the number of financially distressed contractors. The base classifiers of the ensemble classifier were selected as the combination with the maximum weighted count of errors and correct results (WCEC) value among six classifiers, namely the support vector machine (SVM), artificial neural network (ANN), C4.5 decision tree (C4.5), naive Bayes (NB), LR, and k-nearest neighbor (KNN). The performance of the proposed models was measured in terms of the area under the receiver operating characteristic curve (AUC), which is a widely used measure for evaluating prediction performance.
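To make the oversampling step concrete, the following is a minimal, simplified sketch of the idea behind SMOTE: a synthetic minority sample is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbors. It is an illustration only, not the implementation used in the study; the function name `smote`, its parameters, and the two-feature sample data are assumptions made for this example.

```python
import random

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])  # one of the k nearest neighbours
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Hypothetical two-feature records for financially distressed contractors
distressed = [(0.9, 0.4), (0.8, 0.5), (0.85, 0.3), (0.95, 0.45)]
new_points = smote(distressed, n_new=6, k=2)
print(len(new_points))  # 6 synthetic minority samples
```

Because each synthetic point lies between two real minority samples, the oversampled class stays within the region already occupied by distressed contractors rather than being padded with exact duplicates.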

Section snippets

Literature review

Prediction models for the financial crisis of contractors have been proposed since the late 1970s. Most of the approaches are based on linear models, such as MDA (Ng et al., 2011) and LR (Tserng et al., 2014). Recently, data-mining techniques have shown performance improvements over traditional linear models in predicting the financial crisis of contractors (Horta and Camanho, 2013, Tserng et al., 2011). Thus, financial crisis prediction models that employ data-mining techniques

Data collection and preparation

The data for this study, covering all Korean contractors classified under the Korean Standard Industrial Classification, were collected from the Korea Information Service Value. The data collection and preparation processes are shown in Fig. 1.

In practice, to evaluate the financial status of the contractor during the pre-qualification phase, the Public Procurement Service uses the credit rating. The credit rating is evaluated by credit rating agencies, i.e., Korea

Methodology

This study proposes models based on an ensemble classifier for predicting the financial distress of a contractor two and three years ahead of the prediction point. The performance of the proposed models is compared with that of six single-classifier models for both prediction horizons. The framework for the modeling and evaluation processes is shown in Fig. 3. The processes are as follows: a SMOTE-based oversampling technique is applied to the financially
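As an illustration of how a voting-based ensemble classifier combines its base classifiers, the sketch below applies simple majority voting to hard class labels. The exact voting rule used in the study is not given in this excerpt, so majority voting is an assumption here, as are the function name `majority_vote` and the example predictions.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample."""
    combined = []
    for sample_votes in zip(*predictions):
        # most_common(1) returns the label with the highest vote count
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined

# Hypothetical outputs of three base classifiers (1 = distressed, 0 = normal)
svm_pred = [1, 0, 0, 1]
ann_pred = [1, 1, 0, 0]
knn_pred = [0, 1, 0, 1]
ensemble_pred = majority_vote([svm_pred, ann_pred, knn_pred])
print(ensemble_pred)  # [1, 1, 0, 1]
```

With an odd number of base classifiers and two classes, ties cannot occur. With an even number, this sketch resolves a tie in favor of the label that appears first among the votes.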

Results

Fig. 4 shows the performance of the models in predicting the financial distress of contractors in terms of the AUC value. The results of each classifier were generated by applying 10-fold cross-validation to the dataset for each of the prediction years (2011 and 2012). By using the classification results of the six single classifiers, the WCEC was computed for every subset of the base classifiers. Based on the WCEC results, the optimal subsets with the maximum WCEC values were selected for
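For readers unfamiliar with the AUC, it can be computed directly from its rank interpretation: the probability that a randomly chosen distressed contractor receives a higher predicted score than a randomly chosen normal contractor, with ties counted as one half. The sketch below assumes this pairwise definition; the function name `auc` and the example labels and scores are hypothetical and are not taken from the study's data.

```python
def auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = distressed) and predicted distress scores
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
result = auc(labels, scores)
print(result)  # 5 of the 6 positive/negative pairs are ranked correctly
```

An AUC of 1.0 means every distressed contractor is scored above every normal one, while 0.5 corresponds to random ranking.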

Conclusion

The financial crisis of a construction contractor during a construction project can be a critical risk for project owners and stakeholders such as investors, banks, and contractors. Although many financial crisis prediction models have been proposed for the construction industry, most previous studies have been limited to predictions one quarter or one year ahead of the prediction point, even though projects in the construction industry generally last more than one year. In addition,

Acknowledgments

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2013R1A1A2A10058175).

References (87)

  • D. Delen et al.

    Predicting breast cancer survivability: A comparison of three data mining methods

    Artificial Intelligence in Medicine

    (2005)
  • G. Derelioğlu et al.

    Knowledge discovery using neural approach for SME's credit risk analysis problem in Turkey

    Expert Systems with Applications

    (2011)
  • Y. Ding et al.

    Forecasting financial condition of Chinese listed companies based on support vector machine

    Expert Systems with Applications

    (2008)
  • A. Fernández et al.

    Hierarchical fuzzy rule based classification systems with genetic rule selection for imbalanced data-sets

    International Journal of Approximate Reasoning

    (2009)
  • M. Gao et al.

    A combined SMOTE and PSO based RBF classifier for two-class imbalanced problems

    Neurocomputing

    (2011)
  • R. Geng et al.

    Prediction of financial distress: An empirical study of listed Chinese companies using data mining

    European Journal of Operational Research

    (2015)
  • G. Giacinto et al.

    Intrusion detection in computer networks by a modular ensemble of one-class classifiers

    Information Fusion

    (2008)
  • J. Heo et al.

    AdaBoost based bankruptcy forecasting of Korean construction companies

    Applied Soft Computing

    (2014)
  • M.G. Hertzel et al.

    Inter-firm linkages and the wealth effects of financial distress along the supply chain

    Journal of Financial Economics

    (2008)
  • I.M. Horta et al.

    Company failure prediction in the construction industry

    Expert Systems with Applications

    (2013)
  • S. Hou et al.

    Classifier combination for sketch-based 3D part retrieval

    Computers & Graphics

    (2007)
  • Y.T. Hou et al.

    Malicious web content detection by machine learning

    Expert Systems with Applications

    (2010)
  • Z. Hua et al.

    Predicting corporate financial distress based on integration of support vector machine and logistic regression

    Expert Systems with Applications

    (2007)
  • B. Huang et al.

    Customer churn prediction in telecommunications

    Expert Systems with Applications

    (2012)
  • W.H. Huang et al.

    Contractor financial prequalification using simulation method based on cash flow model

    Automation in Construction

    (2013)
  • K. Kianmehr et al.

    Calling communities analysis and identification using machine learning techniques

    Expert Systems with Applications

    (2009)
  • M.J. Kim et al.

    Geometric mean based boosting algorithm with over-sampling to resolve data imbalance problem for bankruptcy prediction

    Expert Systems with Applications

    (2015)
  • H. Li et al.

    Ranking-order case-based reasoning for financial distress prediction

    Knowledge-Based Systems

    (2008)
  • H. Li et al.

    Predicting business failure using classification and regression tree: An empirical comparison with popular classical statistical methods and top classification mining methods

    Expert Systems with Applications

    (2010)
  • J.H. Min et al.

    Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters

    Expert Systems with Applications

    (2005)
  • D.S. Nascimento et al.

    Integrating complementary techniques for promoting diversity in classifier ensembles: A systematic study

    Neurocomputing

    (2014)
  • S. Peddabachigari et al.

    Modeling intrusion detection system using hybrid intelligent systems

    Journal of Network and Computer Applications

    (2007)
  • R. Perdisci et al.

    McPAD: A multiple classifier system for accurate payload-based anomaly detection

    Computer Networks

    (2009)
  • S. Ravikumar et al.

    Machine learning approach for automated visual inspection of machine components

    Expert Systems with Applications

    (2011)
  • S.N. Razavi et al.

    GPS-less indoor construction location sensing

    Automation in Construction

    (2012)
  • D. Ruta et al.

    Classifier selection for majority voting

    Information Fusion

    (2005)
  • N.R. Sakthivel et al.

    Vibration based fault diagnosis of monoblock centrifugal pump using decision tree

    Expert Systems with Applications

    (2010)
  • H. Son et al.

    Rapid and automated determination of rusted surface areas of a steel bridge for robotic maintenance systems

    Automation in Construction

    (2014)
  • H. Son et al.

    Classification of major construction materials in construction environments using ensemble classifiers

    Advanced Engineering Informatics

    (2014)
  • H. Son et al.

    Early prediction of the performance of green building projects using pre-project planning variables: Data mining approaches

    Journal of Cleaner Production

    (2015)
  • J. Sun et al.

    Financial distress prediction based on serial combination of multiple classifiers

    Expert Systems with Applications

    (2009)
  • J. Sun et al.

    Financial distress prediction using support vector machines: Ensemble vs. individual

    Applied Soft Computing

    (2012)
  • D.M.J. Tax et al.

    Combining multiple classifiers by averaging or by multiplying

    Pattern Recognition

    (2000)