Elsevier

Computers & Geosciences

Volume 120, November 2018, Pages 105-114
Robustness of Extreme Learning Machine in the prediction of hydrological flow series

https://doi.org/10.1016/j.cageo.2018.08.003

Highlights

  • Relatively new machine learning technique to predict hydrological flows.

  • Extreme Learning Machine (ELM) learns in a single iteration and is very fast.

  • ELM is a viable alternative where quick model response is vital for decision making.

  • ELM produces good results with any combination of input variables.

  • ELM produces better results than other widely used methods.

Abstract

Prediction of hydrological flow series generated from a catchment is an important aspect of water resources management and decision making. The process underpinning catchment flow generation is complex and depends on many parameters. Determining these parameters by trial and error or with an optimization algorithm is time consuming. Artificial Intelligence (AI) based machine learning techniques, including Artificial Neural Network (ANN), Genetic Programming (GP) and Support Vector Machine (SVM), have replaced the complex modeling process and at the same time improved the prediction accuracy of hydrological time-series. However, they still require numerous iterations and considerable computational time to generate optimum solutions. This study applies the Extreme Learning Machine (ELM) to hydrological flow series modeling and compares its performance with GP and Evolutionary Computation based SVM (EC-SVM). The robustness and performance of ELM were studied using data from two catchments located in two different climatic conditions. The robustness of ELM was evaluated by varying the number of lagged input variables, the number of hidden nodes and the input parameter (regularization coefficient). Prediction at longer lead times and extrapolation capability were also investigated. The results show that (1) ELM yields reasonable results with two or more lagged input variables (flows) for 1-day lead prediction; (2) ELM produced satisfactory results very rapidly when the number of hidden nodes was greater than or equal to 1000; (3) ELM showed improved results when the regularization coefficient was fine-tuned; (4) ELM was able to extrapolate extreme values well; (5) ELM generated reasonable results for predictions at longer lead times (second and third day); (6) ELM was computationally much faster and capable of producing better results than other leading AI methods for prediction of flow series from the same catchment. ELM has the potential for forecasting real-time hydrological flow series.

Introduction

Predicting hydrological flow series generated from a catchment is an important aspect of water resources management and decision making. Predicting flows from a catchment is complex and depends on many parameters. Applying conceptual and physically based distributed models to predict inflows requires many parameters, including catchment characteristics, soil characteristics, infiltration, river networks and their details or conceptual representations, and rainfall and runoff data. The values of some of these parameters, such as infiltration rates, can only be determined through calibration. Calibrating such models by trial and error or with an optimization algorithm requires considerable effort and experience, particularly when the number of calibration parameters is large (Atiquzzaman and Kandasamy, 2016a). Artificial intelligence (AI) based machine learning techniques have proven superior in this modeling process (e.g. flow prediction) compared to stochastic models including Autoregressive (AR), Autoregressive Moving Average (ARMA), Autoregressive Integrated Moving Average (ARIMA) and Autoregressive Moving Average with Exogenous Inputs (ARMAX) (Hsu et al., 1995; Lohani et al., 2012). AI techniques include Artificial Neural Network (ANN) (Anctil et al., 2004), Fuzzy Logic (FL) (Tayfur and Sing, 2006), Support Vector Machine (SVM) (Liong and Sivapragasam, 2002; Yu and Liong, 2007), Chaos Theory (Yu et al., 2002), and Genetic Programming (GP) (Jayawardena et al., 2005). These have become popular because of their ability to recognize non-linearity in complex hydrological processes, and they have been successfully applied in rainfall-runoff models, hydrological flow series predictions, etc. Generally, applying data driven models to predict hydrological flow series (future discharges) requires lagged discharges or meteorological data as input (Akhtar et al., 2009).

Cigizoglu (2003) studied the application of ANN for forecasting daily flows for a river in Turkey. The analysis demonstrated ANN's superior capability compared to conventional models (e.g. AR and regression models). Gholami et al. (2015) achieved a high degree of accuracy in the prediction of groundwater fluctuation using dendrochronology (tree-ring diameter) and ANN (multilayer perceptron, MLP). Wu and Chau (2011) found that the performance of ANN can be improved significantly if the input data are preprocessed with Singular Spectrum Analysis (SSA). Rasouli et al. (2012) applied three machine learning methods, including Bayesian Neural Network (BNN), for streamflow forecasting using different combinations of local meteo-hydrologic observations and climate indices. They found that BNN outperformed the other nonlinear models. Chen and Chang (2009) proposed an ANN based on an evolutionary algorithm (Genetic Algorithm, GA), termed EANN, to define the optimal network architecture and to predict real-time inflows to the Shihmen Reservoir in Taiwan. They demonstrated that EANN performed better than the ARMAX stochastic model. Chen and Chang (2009) stated that the performance of ANN depends on network architecture (e.g. inputs, number of hidden layers, number of neurons and activation functions) and noted that a very simple ANN architecture may not predict accurately, while an overly complex architecture may reduce its generalization ability due to over-fitting. The performance of ANN has been improved by combining it with other techniques in hybrid methods (Chen et al., 2015). For example, Deka and Chandramouli (2009) proposed a Fuzzy Neural Network (FNN) hybrid model to study the operation of a proposed multipurpose reservoir and found that the FNN is highly adaptive, flexible, easy to build and efficient. Adaptive Neural-based Fuzzy Inference System (ANFIS) significantly improved on ANN predictions for reservoir prediction (Bhakra Dam, India; Lohani et al., 2012), for forecasting daily flood discharge (Yom River Basin, Thailand, Tingsanchali and Quang, 2004; Ajay River Basin in Jharkhand, India, Mukerji et al., 2009), and for event-based rainfall-runoff modeling using lag time (Talei and Chua, 2012).

Cheng et al. (2005) adopted a parallel GA with a Fuzzy Optimal model on a cluster of computers to reduce the computational run time required to optimize the rainfall-runoff model (Xinanjiang) and to improve the quality of the results. Because the problem was partitioned into smaller pieces, their hybrid approach achieved superior results more quickly than GAs. The application of GP in real-time runoff forecasting was demonstrated by Khu et al. (2001) and Liong et al. (2002a). Makkeasorn et al. (2008) showed that GP performed better than neural networks (NN) for forecasting discharges in a semi-arid watershed in South Texas, USA, by including sea surface temperature, spatio-temporal rainfall distribution, meteorological data and historical streamflow data.

SVM is another powerful AI technique that has been successfully applied in flow forecasting, rainfall-runoff modeling (Sivapragasam et al., 2001) and streamflow forecasting (Chiogna et al., 2018). Chiogna et al. (2018) combined SVM with the output of a hydrological model (Soil and Water Assessment Tool, SWAT), the hydropower energy price and the day of the week to capture sudden fluctuations in river stage caused by the hydropower production company in the Upper Adige River basin in North-East Italy. They found that SVM was able to reproduce the hydropeaking and performed better than SWAT under low flow conditions, when the streamflow was affected by hydropower production. SVM was applied by Sivapragasam (2002) and Liong and Sivapragasam (2002) to predict stage in the city of Dhaka, Bangladesh, using daily water level data measured at five gauging stations, and was shown to perform better than ANN (Liong et al., 1999). Similarly, SVM was shown to be comparable to or better than ANFIS and GP for forecasting monthly river flow (Wang et al., 2009) and short term river flow (Heihe River, Northern China; He et al., 2014). SVM was further improved by reducing the noise in the input data using Singular Spectrum Analysis (Sivapragasam et al., 2001), and by optimizing the SVM parameters using an Evolutionary Computation based algorithm (EC-SVM) (Yu et al., 2004) and a Particle Swarm Optimization algorithm (Wang et al., 2013). While SVM overcame some drawbacks of ANN (finding globally optimized solutions and over-fitting; Lin et al., 2006), it required a long simulation time for large, complex problems, as well as the selection of an appropriate kernel function and associated parameters (C and ε). Fotovatikhah et al. (2018) reviewed the AI and computational intelligence (CI) methods available in the literature, including ANN, fuzzy sets, wavelet models, SVM, EC and hybrid methods employed in hydrology, flood and waste flow prediction. They found that EC and SVMs showed lower error rates than other machine learning and soft computing techniques.

Literature on the application of AI approaches including ANN, ANFIS, SVM and GP in hydrological time-series prediction indicates that their performance is not consistent across applications, and it is difficult to state which method is superior. Superior performance depends on appropriate parameters and network configurations. Researchers have attempted to improve the performance of these methods using a hybrid approach (ANFIS) or by combining them with other algorithms (EC-SVM) to optimize the parameters. However, these methods still require numerous iterations and significant computational time to generate optimum solutions. To overcome the long computational time and to produce generalized solutions, a learning algorithm called Extreme Learning Machine (ELM), developed by Huang et al. (2006), was used in this study. ELM determines the output weights analytically, with the input weights generated randomly. Huang et al. (2006) compared the performance of ELM with conventional neural networks and SVM on benchmark problems in function approximation and classification. They reported that ELM is capable of approximating any continuous function and implementing any classification. ELM learns faster (Taormina and Chau, 2015) and is stable over a wide range of numbers of hidden nodes. ELM was also applied by Taormina and Chau (2015) in the selection of input variables for rainfall-runoff modeling. They obtained the most accurate solutions with ELM coupled with Binary-coded discrete Fully Informed Particle Swarm Optimization (BFIPS). Atiquzzaman and Kandasamy (2016) demonstrated that ELM's learning speed and accuracy were comparable to those of the Standard Chaos Technique, the Inverse Approach (Phoon et al., 2002) and EC-SVM in the forecasting of hydrological time-series. However, the robustness of ELM's performance with respect to different input parameters, longer lead-day prediction and extrapolation capability was not investigated by Atiquzzaman and Kandasamy (2016).
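The analytic training step described above can be illustrated with a minimal Python sketch. It is not the MATLAB program of Huang et al. (2006). The function names (solve_linear, hidden_output, train_elm, predict_elm), the parameters n_hidden and c_reg, the sigmoid activation, the uniform weight range and the ridge form (I/C + HᵀH)β = Hᵀy are all assumptions chosen for illustration, and inputs are assumed to be scaled to roughly [0, 1].

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Augmented matrix so row operations apply to both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def hidden_output(x, weights, biases):
    """Sigmoid activations of the random hidden layer for one sample."""
    return [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(ws, x)) + b)))
            for ws, b in zip(weights, biases)]


def train_elm(xs, ys, n_hidden=20, c_reg=1e6, seed=0):
    """Fit an ELM regressor in one pass; returns (weights, biases, beta)."""
    rng = random.Random(seed)
    n_in = len(xs[0])
    # Input weights and biases are drawn once at random and never updated.
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)]
               for _ in range(n_hidden)]
    biases = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    h = [hidden_output(x, weights, biases) for x in xs]
    # Assumed ridge form: (I / C + H^T H) beta = H^T y, solved directly.
    hth = [[sum(row[i] * row[j] for row in h)
            + (1.0 / c_reg if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    hty = [sum(row[i] * y for row, y in zip(h, ys)) for i in range(n_hidden)]
    beta = solve_linear(hth, hty)
    return weights, biases, beta


def predict_elm(model, x):
    """Predict one output value as the beta-weighted hidden activations."""
    weights, biases, beta = model
    return sum(b * h for b, h in zip(beta, hidden_output(x, weights, biases)))
```

Calling train_elm(xs, ys) on a list of input vectors and a list of targets returns the model in a single pass, and predict_elm(model, x) evaluates it for one input vector. The only quantities to tune in this sketch are the number of hidden nodes and the regularization coefficient, which mirrors the sensitivity checks the study describes.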

This paper examines the robustness of ELM (a MATLAB program developed by Huang et al., 2006) for predicting hydrological flow series. ELM was applied to the Tryggevælde (Denmark) and Mississippi River (USA) catchments. The performance of ELM was tested with different combinations of input variables. The number of nodes in the hidden layer was varied to check the sensitivity of ELM's results. The generalization capability of ELM was investigated for longer lead-day prediction (e.g. second and third day) and for extrapolation. Finally, the ELM results were compared with those of two leading techniques, GP and EC-SVM, to demonstrate its fast learning capability.
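To make the terms "lagged input variables" and "lead-day prediction" concrete, the sketch below builds input/target pairs from a daily flow series. It is an illustration under stated assumptions: the function name make_lagged_samples and the parameters n_lags and lead are hypothetical, and the exact windowing used in the study is not given in this excerpt.

```python
def make_lagged_samples(flows, n_lags, lead=1):
    """Return (inputs, targets) for predicting flows[t + lead] from the
    n_lags most recent values flows[t - n_lags + 1 .. t]."""
    if n_lags < 1 or lead < 1:
        raise ValueError("n_lags and lead must be positive")
    inputs, targets = [], []
    # t is the index of the latest flow known at prediction time.
    for t in range(n_lags - 1, len(flows) - lead):
        inputs.append(list(flows[t - n_lags + 1:t + 1]))
        targets.append(flows[t + lead])
    return inputs, targets
```

With n_lags=2 and lead=1, each sample uses the flows of the current and previous day to predict the next day; raising lead to 2 or 3 gives the second and third lead-day cases without changing the inputs.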

Section snippets

Catchment data

ELM is capable of producing better flood predictions for any catchment under different climatic conditions. To demonstrate the robustness of ELM, data from two catchments in two different climatic conditions (Liong et al., 2002b; Yu et al., 2004) were used in this study, and the results were compared with other published techniques.

The first catchment is the Tryggevælde Catchment (130.5 km²), located in Denmark in the eastern part of Sealand, north of the village Karise. This is a

Influence of lagged variables

The ELM model was run for the Tryggevælde and Mississippi River catchments with flow data as input for different m values ranging from 1 to 7. The number of hidden nodes was set to the number of training samples (i.e. 6204). The CC, NSE, RMSE and NRMSE values for testing are presented in Table 1. The time required to train ELM for the two catchments was less than 122 s. The CC and NSE are higher than 0.9 and 0.8, respectively, for all testing results, which shows that ELM predicts generalized
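The four test statistics named in this snippet can be computed as in the following sketch. The definitions are assumptions, since the excerpt does not show the paper's formulas: CC is taken as the Pearson correlation coefficient, NSE as the Nash-Sutcliffe efficiency, and NRMSE as the RMSE normalized by the standard deviation of the observations (so that NSE = 1 − NRMSE²). The function name evaluate is hypothetical.

```python
import math


def evaluate(observed, predicted):
    """Return CC, NSE, RMSE and NRMSE for paired observed/predicted flows."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Sum of squared prediction errors.
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Sums of squared deviations from the respective means.
    sso = sum((o - mean_o) ** 2 for o in observed)
    ssp = sum((p - mean_p) ** 2 for p in predicted)
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted))
    return {
        "CC": cov / math.sqrt(sso * ssp),
        "NSE": 1.0 - sse / sso,
        "RMSE": math.sqrt(sse / n),
        "NRMSE": math.sqrt(sse / sso),  # assumed normalization
    }
```

A prediction that is offset from the observations by a constant keeps CC at 1 while lowering NSE, which is why the two are usually reported together.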

Discussion

In this study, ELM, an AI technique, was presented to predict hydrological flow series. ELM's performance was demonstrated with data from two catchments of different sizes, i.e. the relatively small Tryggevælde Catchment (130.5 km², Denmark) and the large Mississippi River catchment (3.2 million km², USA). ELM proved to be fast and did not depend on complex network architectures. Firstly, ELM's performance based on different lagged flows (1–7 days) was tested. The best results

Conclusion

The application of ELM was demonstrated in the prediction of hydrological flows from two catchments of different sizes in two different climatic conditions (Tryggevælde Catchment, Denmark; and Mississippi River, USA). Literature shows that EC-SVM performed better than ANN, ANFIS and Fuzzy Logic in the prediction of flows. ELM's performance was compared with EC-SVM and GP. The results showed that ELM improved prediction accuracy and reached solutions very quickly compared to other techniques. ELM

References (45)

  • V. Nourani et al. Applications of hybrid wavelet-Artificial Intelligence models in hydrology: a review. J. Hydrol. (2014)

  • K. Rasouli et al. Daily streamflow forecasting by machine learning methods with weather and climate inputs. J. Hydrol. (2012)

  • A. Talei et al. Influence of lag time on event-based rainfall-runoff modeling using the data driven approach. J. Hydrol. (2012)

  • R. Taormina et al. Data-driven input variable selection for rainfall-runoff modeling using binary-coded particle swarm optimization and Extreme Learning Machines. J. Hydrol. (2015)

  • W.-C. Wang et al. A comparison of performance of several artificial intelligence methods for forecasting monthly discharge time series. J. Hydrol. (2009)

  • C.L. Wu et al. Rainfall-runoff modelling using artificial neural network coupled with singular spectrum analysis. J. Hydrol. (2011)

  • X. Yu et al. Forecasting of hydrological time series with ridge regression in feature space. J. Hydrol. (2007)

  • M.K. Akhtar et al. River flow forecasting with artificial neural networks using satellite observed precipitation pre-processed with flow length and travel time information: case study of the Ganges river basin. Hydrol. Earth Syst. Sci. (2009)

  • M. Atiquzzaman et al. Prediction of hydrological time-series using extreme learning machine. J. Hydroinf. (2016)

  • M. Atiquzzaman et al. Prediction of inflows from dam catchment using genetic programming. Int. Journal of Hydrology Science and Technology (2016)

  • C.T. Cheng et al. Multiple criteria rainfall-runoff model calibration using a parallel genetic algorithm in a cluster of computers. Hydrol. Sci. J. (2005)

  • H.K. Cigizoglu. Estimation, forecasting and extrapolation of river flows by artificial neural networks. Hydrol. Sci. J. (2003)

1 Md Atiquzzaman is a PhD student at UTS who undertook this study as part of doctoral studies and prepared the manuscript.

2 Jaya Kandasamy is the PhD supervisor of the first author and edited and revised the manuscript.
