Abstract
Accurate and reliable river flow forecasts from data-intelligent models can provide significant information for future water resources management. In this study we employed a 50-model ensemble of three data-driven predictive models, namely support vector regression (SVR), multivariate adaptive regression spline (MARS) and the M5 model tree (M5Tree), to forecast river flow in the Pailugou catchment, a semiarid and ecologically significant mountainous region of northwestern China. To attain stable and accurate forecast results, 50 different models were trained, each on a random partition of the entire river flow data into 80% training and 20% testing subsets. To evaluate the ensemble-based results comprehensively, the performance of the predictive models was assessed using the global mean of six quantitative statistical measures, namely the coefficient of correlation (R), mean absolute relative error (MAE), root mean squared error (RMSE), Nash–Sutcliffe efficiency coefficient (NS), relative RMSE and the Willmott’s Index (WI), together with Taylor diagrams, including skill scores relative to a persistence model. All of the averaged R values were higher than 0.900 and all of the averaged NS values were higher than 0.800, indicating good performance of the SVR, MARS and M5Tree models at the 1-, 2- and 3-day ahead forecasting horizons, in agreement with the assessment based on the Willmott’s Index. However, the M5Tree model outperformed both the SVR and MARS models (NS = 0.917 vs. 0.904 and 0.901 for the 1-day, 0.893 vs. 0.854 and 0.845 for the 2-day, and 0.850 vs. 0.828 and 0.810 for the 3-day forecasting horizons, respectively), which concurred with its high WI value.
Therefore, based on the ensemble of 50 models, the M5Tree can be considered superior to the SVR and MARS models for river flow forecasting at multiple forecast horizons. A detailed comparison of the overall performance of all three models through Taylor diagrams and boxplots indicated that the 1-day ahead forecasts were more accurate than the 2- and 3-day ahead forecasts for all of the predictive models. The data-intelligent models designed in this study indicate that the M5Tree method could be successfully applied to short-term river flow forecasting in semiarid mountainous regions, which may have useful implications for water resources management, ecological sustainability and the assessment of river systems.
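The evaluation protocol summarized in the abstract (repeated random 80%/20% partitions, with performance measures averaged over 50 runs) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the synthetic flow series, the function names, and in particular the simple least-squares line on the previous day's flow, which stands in for SVR, MARS and M5Tree so that the example needs only the standard library. The Nash–Sutcliffe and Willmott formulas follow their standard textbook definitions.

```python
import math
import random
from statistics import mean

def rmse(obs, pred):
    return math.sqrt(mean((o - p) ** 2 for o, p in zip(obs, pred)))

def pearson_r(obs, pred):
    mo, mp = mean(obs), mean(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp)

def nash_sutcliffe(obs, pred):
    # NS = 1 - sum (O - P)^2 / sum (O - mean(O))^2
    mo = mean(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / sum((o - mo) ** 2 for o in obs)

def willmott_index(obs, pred):
    # WI = 1 - sum (O - P)^2 / sum (|P - mean(O)| + |O - mean(O)|)^2
    mo = mean(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mo) + abs(o - mo)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den

def synthetic_flow(n, seed=0):
    # Hypothetical autocorrelated series; NOT the Pailugou data.
    rng = random.Random(seed)
    q = [1.0]
    for _ in range(n - 1):
        q.append(max(0.0, 0.2 + 0.8 * q[-1] + rng.gauss(0.0, 0.1)))
    return q

def fit_line(x, y):
    # Ordinary least squares y = a + b * x (stand-in for SVR/MARS/M5Tree).
    mx, my = mean(x), mean(y)
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b

def ensemble_scores(flow, n_models=50, train_frac=0.8, seed=42):
    # (previous-day flow, next-day flow) pairs for a 1-day-ahead forecast.
    pairs = list(zip(flow[:-1], flow[1:]))
    rng = random.Random(seed)
    runs = []
    for _ in range(n_models):
        shuffled = pairs[:]
        rng.shuffle(shuffled)                 # new random partition per model
        cut = int(train_frac * len(shuffled))
        train, test = shuffled[:cut], shuffled[cut:]
        a, b = fit_line([x for x, _ in train], [y for _, y in train])
        obs = [y for _, y in test]
        pred = [a + b * x for x, _ in test]
        persistence = [x for x, _ in test]    # baseline: tomorrow = today
        runs.append({
            "R": pearson_r(obs, pred),
            "RMSE": rmse(obs, pred),
            "NS": nash_sutcliffe(obs, pred),
            "WI": willmott_index(obs, pred),
            "RMSE_persistence": rmse(obs, persistence),
        })
    # Global mean of each measure over the ensemble of runs.
    return {k: mean(r[k] for r in runs) for k in runs[0]}

if __name__ == "__main__":
    for name, value in ensemble_scores(synthetic_flow(500)).items():
        print(f"{name}: {value:.3f}")
```

Averaging over many random partitions, rather than reporting a single split, reduces the dependence of the reported scores on one particular choice of training and testing subsets. Substituting an actual SVR, MARS or M5Tree implementation for `fit_line` would leave the rest of the loop unchanged.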
Acknowledgements
This work was supported by the National Key R&D Program of China (Grant Numbers 2017YFC0404302, 2016YFC0400908), the National Natural Science Foundation of China (Grant Number 41601038), the Key Research Program of Frontier Sciences, CAS (Grant Number QYZDJ-SSW-DQC031), the CAS “Light of West China” Program, the China Postdoctoral Science Foundation (Grant Number 2015M572620), and the Foundation for Excellent Youth Scholars of Northwest Institute of Eco-Environment and Resources, CAS. Dr R C Deo thanks CAS for its continued support of collaborations with the Chinese counterpart researchers.
About this article
Cite this article
Yin, Z., Feng, Q., Wen, X. et al. Design and evaluation of SVR, MARS and M5Tree models for 1, 2 and 3-day lead time forecasting of river flow data in a semiarid mountainous catchment. Stoch Environ Res Risk Assess 32, 2457–2476 (2018). https://doi.org/10.1007/s00477-018-1585-2
DOI: https://doi.org/10.1007/s00477-018-1585-2