Abstract
The use of new data-driven approaches based on so-called expert systems to simulate runoff generation processes is a promising frontier that may help overcome some of the modeling difficulties of more complex traditional approaches. The present study highlights the potential of expert systems for building regional hydrological models, where they can benefit from the availability of large databases. Different soft computing models for reconstructing the monthly natural runoff in river basins are explored, focusing on a new class of heuristic models, Multi-Gene Genetic Programming (MGGP). The region under study is Sicily (Italy), where a regression-based rainfall-runoff model, used here as the benchmark, was previously built from the analysis of a regional database covering several gauged watersheds across the region. In the present study, different models are created using the same dataset: six MGGPs generated with different modeling set-ups; a Multi-Layer Perceptron Artificial Neural Network (ANN); and two new hybrid models (ANN-MGGP), each combining a Classifier ANN with two MGGPs that simulate low and high runoff separately. Results show that all the soft computing models perform similarly and outperform the benchmark model, demonstrating that MGGP can be considered a valid alternative to the much more consolidated ANN technique. The newly introduced hybrid ANN-MGGP is the only model showing at least satisfactory performance (i.e. Nash–Sutcliffe Efficiency above 0.5) over the full range of 38 watersheds explored, making it a useful regional tool for reconstructing monthly runoff series, including at ungauged sites.
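The "satisfactory" criterion quoted above can be illustrated with a minimal sketch of the Nash–Sutcliffe Efficiency in its standard form (one minus the ratio of the squared simulation error to the variance of the observations about their mean). The function names and the sample values in the usage note are hypothetical and are not taken from the study; only the 0.5 threshold comes from the abstract.

```python
from typing import Sequence


def nash_sutcliffe_efficiency(observed: Sequence[float],
                              simulated: Sequence[float]) -> float:
    """Standard NSE: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Squared error between observed and simulated runoff.
    sq_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Spread of the observations around their own mean.
    sq_spread = sum((o - mean_obs) ** 2 for o in observed)
    if sq_spread == 0:
        raise ValueError("NSE is undefined for a constant observed series")
    return 1.0 - sq_error / sq_spread


def is_satisfactory(nse: float, threshold: float = 0.5) -> bool:
    """Threshold from the abstract: NSE above 0.5 counts as satisfactory."""
    return nse > threshold
```

With this definition, a perfect simulation gives an NSE of 1, while a simulation that always returns the observed mean gives 0, so the 0.5 threshold sits halfway between a mean-only predictor and a perfect fit.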
Abbreviations
- AdB: Basin authority of Sicilian Region (Autorità di Bacino della Regione Sicilia)
- ANN: Artificial neural network
- CN: Curve number
- DEM: Digital elevation model
- GA: Genetic algorithm
- GEP: Gene expression programming
- GIS: Geographic information system
- GP: Genetic programming
- HL: Hidden layer
- LGP: Linear genetic programming
- PD_RMSE: Percent difference in RMSE for soft computing models with respect to the benchmark model
- NSE: Nash–Sutcliffe efficiency
- MGGP: Multi-gene genetic programming
- OL: Output layer
- QGIS: Quantum GIS
- SCS: Soil conservation service
- SM: Supplementary material
- SMA: Simple moving average
- sub-ET: Sub-expression trees
- RMSE: Root mean squared error
- RMSE_BM: Root mean squared error for the benchmark model
- RMSE_SC: Root mean squared error for soft computing model
- Tri.Mo.Ti.S.: Trinacria model for monthly time series
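The RMSE-based metrics in the list above can be sketched as follows. RMSE is computed in its standard form; the percent difference is written here as the change in RMSE relative to the benchmark, so negative values mean the soft computing model has a lower error. That sign convention, like the function names and sample values, is an assumption for illustration and is not confirmed by the text above.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Root mean squared error between two equally long series."""
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")
    mean_sq = sum((o - s) ** 2 for o, s in zip(observed, simulated)) / len(observed)
    return math.sqrt(mean_sq)


def percent_difference_rmse(rmse_sc: float, rmse_bm: float) -> float:
    """Percent difference of a soft computing RMSE relative to the benchmark RMSE.

    Assumed convention: 100 * (RMSE_SC - RMSE_BM) / RMSE_BM, so a negative
    result indicates an improvement over the benchmark.
    """
    if rmse_bm == 0:
        raise ValueError("benchmark RMSE must be non-zero")
    return 100.0 * (rmse_sc - rmse_bm) / rmse_bm
```

For example, under this assumed convention a soft computing RMSE of 8.0 against a benchmark RMSE of 10.0 corresponds to a percent difference of -20.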
Acknowledgements
The authors thank the anonymous reviewers for their helpful suggestions, which improved the quality of the present paper.
Funding
The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.
Author information
Contributions
DP is the first and corresponding author. Both authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by DP. The first draft of the manuscript was written by DP and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
Data availability
Data used in this article are available, on request, from the website of the Basin Authority of Sicilian Region (Autorità di Bacino della Regione Sicilia) through the following link: https://www.regione.sicilia.it/istituzioni/regione/strutture-regionali/presidenza-regione/autorita-bacino-distretto-idrografico-sicilia. Data can also be freely viewed on the website of ISPRA (Istituto Superiore per la Protezione e la Ricerca Ambientale, https://www.isprambiente.gov.it).
About this article
Cite this article
Pumo, D., Noto, L.V. Exploring the use of multi-gene genetic programming in regional models for the simulation of monthly river runoff series. Stoch Environ Res Risk Assess 37, 1917–1941 (2023). https://doi.org/10.1007/s00477-022-02373-1
DOI: https://doi.org/10.1007/s00477-022-02373-1