Can artificial intelligence and data-driven machine learning models match or even replace process-driven hydrologic models for streamflow simulation?: A case study of four watersheds with different hydro-climatic regions across the CONUS

doi:10.1016/j.jhydrol.2021.126423

Journal of Hydrology

Volume 598, July 2021, 126423

Highlights

• DMLs have great potential to be interchangeable with PHMs when given proper input scenarios.
• DMLs showed consistently better performance than PHMs in the high-flow regime.
• However, no single model can consistently prevail in a particular flow regime.
• Both DMLs and PHMs have pros and cons in capturing rainfall-runoff relationships.

Abstract

With recent developments in computational techniques, Data-driven Machine Learning Models (DMLs) have shown great potential in simulating streamflow and capturing the rainfall-runoff relationship in given watersheds, tasks that are traditionally fulfilled by Process-based Hydrologic Models (PHMs). There are debates on whether the DMLs can outperform and possibly replace the classical PHMs for streamflow simulation and river forecasting, but no clear conclusions have been reached. This study aims to investigate whether the newer DMLs have any potential to further improve the simulation accuracy of classical PHMs, and vice versa. To do this, we compared a few popular PHMs and DMLs over four watersheds across the Continental US (CONUS) that are associated with different input, climate, and regional conditions. A total of five hydrologic models were chosen, including (1) two classical lumped models, i.e., the Sacramento Soil Moisture Accounting (SAC-SMA) and Xinanjiang (XAJ) models; (2) one modern distributed model, termed Coupled Routing and Excess Storage (CREST); and (3) two DMLs, namely an Artificial Neural Network (ANN) and a deep learning model, termed Long Short Term Memory (LSTM). Our results demonstrated that the DMLs were still significantly biased when using the same baseline input scenario as the PHMs. However, the DMLs fed with delayed input scenarios had great potential and could reach high simulation accuracy. The DMLs, especially the ANN, outperformed the other employed models where the rainfall-runoff relationship is dominantly driven by rainfall. The DMLs also showed better performance in the high-flow regime, while the PHMs performed better in the low-flow regime, implying that both PHMs and DMLs have their own merits and are worthy of joint development.
In general, our study indicated a great potential of using DMLs to simulate streamflow, but further studies are still needed to verify the transferability and scalability of DMLs in large-scale experiments, similar to the Distributed Model Intercomparison Projects 1&2 conducted by the National Weather Service but designed to compare modern DMLs and PHMs.

Introduction

Hydrologic models are powerful tools for forecasting potential floods or droughts and managing surface and subsurface water resources. From the 1850s to the present, hydrologic models have developed rapidly alongside advances in numerical mathematics and the computer revolution (Chow, 1964, Singh, 2018, Singh and Woolhiser, 2002). The current development of hydrologic models is towards integrating the Earth system of climate and weather, global atmospheric circulation, and geospatial characteristics (Brown et al., 2014, Senatore et al., 2020, Sorooshian et al., 2008, Tian et al., 2020). Our understanding of the water cycle’s physical processes has greatly improved, and many new types of data have also become available for use in hydrologic models. Along with these changes, hydrologic models have evolved from simple and conceptual models to various process-based and more complicated models. These models can be grouped into Process-based Hydrologic Models (PHMs) and Data-driven Machine Learning Models (DMLs). The former originated from the classical bucket model, and they are also called conceptual, mechanistic, or process-driven models (Islam, 2011). The latter were derived from traditional statistical analysis relying on mathematical regression, and their latest developments are towards using advanced computational intelligence (Clark et al., 2015, Grayson et al., 1992, Koutsoyiannis, 2003, Machiwal and Jha, 2012, Salas, 1980, Solomatine et al., 2008). The DMLs have begun to permeate the hydrologic community with rapid growth, leveraged by the blossoming of different machine learning algorithms for classification and regression problems (Aytek et al., 2008, Liu and Xu, 2017, Mosavi et al., 2018, Rasouli et al., 2012, Wang et al., 2009).
However, it is still unknown whether the DMLs will gradually prevail over the PHMs or vice versa, and how the traditional hydrologic models will further develop, given the distinct philosophies used in those two model groups. It is important to carry out a detailed evaluation of these two types of models and to identify which models are more reliable and powerful than others, and under what conditions, rather than simply developing new models or applying them separately in different study domains. Therefore, we set up a few benchmark hydrological simulation cases and comprehensively compared the pros and cons of a few popular PHMs and DMLs. Our research goals are to identify under what conditions the PHMs or the DMLs can generate the most accurate streamflow simulation and to investigate whether the newer DMLs have any potential to further improve the simulation accuracy of classical PHMs, and vice versa. The remainder of this introduction reviews the PHMs and DMLs employed in this study.

The PHMs can be categorized into three types: lumped models, semi-distributed models, and fully distributed models. In the present study, we used two lumped models (Sacramento Soil Moisture Accounting; SAC-SMA and XinAnJiang; XAJ) and one distributed model (Coupled Routing and Excess STorage; CREST). We include both lumped and distributed models because prior studies, e.g., the two phases of the Distributed Model Intercomparison Project (DMIP 1&2) conducted by the National Weather Service, found that they sometimes outperform one another in simulating streamflow and that there were no major differences in their performances across many study cases (Grayson et al., 1992, Khakbaz et al., 2012, Koren et al., 2004). The SAC-SMA model has been well-recognized and widely used in both operational agencies and research communities as a key component of the National Weather Service River Forecast System (NWSRFS) for rainfall-runoff modeling (Behrangi et al., 2011, Boyle et al., 2001, Chu et al., 2010, Sorooshian et al., 1993). Numerous studies have applied the SAC-SMA model for flood forecasting, soil properties studies, and baseflow generation around the world (Abdulla et al., 1999, Ajami et al., 2004, Buchtele et al., 1996, Hogue et al., 2006, Hogue et al., 2000, Moreda et al., 2006). Another lumped model in this study is the XAJ model. The XAJ model has been widely used for rainfall-runoff simulation, flood forecasting, and water resources planning and management in large-scale humid and semi-humid regions, because it requires less forcing data to execute than other lumped hydrologic models (Lu and Li, 2014, Ren-Jun, 1992, Xu et al., 2013, Zeng et al., 2018, Zhijia et al., 2013). Several studies have used these two lumped models simultaneously to forecast streamflow and achieved satisfactory simulation results under different climate conditions (Hao et al., 2018, Huang et al., 2016, Huo et al., 2019).
Other studies, however, argued that these lumped models were limited in simulating nonlinear processes, and that they only performed well in semi-humid and humid regions, with low transferability to other climate regions (Hu et al., 2005, Huang et al., 2016, Yao et al., 2009). Nevertheless, the lumped hydrologic models remain prevalent in the water community because of their proven effectiveness and robustness through the years. The CREST model is a distributed hydrologic model. The key concept of distributed hydrologic models is to divide the watershed into smaller hydrologic response units and to build an individual, smaller lumped model for each hydrologic unit. All hydrologic response units are connected by mass-balance equations, and the water is routed through all meshed units and then becomes the total watershed discharge. In the CREST model, four excess storage reservoirs represent the interception by the vegetation canopy and subsurface water storage in one underlying soil layer (Wang et al., 2011). We chose the CREST model because it was adopted as an operational tool across the US NWS for flash flood forecasting by local NWS Forecast Offices in the Flooded Locations and Simulated Hydrographs Project (Gourley et al., 2017). With the convenience of coupling remotely sensed data with distributed hydrologic models, the number of applications of the CREST model has been increasing in recent years (Kan et al., 2017, Li et al., 2018, Shen et al., 2017).

Similar to the PHMs, the DMLs have also been widely used to solve various classification and regression problems in hydrologic sciences. The DMLs can identify the statistical relationship between input and output data without explicitly requiring users to know the physical processes (Reichstein et al., 2019, Solomatine and Ostfeld, 2008). In some recent studies, researchers have used the DMLs to simulate complicated hydrologic processes, and some studies have achieved better performances than traditional PHMs (Chaney et al., 2018, Chaney et al., 2016, Ham et al., 2019, Kim et al., 2019, Shen, 2018, Zhao et al., 2019). In the present study, two popular machine learning algorithms, namely the Artificial Neural Network (ANN) and Long Short Term Memory (LSTM), are implemented to simulate the watershed discharge and compared with the PHMs (i.e., the SAC-SMA, XAJ, and CREST models). The ANN model has shown superior performance in hydrologic simulation under complex geophysical processes and has grown in popularity in the literature (Kim et al., 2020, Yang et al., 2017b). It can link the climate information without requiring explicit interpretation, which the PHMs could not achieve using the micro-scale mass-balance governing equations (Abbot and Marohasy, 2012, Aksoy and Dahamsheh, 2009, Azadi and Sepaskhah, 2012, Dahamsheh and Aksoy, 2009, French et al., 1992, Hung et al., 2009, Kim et al., 2019). Beyond the ANN model, the Recurrent Neural Network (RNN) is another class of ANN, where connections between nodes are designed to exhibit temporal dynamic behavior by processing the inputs in their sequential order (Kratzert et al., 2018, Rumelhart et al., 1986, Rumelhart et al., 1994). Unlike the traditional ANN model, the RNN model has a memory that retains some information about a sequence, which can greatly improve the predictive performance if the inputs and outputs are correlated.
Most recently, the LSTM model, a class of the RNN model, has gained considerable popularity in hydrological sciences, and has been successfully applied to solve hydrological and environmental forecasting problems (Akbari Asanjan et al., 2018, Kumar et al., 2019, Srivastava and Lessmann, 2018).
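
To illustrate what "memory" means for a recurrent model, the following is a minimal sketch of one step of the standard LSTM cell, written with scalar inputs and states for readability. It is not the configuration used in this study: the function names, the dictionary layout of the weights, and all weight values are hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, Sequence, Tuple

Gate = Tuple[float, float, float]  # (input weight, recurrent weight, bias)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x: float, h_prev: float, c_prev: float,
              w: Dict[str, Gate]) -> Tuple[float, float]:
    """One scalar LSTM step; returns the new hidden state h and cell state c."""
    def pre(gate: str) -> float:
        wx, wh, b = w[gate]
        return wx * x + wh * h_prev + b

    f = sigmoid(pre("forget"))       # how much of the old cell state to keep
    i = sigmoid(pre("input"))        # how much of the new candidate to store
    o = sigmoid(pre("output"))       # how much of the cell state to expose
    g = math.tanh(pre("candidate"))  # candidate value derived from the input
    c = f * c_prev + i * g           # cell state carries information forward
    h = o * math.tanh(c)
    return h, c


def run_sequence(inputs: Sequence[float], w: Dict[str, Gate]) -> Tuple[float, float]:
    """Process inputs in sequential order, starting from zero states."""
    h, c = 0.0, 0.0
    for x in inputs:
        h, c = lstm_step(x, h, c, w)
    return h, c


# Hypothetical weights: a forget gate biased towards "keep".
demo_weights = {
    "forget": (0.0, 0.0, 2.0),
    "input": (0.0, 0.0, 0.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 0.0, 0.0),
}
# A single rainfall pulse followed by two dry days.
h_final, c_final = run_sequence([3.0, 0.0, 0.0], demo_weights)
```

With these illustrative weights, the cell state remains positive two steps after the input pulse, which is the kind of carry-over that a feed-forward ANN without lagged inputs cannot represent.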

The present study also builds on many existing studies related to streamflow simulation by comparing hydrologic model performance. Previous studies have preliminarily compared the model simulation performance between PHMs and DMLs (Daliakopoulos and Tsanis, 2016, Hsu et al., 1995, Ju et al., 2009, Rauf and Ghumman, 2018, Rezaeianzadeh et al., 2013, Roodsari et al., 2019, Srivastava et al., 2006, Tokar and Markus, 2000, Wang et al., 2017). One common agreement in these studies is that the performance of each model, whether a PHM or a DML, may differ depending on the hydrological and climate conditions (such as climate, weather, terrain, and soil), data availability, and simulation objectives in the study regions. Besides, there are many discussions among hydrologists in which concerns are raised about the lack of physical constraints and formulation of the water routing dynamics in all DMLs, though there are cases in which DMLs outperform the PHMs using a flexible set of information as inputs (Kratzert et al., 2019, Sellars, 2018). Following this discussion, the experiments designed in this study address (1) whether the DMLs and PHMs can supplement each other under a particular simulation condition, such as a season or flow regime, in our study cases, and (2) how water managers could acquire a deeper understanding of the pros and cons of both PHMs and DMLs and develop more accurate streamflow simulation by taking advantage of both types of models. To better understand the performance of the PHMs relative to the DMLs in simulating streamflow, this study compares a few popular PHMs and DMLs over four watersheds across the Continental US (CONUS) that are associated with different climate and regional conditions. Based on the simulation results, we further examine how the model performance varies across seasons and flow-regime magnitudes, and explore whether the PHMs and DMLs could be interchangeable under certain conditions.

The rest of this paper is organized as follows. Section 2 presents the methodologies applied in this study, including the three PHMs and two DMLs. Section 3 provides detailed information on study basins and datasets. Section 4 summarizes daily streamflow simulation results at four watersheds in the CONUS. Section 5 provides the result analysis and discussion, and Section 6 summarizes our conclusions and recommendations for the development of PHMs and DMLs.

Section snippets

Methodology

Fig. 1 shows the conceptual comparison between PHMs (left side) and DMLs (right side). Both PHMs and DMLs follow a general hydrologic modeling framework, although there are similarities and differences at each stage of the process. First, both PHMs and DMLs take the same forcing data as inputs and use historical observations to improve the model performance. This process is called calibration in the PHMs, while in the DMLs it is termed model training. The PHMs are based on mass balance equations

Target basin

In this study, we selected four signature watersheds with different hydroclimatic conditions to compare the employed PHMs and DMLs. Table 4 presents the basic information of the four selected watersheds, and Fig. 7 shows the locations of the four watersheds over the CONUS. More details are described as follows.

The Bushkill watershed is located in the eastern part of Pennsylvania. The water from this watershed generally flows southeast directly into the Delaware River, and the Blue Mountains

Input scenarios

In this study, we generated three input scenarios to examine the sensitivities of each model. First, mean precipitation and mean PET at each grid are used as default inputs to drive both the PHMs and DMLs (the first input scenario; S1). Further, the second input scenario (S2) and the third input scenario (S3) were generated by adding delayed precipitation/PET to drive the DMLs only. Adding delayed forcing data could guide the DMLs to capture how water is being delayed and routed through
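
As a rough illustration of what adding delayed forcing data can look like in practice, the sketch below builds lagged precipitation and PET features for a DML. The number of delayed days used in S2 and S3 is not given in this snippet, so `max_lag` and the sample values are hypothetical, as is the function name.

```python
from typing import List, Sequence


def add_delayed_inputs(precip: Sequence[float], pet: Sequence[float],
                       max_lag: int) -> List[List[float]]:
    """Build one feature row per day:
    [P(t), PET(t), P(t-1), PET(t-1), ..., P(t-max_lag), PET(t-max_lag)].
    The first max_lag days are dropped because their lagged values are unavailable.
    """
    if len(precip) != len(pet):
        raise ValueError("precip and pet must have the same length")
    if max_lag < 0:
        raise ValueError("max_lag must be non-negative")
    rows: List[List[float]] = []
    for t in range(max_lag, len(precip)):
        row: List[float] = []
        for lag in range(max_lag + 1):
            row.append(precip[t - lag])
            row.append(pet[t - lag])
        rows.append(row)
    return rows


# Hypothetical daily forcing values (e.g., in mm/day).
precip = [0.0, 5.0, 12.0, 3.0, 0.0]
pet = [2.0, 1.5, 1.0, 1.8, 2.2]

baseline = add_delayed_inputs(precip, pet, max_lag=0)  # same-day inputs only, as in S1
delayed = add_delayed_inputs(precip, pet, max_lag=1)   # adds the previous day (illustrative lag)
```

Each additional lag shortens the usable record by one day, so the target streamflow series would need to be trimmed by the same `max_lag` days before training.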

Results

Tables 6 to 9 show the simulation performance of the five employed models (SAC-SMA, CREST, XAJ, ANN, and LSTM) using k-fold CV in the four selected watersheds. Each statistical measure is the average value over three folds (k = 3) for the calibration and validation sets. There is a total of nine cases since each PHM (SAC-SMA, XAJ, and CREST) has one result from the first input scenario (S1), and each DML (ANN and LSTM) has three results from the three
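
The fold-averaging described above can be sketched as follows. This is only an assumed setup: the folds are taken as contiguous blocks of the record, RMSE stands in for the study's statistics (which are not named in this snippet), and a mean-flow predictor is a placeholder for the actual PHMs and DMLs.

```python
import math
from typing import List, Sequence, Tuple


def contiguous_folds(n: int, k: int) -> List[Tuple[int, int]]:
    """Split indices 0..n-1 into k contiguous [start, stop) blocks of near-equal size."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        stop = start + base + (1 if i < extra else 0)
        folds.append((start, stop))
        start = stop
    return folds


def rmse(obs: Sequence[float], sim: Sequence[float]) -> float:
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def cross_validate(flow: Sequence[float], k: int = 3) -> Tuple[float, float]:
    """Return (mean calibration RMSE, mean validation RMSE) over k folds."""
    flow = list(flow)
    cal_scores, val_scores = [], []
    for start, stop in contiguous_folds(len(flow), k):
        val = flow[start:stop]              # held-out block
        cal = flow[:start] + flow[stop:]    # remaining record used for fitting
        pred = sum(cal) / len(cal)          # placeholder "model": mean calibration flow
        cal_scores.append(rmse(cal, [pred] * len(cal)))
        val_scores.append(rmse(val, [pred] * len(val)))
    return sum(cal_scores) / k, sum(val_scores) / k


# Hypothetical daily streamflow record.
observed = [1.0, 2.0, 8.0, 4.0, 2.0, 1.0, 6.0, 3.0, 1.0]
mean_cal_rmse, mean_val_rmse = cross_validate(observed, k=3)
```

Reporting the mean over folds for both the calibration and validation sets, as done here, gives one pair of numbers per model and input scenario.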

Discussion

This study employed three PHMs and two DMLs to cross-evaluate their capabilities in simulating daily streamflow for four signature watersheds with different climate and geomorphological conditions. We specifically investigated the general statistics of streamflow simulation in 3-fold CV, the model performance over different seasons, and the simulation accuracy over high-, medium- and low-flow regimes. For all employed models, mean precipitation and mean PET at each grid were used as default
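
One way to evaluate accuracy separately for high-, medium-, and low-flow regimes is to label each day by where its observed flow falls in the ranked record and then compute an error statistic per label. The sketch below assumes quantile thresholds of 0.3 and 0.7 and uses the mean error; both choices are hypothetical, since the regime definitions and statistics used in the study are not given in this snippet.

```python
from typing import Dict, List, Sequence


def classify_flow_regimes(obs: Sequence[float], low_q: float = 0.3,
                          high_q: float = 0.7) -> List[str]:
    """Label each observation 'low', 'medium', or 'high' using thresholds
    picked from the sorted record at the given quantile positions."""
    ranked = sorted(obs)
    n = len(ranked)
    low_thr = ranked[int(low_q * (n - 1))]
    high_thr = ranked[int(high_q * (n - 1))]
    labels = []
    for q in obs:
        if q <= low_thr:
            labels.append("low")
        elif q >= high_thr:
            labels.append("high")
        else:
            labels.append("medium")
    return labels


def mean_error_by_regime(obs: Sequence[float], sim: Sequence[float],
                         labels: Sequence[str]) -> Dict[str, float]:
    """Mean of (simulated - observed) within each regime that has data."""
    result: Dict[str, float] = {}
    for regime in ("low", "medium", "high"):
        diffs = [s - o for o, s, r in zip(obs, sim, labels) if r == regime]
        if diffs:
            result[regime] = sum(diffs) / len(diffs)
    return result


# Hypothetical observed and simulated flows.
obs_flow = [float(v) for v in range(1, 11)]
sim_flow = [o + 1.0 for o in obs_flow]
regimes = classify_flow_regimes(obs_flow)
errors = mean_error_by_regime(obs_flow, sim_flow, regimes)
```

Computing the statistic per regime, rather than over the whole record, is what allows a statement such as "better in the high-flow regime, worse in the low-flow regime" to be made for a given model.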

Conclusion

In this study, we assessed the streamflow simulation capability of PHMs and DMLs over four selected watersheds in the CONUS. To do this, three PHMs (SAC-SMA, XAJ, and CREST) and two DMLs (ANN and LSTM) were tested under different input scenarios. We first evaluated the overall simulated results against observed streamflow using four statistics, and investigated whether PHMs and DMLs have varying performances over different seasons and flow regimes. Our findings provided an in-depth

Data availability

Daily PRISM precipitation, daily FEWS NET PET, and daily streamflow data used in this study are respectively available on the PRISM Climate Group (http://prism.oregonstate.edu), the USGS FEWS NET Data Portal (https://earlywarning.usgs.gov/fews), and the USGS Surface-Water Data for USA (http://waterdata.usgs.gov/usa/nwis/sw).

The original SAC-SMA model code is written in Fortran and is publicly accessible. This study used the MATLAB version translated from the Fortran code with the same version

CRediT authorship contribution statement

Taereem Kim: Software, Formal analysis, Investigation, Writing - original draft, Methodology, Visualization, Supervision. Tiantian Yang: Conceptualization, Formal analysis, Investigation, Writing - review & editing, Project administration, Funding acquisition. Shang Gao: Software, Visualization, Methodology, Resources, Data curation. Lujun Zhang: Software, Methodology, Resources. Ziyu Ding: Software, Methodology, Resources. Xin Wen: Funding acquisition. Jonathan J. Gourley: Writing - review &

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work is partially supported by the U.S. Department of Energy (DOE Prime Award # DE-IA0000018). The material is based upon work supported by the National Science Foundation under Grant No. OIA-1946093 and its subaward No. EPSCoR-2020-3, and the National Science Foundation under Grant No. NSF1802872. The financial support is also made available by the National Key R&D Program of China under Grant No. 2018YFC0407902.

References (144)

F. Abdulla et al., Estimation of the ARNO model baseflow parameters using daily streamflow data, J. Hydrol. (1999)
O.M. Baez-Villanueva et al., Temporal and spatial evaluation of satellite rainfall estimates over different regions in Latin-America, Atmos. Res. (2018)
A. Behrangi et al., Hydrologic evaluation of satellite precipitation products over a mid-size basin, J. Hydrol. (2011)
C. Bergmeir et al., On the use of cross-validation for time series predictor evaluation, Inf. Sci. (2012)
M.E. Brown et al., An integrated modeling system for estimating glacier and snow melt driven streamflow from remote sensing and earth system data products in the Himalayas, J. Hydrol. (2014)
N.W. Chaney et al., POLARIS: A 30-meter probabilistic soil series map of the contiguous United States, Geoderma (2016)
W. Chu et al., A new evolutionary search strategy for global optimization of high-dimensional problems, Inf. Sci. (2011)
M.N. French et al., Rainfall forecasting in space and time using a neural network, J. Hydrol. (1992)
C. Furl et al., Hydrometeorology of the catastrophic Blanco river flood in South Texas, May 2015, J. Hydrol. Reg. Stud. (2018)
Y. Gan et al., A comprehensive evaluation of various sensitivity analysis methods: a case study with a hydrological model, Environ. Modell. Softw. (2014)
Akbari Asanjan, A., Yang, T., Hsu, K., Sorooshian, S., Lin, J. and Peng, Q. 2018. Short-Term Precipitation Forecast...
H. Aksoy et al., Artificial neural network models for forecasting monthly precipitation in Jordan, Stoch. Env. Res. Risk Assess. (2009)
R.G. Allen et al., Crop evapotranspiration-Guidelines for computing crop water requirements-FAO Irrigation and drainage paper 56, FAO, Rome (1998)
ASCE, Artificial neural networks in hydrology. I: preliminary concepts, J. Hydrol. Eng. (2000)
M. Ashfaq et al., High-resolution ensemble projections of near-term regional climate over the continental United States, J. Geophys. Res. Atmos. (2016)
W.H. Asquith et al., Documented and potential extreme peak discharges and relation between potential extreme peak discharges and probable maximum flood peak discharges in Texas (1995)
A. Aytek et al., An application of artificial intelligence for rainfall-runoff modeling, J. Earth Syst. Sci. (2008)
S. Azadi et al., Annual precipitation forecast for west, southwest, and south provinces of Iran using artificial neural networks, Theor. Appl. Climatol. (2012)
D.P. Boyle et al., Toward improved streamflow forecasts: value of semidistributed modeling, Water Resour. Res. (2001)
Brandes, D. (2001) Urban Drainage Modeling, pp....
Brazil, L. (1989) Multilevel calibration strategy for complex hydrologic simulation models, US Department of Commerce,...
J. Buchtele et al., Runoff components simulated by rainfall-runoff models, Hydrol. Sci. J. (1996)
Burnash, R.J., Ferral, R.L. and McGuire, R.A. (1973) A generalized streamflow simulation system: Conceptual modeling...
Burnash, R. (1995) The NWS River Forecast System-Catchment Modeling. In: Singh, V., Ed., Computer Models of Watershed...
G.C. Cawley et al., On over-fitting in model selection and subsequent selection bias in performance evaluation, J. Mach. Learn. Res. (2010)
N.W. Chaney et al., Harnessing big data to rethink land heterogeneity in Earth system models, Hydrol. Earth Syst. Sci. (2018)
S. Chen et al., Neural networks for nonlinear dynamic system modelling and identification
Comparison of an artificial neural network and a conceptual rainfall–runoff model in the simulation of ephemeral streamflow, Hydrol. Sci. J. (2016)
C. Daly et al., The PRISM Climate and Weather System—an Introduction (2013)

Cited by (69)

Improving the simulations of the hydrological model in the karst catchment by integrating the conceptual model with machine learning models
2024, Science of the Total Environment
Hydrological modelling can be complex in nonhomogeneous catchments with diverse geological, climatic, and topographic conditions. In this study, an integrated conceptual model including the snow module with machine learning modelling approaches was implemented for daily rainfall-runoff modelling in mostly karst Ljubljanica catchment, Slovenia, which has heterogeneous characteristics and is potentially exposed to extreme events that make the modelling process more challenging and crucial. In this regard, the conceptual model CemaNeige Génie Rural à 6 paramètres Journalier (CemaNeige GR6J) was combined with machine learning models, namely wavelet-based support vector regression (WSVR) and wavelet-based multivariate adaptive regression spline (WMARS) to enhance modelling performance. In this study, the performance of the models was comprehensively investigated, considering their ability to forecast daily extreme runoff. Although CemaNeige GR6J yielded a very good performance, it overestimated low flows. The WSVR and WMARS models yielded poorer performance than the conceptual and hybrid models. The hybrid model approach improved the performance of the machine learning models and the conceptual model by revealing the linkage between variables and runoff in the conceptual model, which provided more accurate results for extreme flows. Accordingly, the hybrid models improved the forecasting performance of the maximum flows up to 40 % and 61 %, and minimum flows up to 73 % and 72 % compared to the CemaNeige GR6J and stand-alone machine learning models. In this regard, the hybrid model approach can enhance the daily rainfall-runoff modelling performance in nonhomogeneous and karst catchments where the hydrological process can be more complicated.
Comparing conceptual and super ensemble deep learning models for streamflow simulation in data-scarce catchments
2024, Journal of Hydrology: Regional Studies
Sore and Masha river catchments in Baro Akobo river basin: Ethiopia.
This research addresses the challenges associated with conventional data-driven streamflow modelling, which often exhibits inconsistent performance across different variability states. To bridge this gap, we explore the efficiency of model ensembles, a popular hydrological approach that harnesses the strengths of multiple models while preserving the fundamental characteristics of the data. Specifically, we compare three modified super ensemble meta-learners with eight single and hybrid machine learning base learners. The study also incorporates a semi-distributed HBV-light conceptual hydrological model and utilizes a decade of remote sensing and ground hydro-meteorological daily time series data. Different remote sensing-based vegetation index data products are employed to simulate single-step daily streamflow. The Recursive Feature Elimination (RFE) algorithm is applied to extract influential input parameters.
Our findings consistently demonstrate that the three super ensemble learners outperform both the eight base models and the HBV-light model. The top-ranked Extra Tree Regression Super Ensemble (ETRSE) model exhibits superior performance, surpassing the HBV-light model by 24% according to the R² performance measure. This study highlights the positive impact of selecting influential input parameters on the overall performance of machine learning models, providing valuable insights for hydrological modelling in the region.
Coupling machine learning and physical modelling for predicting runoff at catchment scale
2024, Journal of Environmental Management
In this paper, we present an approach that combines data-driven and physical modelling for predicting the runoff occurrence and volume at catchment scale. With that aim, we first estimated the runoff volume from recorded storms aided by the Green-Ampt infiltration model. Then, we used machine learning algorithms, namely LightGBM (LGBM) and Deep Neural Network (DNN), to predict the outputs of the physical model fed on a set of atmospheric variables (relative humidity, temperature, atmospheric pressure, and wind velocity) collected before or immediately after the beginning of the storm. Results for a small urban catchment in Madrid show DNN performed better in predicting the runoff occurrence and volume. Moreover, enriching the input primary atmospheric variables with auxiliary variables (e.g., storm intensity data recorded during the first hour, or rain volume and intensity estimates obtained from auxiliary regression methods) largely increased the model performance. We show in this manuscript data-driven algorithms shaped by physical criteria can be successfully generated by allowing the data-driven algorithm learn from the output of physical models. It represents a novel approach for physics-informed data-driven algorithms shifting from common practices in hydrological modelling through machine learning.
Adapting subseasonal-to-seasonal (S2S) precipitation forecast at watersheds for hydrologic ensemble streamflow forecasting with a machine learning-based post-processing approach
2024, Journal of Hydrology
Accurate and reliable precipitation predictions made by dynamical forecast models could provide crucial information for human socioeconomic activities by enabling hydrologic forecasts at the Subseasonal-to-Seasonal (S2S) timescale. To utilize available S2S precipitation predictions for hydrologic forecasts, post-processing techniques have been applied to adapt the raw S2S precipitation to local watersheds. However, conventional statistical-based post-processing techniques are more focused on correcting the forecast bias, but rather limited in improving the predictive skill of available S2S precipitation forecasts. In this study, we combine the Random Forest classifiers (RF) with the Bias Correction and Spatial Disaggregation (BCSD) to adapt the 10-member ensemble precipitation forecast from the NASA Goddard Earth Observing System model version 5 (GEOS5) at 4 watersheds located in the NCEI South climate region of the United States. The adapted S2S precipitation is further applied for streamflow forecast by forcing a classical lumped hydrologic model. The performance of S2S precipitation as well as the corresponding streamflow predictions are benchmarked with the randomly resampled precipitation and the corresponding Ensemble Streamflow Prediction (ESP) framework-generated streamflow predictions. Evaluation statistics of Kling-Gupta Efficiency (KGE), Continuous Ranked Probability Skill Score (CRPSS), Reliability, Resolution, and Sharpness are employed to evaluate the predictive skill of precipitation and streamflow both deterministically and probabilistically. Our results indicate that dynamical S2S precipitation after forecast adaptation leads to consistently higher deterministic skill over ESP at all forecast lead times and across study watersheds. However, at longer forecast lead times beyond 10–15 days, S2S precipitation with a limited ensemble size does not present higher probabilistic skill than ESP. 
Our results shows that the joint application of RF and BCSD improves the predictive skill of the raw S2S precipitation at study watersheds in contrast to BCSD. Further, the added predictive skill of S2S precipitation brought by RF propagates into streamflow predictions, predominantly at longer forecast lead times exceeding 10 days. Overall, our results highlight the potential success of future work to apply other data-driven approaches to adapt the raw precipitation to local watersheds for more accurate and reliable streamflow forecasts at the S2S timescale.
A conceptual metaheuristic-based framework for improving runoff time series simulation in glacierized catchments
2024, Engineering Applications of Artificial Intelligence
Glacio-hydrological modeling is a key task for assessing the influence of snow and glaciers on water resources, essential for water resources management. The present study aims to enhance a conceptual hydrological model (namely Glacial Snow Melt (GSM)) by data-driven and swarm computing for enhancing the accuracy of rainfall runoff prediction. The proposed framework combines the conceptual hydrological model (i.e. GSM) with the time series predictor model (SVR) and optimization-driven parameter tuning of the firefly algorithm (SVR-FFA). This integration uniquely captures the complex interplay between meteorological variables, glacier processes, and hydrological responses. Applying the hybrid framework proved better results than the standalone GSM and ordinary SVR in simulating runoff time series. The performance of the proposed conceptual integrated metaheuristic-based framework (W-SG-SVR-FFA) demonstrated several enhancements over the standalone GSM model. During the calibration (validation) period, the evaluation metric coefficient of determination (R²) was 0.77 (0.77) for the standalone GSM model and 0.98 (0.91) for the W-SG-SVR-FFA model. The Kling-Gupta Efficiency (KGE) values were 0.81 (0.77) and 0.97 (0.87), respectively. Applying the method in glacierized catchments underscores its importance in areas undergoing swift climate change and glacial melting. This approach enables readers to witness the intricate equilibrium between the model's complexity and the accuracy of simulation outcomes.
Applications of artificial intelligence technologies in water environments: From basic techniques to novel tiny machine learning systems
2023, Process Safety and Environmental Protection
Artificial intelligence (AI) and machine learning (ML) are novel techniques to detect hidden patterns in environmental data. Despite their capabilities, these novel technologies have not been seriously used for real-world problems, such as real-time environmental monitoring. This survey established a framework to advance the novel applications of AI and ML techniques such as Tiny Machine Learning (TinyML) in water environments. The survey covered deep learning models and their advantages over classical ML models. The deep learning algorithms are the heart of TinyML models and are of paramount importance for practical uses in water environments. This survey highlighted the capabilities and discussed the possible applications of the TinyML models in water environments. This study indicated that the TinyML models on microcontrollers are useful for a number of cutting-edge problems in water environments, especially for monitoring purposes. The TinyML models on microcontrollers allow for in situ real-time environmental monitoring without transferring data to the cloud. It is concluded that monitoring systems based on TinyML models offer cheap tools to autonomously track pollutants in water and can replace traditional monitoring methods.


