Research papers
Can artificial intelligence and data-driven machine learning models match or even replace process-driven hydrologic models for streamflow simulation?: A case study of four watersheds with different hydro-climatic regions across the CONUS

https://doi.org/10.1016/j.jhydrol.2021.126423

Highlights

  • With proper input scenarios, DMLs show great potential to be interchangeable with PHMs.

  • DMLs showed consistently better performance than PHMs in the high-flow regime.

  • However, one single model cannot consistently prevail in a particular flow regime.

  • Both DMLs and PHMs have pros and cons in capturing rainfall-runoff relationships.

Abstract

With recent developments in computational techniques, Data-driven Machine Learning Models (DMLs) have shown great potential in simulating streamflow and capturing the rainfall-runoff relationship in given watersheds, tasks traditionally fulfilled by Process-based Hydrologic Models (PHMs). Whether the DMLs can outperform and possibly replace the classical PHMs for streamflow simulation and river forecasting is still debated, and no clear conclusion has been reached. This study investigates whether the newer DMLs have any potential to further improve the simulation accuracy of classical PHMs, and vice versa. To do this, we compared a few popular PHMs and DMLs over four watersheds across the Continental US (CONUS) that are associated with different input, climate, and regional conditions. A total of five models were chosen: (1) two classical lumped models, i.e., the Sacramento Soil Moisture Accounting (SAC-SMA) and Xinanjiang (XAJ) models; (2) one modern distributed model, termed Coupled Routing and Excess Storage (CREST); and (3) two DMLs, namely an Artificial Neural Network (ANN) and a deep learning model, termed Long Short Term Memory (LSTM). Our results demonstrated that the DMLs were still significantly biased when driven by the same baseline input scenario as the PHMs. However, the DMLs fed with delayed input scenarios had great potential and reached high simulation accuracy. The DMLs, especially the ANN, outperformed the other employed models where the rainfall-runoff relationship is dominantly driven by rainfall. The DMLs also performed better in the high-flow regime, while the PHMs performed better in the low-flow regime, implying that both PHMs and DMLs have their own merits and are worthy of joint development. 
In general, our study indicated great potential for using DMLs to simulate streamflow, but further studies are still needed to verify the transferability and scalability of DMLs in large-scale experiments, similar to the Distributed Model Intercomparison Project phases 1 and 2 (DMIP 1&2) conducted by the National Weather Service but designed to compare modern DMLs and PHMs.

Introduction

Hydrologic models are powerful tools for forecasting potential floods or droughts and managing surface and subsurface water resources. From the beginning of the 1850s to the present, hydrologic models have developed rapidly alongside advances in numerical mathematics and the computer revolution (Chow, 1964, Singh, 2018, Singh and Woolhiser, 2002). Current development is moving towards integrating the Earth system of climate and weather, global atmospheric circulation, and geospatial characteristics (Brown et al., 2014, Senatore et al., 2020, Sorooshian et al., 2008, Tian et al., 2020). Our understanding of the water cycle’s physical processes has greatly improved, and many new types of data have also become available for use in hydrologic models. Along with these changes, hydrologic models have evolved from simple, conceptual models to various process-based and more complicated models. These models can be grouped into Process-based Hydrologic Models (PHMs) and Data-driven Machine Learning Models (DMLs). The former originated from the classical bucket model and are also called conceptual, mechanistic, or process-driven models (Islam, 2011). The latter were derived from traditional statistical analysis relying on mathematical regression, and their latest developments are moving towards advanced computational intelligence (Clark et al., 2015, Grayson et al., 1992, Koutsoyiannis, 2003, Machiwal and Jha, 2012, Salas, 1980, Solomatine et al., 2008). The DMLs have begun to permeate the hydrologic community with rapid growth, leveraged by the proliferation of machine learning algorithms for classification and regression problems (Aytek et al., 2008, Liu and Xu, 2017, Mosavi et al., 2018, Rasouli et al., 2012, Wang et al., 2009). 
However, it is still unknown whether the DMLs will gradually prevail over the PHMs or vice versa, and how the traditional hydrologic models will further develop, given the distinct philosophies behind these two model groups. It is important to carry out a detailed evaluation of these two types of models and to identify which models are more reliable and powerful under which conditions, rather than simply developing new models or applying them separately in different study domains. Therefore, we set up a few benchmark hydrological simulation cases and comprehensively compared the pros and cons of a few popular PHMs and DMLs. Our research goals are to identify under which conditions the PHMs or the DMLs generate the most accurate streamflow simulation and to investigate whether the newer DMLs have any potential to further improve the simulation accuracy of classical PHMs, and vice versa. The remainder of this introduction reviews the PHMs and DMLs employed in this study.

The PHMs can be categorized into three types: lumped models, semi-distributed models, and fully distributed models. In the present study, we used two lumped models (Sacramento Soil Moisture Accounting; SAC-SMA and XinAnJiang; XAJ) and one distributed model (Coupled Routing and Excess STorage; CREST). We include both lumped and distributed models because prior studies, namely the two phases of the Distributed Model Intercomparison Project (DMIP 1&2) conducted by the National Weather Service, found that each type sometimes outperforms the other in simulating streamflow and that there were no major differences in their performance across many study cases (Grayson et al., 1992, Khakbaz et al., 2012, Koren et al., 2004). The SAC-SMA model has been well recognized and widely used in both operational agencies and research communities as a key component of the National Weather Service River Forecast System (NWSRFS) for rainfall-runoff modeling (Behrangi et al., 2011, Boyle et al., 2001, Chu et al., 2010, Sorooshian et al., 1993). Numerous studies have applied the SAC-SMA model for flood forecasting, soil properties studies, and baseflow generation around the world (Abdulla et al., 1999, Ajami et al., 2004, Buchtele et al., 1996, Hogue et al., 2006, Hogue et al., 2000, Moreda et al., 2006). The other lumped model in this study is the XAJ model. It has been widely used for rainfall-runoff simulation, flood forecasting, and water resources planning and management in large-scale humid and semi-humid regions, because it requires less forcing data than other lumped hydrologic models (Lu and Li, 2014, Ren-Jun, 1992, Xu et al., 2013, Zeng et al., 2018, Zhijia et al., 2013). Several studies have used these two lumped models simultaneously to forecast streamflow and achieved satisfactory simulation results under different climate conditions (Hao et al., 2018, Huang et al., 2016, Huo et al., 2019). 
However, other studies argued that these lumped models were limited in simulating nonlinear processes and performed well only in semi-humid and humid regions, with low transferability to other climate regions (Hu et al., 2005, Huang et al., 2016, Yao et al., 2009). Nevertheless, lumped hydrologic models remain prevalent in the water community because of their proven effectiveness and robustness through the years. The CREST model is a distributed hydrologic model. The key concept of distributed hydrologic models is to divide the watershed into smaller hydrologic response units and to build an individual, smaller lumped model for each unit. All hydrologic response units are connected by mass-balance equations, and the water is routed through all meshed units to become the total watershed discharge. In the CREST model, four excess storage reservoirs represent the interception by the vegetation canopy and subsurface water storage in one underlying soil layer (Wang et al., 2011). We chose the CREST model because it was adopted as an operational tool across the US NWS for flash flood forecasting by local NWS Forecast Offices in the Flooded Locations and Simulated Hydrographs Project (Gourley et al., 2017). Given the convenience of coupling remotely sensed data with distributed hydrologic models, the number of applications of the CREST model has been increasing in recent years (Kan et al., 2017, Li et al., 2018, Shen et al., 2017).
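
To make the bucket concept behind lumped models (and behind the per-unit models of a distributed scheme) more concrete, the following is a minimal sketch of a single storage reservoir that conserves mass. It is a generic illustration only, not the SAC-SMA, XAJ, or CREST formulation; the function name run_bucket and the parameters capacity and recession are hypothetical and chosen for this example.

```python
def run_bucket(precip, pet, capacity=50.0, recession=0.1, storage=0.0):
    """Simulate one conceptual storage bucket at a daily step.

    capacity and recession are illustrative parameters (assumptions),
    not values taken from any of the models named in the text.
    """
    flows = []
    for p, e in zip(precip, pet):
        storage += p
        # Evaporation cannot remove more water than is currently stored.
        storage -= min(e, storage)
        # Water above the bucket capacity spills as quick runoff.
        spill = max(storage - capacity, 0.0)
        storage -= spill
        # A linear reservoir releases a fixed fraction of storage as slow flow.
        base = recession * storage
        storage -= base
        flows.append(spill + base)
    return flows, storage


# Hypothetical forcing in mm/day, with evaporation switched off for clarity.
flows, final_storage = run_bucket([10.0, 0.0, 30.0], [0.0, 0.0, 0.0])
```

With zero evaporation, the total precipitation equals the simulated outflow plus the water left in storage, which is the mass-balance constraint that a calibrated PHM respects by construction.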

Similar to the PHMs, the DMLs have also been widely used to solve various classification and regression problems in hydrologic sciences. The DMLs can identify the statistical relationship between input and output data without requiring users to explicitly know the physical processes (Reichstein et al., 2019, Solomatine and Ostfeld, 2008). In some recent studies, researchers have used the DMLs to simulate complicated hydrologic processes, and some have achieved better performance than traditional PHMs (Chaney et al., 2018, Chaney et al., 2016, Ham et al., 2019, Kim et al., 2019, Shen, 2018, Zhao et al., 2019). In the present study, two popular machine learning algorithms, namely the Artificial Neural Network (ANN) and Long Short Term Memory (LSTM), are implemented to simulate the watershed discharge and compared with the PHMs (i.e., the SAC-SMA, XAJ, and CREST models). The ANN model has shown superior performance in hydrologic simulation under complex geophysical processes and is growing in popularity in the literature (Kim et al., 2020, Yang et al., 2017b). It can link the climate information without requiring explicit interpretation, which the PHMs could not achieve using the micro-scale mass-balance governing equations (Abbot and Marohasy, 2012, Aksoy and Dahamsheh, 2009, Azadi and Sepaskhah, 2012, Dahamsheh and Aksoy, 2009, French et al., 1992, Hung et al., 2009, Kim et al., 2019). Beyond the ANN model, the Recurrent Neural Network (RNN) is another class of ANN, in which connections between nodes are designed to exhibit temporal dynamic behavior by processing the inputs in their sequential order (Kratzert et al., 2018, Rumelhart et al., 1986, Rumelhart et al., 1994). Unlike the traditional ANN model, the RNN model has a memory that retains some information about a sequence, which can greatly improve the predictive performance if the inputs and outputs are correlated. 
Most recently, the LSTM model, a class of the RNN model, has gained considerable popularity in hydrological sciences and has been successfully applied to solve hydrological and environmental forecasting problems (Akbari Asanjan et al., 2018, Kumar et al., 2019, Srivastava and Lessmann, 2018).
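
The notion of memory in a recurrent network can be illustrated with a minimal sketch of a single scalar recurrent unit. This is a plain Elman-style cell, not an LSTM and not the network used in this study; the function name run_recurrent_cell and the weights w_x, w_h, and bias are hypothetical values picked for illustration.

```python
import math


def run_recurrent_cell(inputs, w_x=0.5, w_h=0.8, bias=0.0):
    """Apply h_t = tanh(w_x * x_t + w_h * h_{t-1} + bias) along a sequence.

    The weights are arbitrary illustrative numbers (assumptions),
    not trained values.
    """
    hidden = 0.0
    states = []
    for x in inputs:
        # The previous hidden state carries information from earlier inputs.
        hidden = math.tanh(w_x * x + w_h * hidden + bias)
        states.append(hidden)
    return states


# Same input on the final day, different input on the day before.
wet_then_dry = run_recurrent_cell([1.0, 0.0])
dry_then_dry = run_recurrent_cell([0.0, 0.0])
```

The final states of the two sequences differ even though the last input is identical, because the earlier input persists through the hidden state. Setting w_h to zero removes this dependence and reduces the unit to a memoryless mapping of the current input.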

The present study also builds on many existing studies related to streamflow simulation by comparing hydrologic model performance. Previous studies have preliminarily compared the model simulation performance between PHMs and DMLs (Daliakopoulos and Tsanis, 2016, Hsu et al., 1995, Ju et al., 2009, Rauf and Ghumman, 2018, Rezaeianzadeh et al., 2013, Roodsari et al., 2019, Srivastava et al., 2006, Tokar and Markus, 2000, Wang et al., 2017). One common agreement in these studies is that the performance of each model, whether a PHM or a DML, may differ depending on the hydrological and climate conditions (such as climate, weather, terrain, and soil), data availability, and simulation objectives in the study regions. Besides, there are many discussions among hydrologists in which concerns are raised about the lack of physical constraints and of a formulation of the water routing dynamics in all DMLs, though there are cases in which DMLs outperform the PHMs using a flexible set of information as inputs (Kratzert et al., 2019, Sellars, 2018). Following this discussion, the experiments designed in this study address (1) whether the DMLs and PHMs can supplement each other under a particular simulation condition, such as season or flow regime, in our study cases, and (2) how water managers could acquire a deeper understanding of the pros and cons of both PHMs and DMLs and develop more accurate streamflow simulations by taking advantage of both types of models. To better understand the performance of the PHMs relative to the DMLs in simulating streamflow, this study compares a few popular PHMs and DMLs over four watersheds across the Continental US (CONUS) that are associated with different climate and regional conditions. Based on the simulation results, we further examine how model performance varies across seasons and flow-regime magnitudes and explore whether the PHMs and DMLs could be interchangeable under certain conditions.

The rest of this paper is organized as follows. Section 2 presents the methodologies applied in this study, including the three PHMs and two DMLs. Section 3 provides detailed information on study basins and datasets. Section 4 summarizes daily streamflow simulation results at four watersheds in the CONUS. Section 5 provides the result analysis and discussion, and Section 6 summarizes our conclusions and recommendations for the development of PHMs and DMLs.

Section snippets

Methodology

Fig. 1 shows the conceptual comparison between PHMs (left side) and DMLs (right side). Both PHMs and DMLs follow a general hydrologic modeling framework, with some similarities and differences at each stage of the process. First, both PHMs and DMLs take the same forcing data as inputs and use historical observations to improve model performance. This process is called calibration in the PHMs and model training in the DMLs. The PHMs are based on mass balance equations

Target basin

In this study, we selected four signature watersheds with different hydroclimatic conditions to compare the employed PHMs and DMLs. Table 4 presents the basic information of the four selected watersheds, and Fig. 7 shows the locations of the four watersheds over the CONUS. More details are described as follows.

The Bushkill watershed is located in the eastern part of Pennsylvania. The water from this watershed generally flows southeast directly into the Delaware River, and the Blue Mountains

Input scenarios

In this study, we generated three input scenarios to examine the sensitivities of each model. First, mean precipitation and mean PET at each grid are used as default inputs to drive both the PHMs and DMLs (the first input scenario; S1). Further, the second input scenario (S2) and the third input scenario (S3) were generated by adding delayed precipitation/PET to drive the DMLs only. Adding delayed forcing data could guide the DMLs to capture how water is being delayed and routed through
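
One way to picture a delayed input scenario is as a feature table in which each day also carries the forcing of preceding days. The sketch below is a minimal illustration under stated assumptions: the excerpt does not give the lag lengths used for S2 and S3, so max_lag is a hypothetical parameter, and the function name add_lagged_inputs and the sample values are invented for this example.

```python
def add_lagged_inputs(precip, pet, max_lag):
    """Build one feature row per day: [P(t), PET(t), P(t-1), PET(t-1), ...].

    max_lag = 0 reproduces a no-delay baseline; larger values are
    illustrative assumptions, not the lags used in the study.
    """
    if len(precip) != len(pet):
        raise ValueError("precip and pet must have the same length")
    rows = []
    # The first max_lag days lack a full history, so they are dropped.
    for t in range(max_lag, len(precip)):
        row = []
        for lag in range(max_lag + 1):
            row.append(precip[t - lag])
            row.append(pet[t - lag])
        rows.append(row)
    return rows


# Hypothetical daily basin-mean forcing in mm/day.
precip = [0.0, 5.0, 12.0, 3.0, 0.0]
pet = [2.0, 2.1, 1.8, 2.2, 2.5]
baseline_rows = add_lagged_inputs(precip, pet, max_lag=0)
delayed_rows = add_lagged_inputs(precip, pet, max_lag=1)
```

Each added lag widens the feature row by two columns and shortens the usable record by one day, which is a trade-off to keep in mind when the training period is short.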

Results

Tables 6, 7, 8, and 9 show the performance of the five employed models (SAC-SMA, CREST, XAJ, ANN, and LSTM) under k-fold CV in the four selected watersheds. Each statistic is the average over three folds (k = 3) for the calibration and validation sets. There are nine cases in total, since each PHM (SAC-SMA, XAJ, and CREST) has one result from the first input scenario (S1), and each DML (ANN and LSTM) has three results from the three
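
The fold-averaging procedure can be sketched as follows. This is a minimal illustration, assuming contiguous folds and using the Nash-Sutcliffe efficiency (NSE) as the example statistic; the excerpt does not state how the folds were formed or list the four statistics, and the one-parameter runoff coefficient stands in for a real model purely for demonstration. All function names are hypothetical.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den


def contiguous_folds(n, k):
    """Split indices 0..n-1 into k contiguous blocks of near-equal size."""
    bounds = [round(i * n / k) for i in range(k + 1)]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(k)]


def cross_validate(precip, flow, k=3):
    """Average validation NSE over k folds for a toy runoff-coefficient model."""
    scores = []
    for val_idx in contiguous_folds(len(flow), k):
        held_out = set(val_idx)
        train_idx = [i for i in range(len(flow)) if i not in held_out]
        # Toy "calibration": one runoff coefficient fitted on the training folds.
        coeff = sum(flow[i] for i in train_idx) / sum(precip[i] for i in train_idx)
        sim = [coeff * precip[i] for i in val_idx]
        obs = [flow[i] for i in val_idx]
        scores.append(nse(obs, sim))
    return sum(scores) / len(scores)


# Hypothetical record in which flow is exactly half of precipitation.
precip = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
flow = [0.5 * p for p in precip]
mean_validation_nse = cross_validate(precip, flow, k=3)
```

Because every fold serves once as the validation set, the averaged score is less sensitive to the choice of a single calibration period than a one-off split would be.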

Discussion

This study employed three PHMs and two DMLs to cross-evaluate their capabilities in simulating daily streamflow for four signature watersheds with different climate and geomorphological conditions. We specifically investigated the general statistics of streamflow simulation in 3-fold CV, the model performance over different seasons, and the simulation accuracy over high-, medium- and low-flow regimes. For all employed models, mean precipitation and mean PET at each grid were used as default
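
A flow-regime comparison requires assigning each day to a high-, medium-, or low-flow class. The sketch below shows one simple way to do this from quantiles of the observed record. The thresholds low_q and high_q are hypothetical, since the excerpt does not state how the regimes were delineated, and the function name split_flow_regimes is invented for this example.

```python
def split_flow_regimes(observed, low_q=0.3, high_q=0.7):
    """Return day indices grouped into low, medium, and high flow.

    low_q and high_q are illustrative quantile thresholds (assumptions),
    not the thresholds used in the study.
    """
    ranked = sorted(observed)
    n = len(ranked)
    low_cut = ranked[int(low_q * (n - 1))]
    high_cut = ranked[int(high_q * (n - 1))]
    regimes = {"low": [], "medium": [], "high": []}
    for i, q in enumerate(observed):
        if q <= low_cut:
            regimes["low"].append(i)
        elif q > high_cut:
            regimes["high"].append(i)
        else:
            regimes["medium"].append(i)
    return regimes


# Hypothetical observed daily flows in m^3/s.
observed = [5.0, 1.0, 9.0, 3.0, 7.0, 2.0, 10.0, 4.0, 8.0, 6.0]
regimes = split_flow_regimes(observed)
```

Any error statistic can then be computed separately on the indices of each class, which is how a model that excels at peaks can be distinguished from one that tracks baseflow more closely.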

Conclusion

In this study, we evaluated the streamflow simulation capability of PHMs and DMLs over four selected watersheds in the CONUS. To do this, three PHMs (SAC-SMA, XAJ, and CREST) and two DMLs (ANN and LSTM) were tested under different input scenarios. We first assessed the overall simulated results against observed streamflow using four statistics, and then investigated whether PHMs and DMLs have varying performances over different seasons and flow regimes. Our findings provided an in-depth

Data availability

Daily PRISM precipitation, daily FEWS NET PET, and daily streamflow data used in this study are available from the PRISM Climate Group (http://prism.oregonstate.edu), the USGS FEWS NET Data Portal (https://earlywarning.usgs.gov/fews), and the USGS Surface-Water Data for USA (http://waterdata.usgs.gov/usa/nwis/sw), respectively.

The original SAC-SMA model code is written in Fortran and is publicly accessible. This study used the MATLAB version translated from the Fortran code with the same version

CRediT authorship contribution statement

Taereem Kim: Software, Formal analysis, Investigation, Writing - original draft, Methodology, Visualization, Supervision. Tiantian Yang: Conceptualization, Formal analysis, Investigation, Writing - review & editing, Project administration, Funding acquisition. Shang Gao: Software, Visualization, Methodology, Resources, Data curation. Lujun Zhang: Software, Methodology, Resources. Ziyu Ding: Software, Methodology, Resources. Xin Wen: Funding acquisition. Jonathan J. Gourley: Writing - review &

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work is partially supported by the U.S. Department of Energy (DOE Prime Award # DE-IA0000018). The material is based upon work supported by the National Science Foundation under Grant No. OIA-1946093 and its subaward No. EPSCoR-2020-3, and the National Science Foundation under Grant No. NSF1802872. The financial support is also made available by the National Key R&D Program of China under Grant No. 2018YFC0407902.

References (144)

  • A.T. Goh

    Back-propagation neural networks for modeling complex systems

    Artif. Intell. Eng.

    (1995)
  • H.V. Gupta et al.

    Decomposition of the mean squared error and NSE performance criteria: implications for improving hydrological modelling

    J. Hydrol.

    (2009)
  • T.S. Hogue et al.

    A ‘user-friendly’ approach to parameter estimation in hydrologic models

    J. Hydrol.

    (2006)
  • P. Huang et al.

    Event-based hydrological modeling for detecting dominant hydrological process and suitable model strategy for semi-arid catchments

    J. Hydrol.

    (2016)
  • Q. Ju et al.

    Division-based rainfall-runoff simulations with BP neural networks and Xinanjiang model

    Neurocomputing

    (2009)
  • B. Khakbaz et al.

    From lumped to distributed via semi-distributed: calibration strategies for semi-distributed hydrologic models

    J. Hydrol.

    (2012)
  • H. Kling et al.

    Runoff conditions in the upper Danube basin under an ensemble of climate change scenarios

    J. Hydrol.

    (2012)
  • V. Koren et al.

    Hydrology laboratory research modeling system (HL-RMS) of the US national weather service

    J. Hydrol.

    (2004)
  • J. Leonard et al.

    Improvement of the backpropagation algorithm for training neural networks

    Comput. Chem. Eng.

    (1990)
  • F. Moreda et al.

    Parameterization of distributed hydrological models: learning from the experiences of lumped modeling

    J. Hydrol.

    (2006)
  • J.E. Nash et al.

    River flow forecasting through conceptual models part I—A discussion of principles

    J. Hydrol.

    (1970)
  • K. Rasouli et al.

    Daily streamflow forecasting by machine learning methods with weather and climate inputs

    J. Hydrol.

    (2012)
  • Z. Ren-Jun

    The Xinanjiang model applied in China

    J. Hydrol.

    (1992)
  • J. Abbot et al.

    Application of artificial neural networks to rainfall forecasting in Queensland, Australia

    Adv. Atmos. Sci.

    (2012)
  • N.K. Ajami et al.

    Calibration of a semi-distributed hydrologic model for streamflow estimation along a river system

    J. Hydrol.

    (2004)
  • Akbari Asanjan, A., Yang, T., Hsu, K., Sorooshian, S., Lin, J. and Peng, Q. 2018. Short-Term Precipitation Forecast...
  • H. Aksoy et al.

    Artificial neural network models for forecasting monthly precipitation in Jordan

    Stoch. Env. Res. Risk Assess.

    (2009)
  • R.G. Allen et al.

    Crop evapotranspiration-Guidelines for computing crop water requirements-FAO Irrigation and drainage paper 56. Fao

    Rome

    (1998)
  • ASCE

    Artificial neural networks in hydrology. I: preliminary concepts

    J. Hydrol. Eng.

    (2000)
  • M. Ashfaq et al.

    High-resolution ensemble projections of near-term regional climate over the continental United States

    J. Geophys. Res. Atmos.

    (2016)
  • W.H. Asquith et al.

    Documented and potential extreme peak discharges and relation between potential extreme peak discharges and probable maximum flood peak discharges in Texas

    (1995)
  • A. Aytek et al.

    An application of artificial intelligence for rainfall-runoff modeling

    J. Earth Syst. Sci.

    (2008)
  • S. Azadi et al.

    Annual precipitation forecast for west, southwest, and south provinces of Iran using artificial neural networks

    Theor. Appl. Climatol.

    (2012)
  • D.P. Boyle et al.

    Toward improved streamflow forecasts: value of semidistributed modeling

    Water Resour. Res.

    (2001)
  • Brandes, D. (2001) Urban Drainage Modeling, pp....
  • Brazil, L. (1989) Multilevel calibration strategy for complex hydrologic simulation models, US Department of Commerce,...
  • J. Buchtele et al.

    Runoff components simulated by rainfall-runoff models

    Hydrol. Sci. J.

    (1996)
  • Burnash, R.J., Ferral, R.L. and McGuire, R.A. (1973) A generalized streamflow simulation system: Conceptual modeling...
  • Burnash, R. (1995) The NWS River Forecast System-Catchment Modeling. In: Singh, V., Ed., Computer Models of Watershed...
  • G.C. Cawley et al.

    On over-fitting in model selection and subsequent selection bias in performance evaluation

    J. Mach. Learn. Res.

    (2010)
  • N.W. Chaney et al.

    Harnessing big data to rethink land heterogeneity in Earth system models

    Hydrol. Earth Syst. Sci.

    (2018)
  • S. Chen et al.

    Neural networks for nonlinear dynamic system modelling and identification

    Int. J. Control

    (1992)
  • Chow, V.T. 1964. Handbook of applied...
  • W. Chu et al.

    Improving the shuffled complex evolution scheme for optimization of complex nonlinear hydrological systems: application to the calibration of the Sacramento soil-moisture accounting model

    Water Resour. Res.

    (2010)
  • W. Chu et al.

    A solution to the crucial problem of population degeneration in high-dimensional evolutionary optimization

    IEEE Syst. J.

    (2011)
  • W. Chu et al.

    Comment on “High-dimensional posterior exploration of hydrologic models using multiple-try DREAM (ZS) and high-performance computing” by Eric Laloy and Jasper A Vrugt

    Water Resour. Res.

    (2014)
  • M.P. Clark et al.

    A unified approach for process-based hydrologic modeling: 1. Modeling concept

    Water Resour. Res.

    (2015)
  • A. Dahamsheh et al.

    Artificial neural network models for forecasting intermittent monthly precipitation in arid regions

    Meteorol. Appl.

    (2009)
  • I.N. Daliakopoulos et al.

    Comparison of an artificial neural network and a conceptual rainfall–runoff model in the simulation of ephemeral streamflow

    Hydrol. Sci. J.

    (2016)
  • C. Daly et al.

    The PRISM Climate and Weather System—an Introduction

    (2013)