Sensitivity testing of a coupled Escherichia coli – Hydrologic catchment model

https://doi.org/10.1016/j.jhydrol.2007.02.037

Summary

A conceptual model of microbial behaviour in catchments, coupled with a standard hydrological model and known as the EG model, has previously been developed and tested for Escherichia coli. Because pathogen data were unavailable, E. coli was used as a pathogen indicator. However, the model uses a broad conceptual approach and should therefore be tested for other microbes in future. This paper presents work on the sensitivity of the EG model, as well as its further refinement. The sensitivity of the model results to all E. coli calibration parameters was analysed first. The EG model was then tested for its sensitivity to the number of events used for calibration. Data collected at three different Australian drinking water catchments were used. Of the four parameters in the E. coli component of the EG model, two proved to be insensitive while the other two proved to be important. The sensitive parameters were the coefficients associated with the ‘wash-off’ functions in the model, while the two insensitive coefficients were associated with the E. coli decay functions. This indicates that the hydrologic aspects of the E. coli transport processes dominate over the E. coli decay functions, although the model became more sensitive to the decay parameters in cleaner catchments.

Apart from one catchment (that was partly urbanised and much smaller than the other two), the model was successfully calibrated using a small number of monitored events. It was concluded that the EG model could be simplified further by not modelling the decay of the pathogen indicator, E. coli.

Introduction

Waterborne illness is a significant public health issue, killing approximately 3.3 million people per year (WHO, 2001). It remains a considerable disease burden even within developed countries. For example, the US EPA estimates that there are in the vicinity of 900 000 cases of waterborne illness per year (AWWA, 1999). The bulk of this disease burden is from microbiological pathogens washed from catchments, as opposed to chemical or radiological impacts. To evaluate the pathogen risks associated with various land-uses and hydrological characteristics of drinking water catchments, it is necessary to reliably model the concentrations of pathogens, or at least their indicators, washed from these catchments into nearby streams.

There is surprisingly little published on rainfall-runoff models that also incorporate modelling of pathogens. A review by Ferguson et al. (2003) of the major knowledge gaps in pathogen research in catchments cited seven published pathogen modelling studies. However, most of these models are existing rainfall-runoff or sediment transport catchment models that were simply adopted for pathogen evaluation. In other words, these models were not originally intended to incorporate pathogen movement; equations developed for other processes were reused for modelling of pathogens. Recently, the authors have developed a lumped conceptual pathogen model coupled to a hydrological model, known as the EG model (Haydon and Deletic, 2004, 2006). The model was successfully tested for the pathogen indicator Escherichia coli on three catchments in Australia (Haydon and Deletic, 2006). The model is still to be tested for pathogens such as Salmonella, Campylobacter, Cryptosporidium and Giardia, and for other indicators such as Enterococci.

To the authors’ knowledge, no major study has been published on uncertainty or sensitivity of pathogen or pathogen indicator models. However, this is crucial for proper development and consequent application of any model. For example, Rabitz (1989) cites Kolb who claims that modelling without sensitivity analysis is ‘intellectually dishonest’.

Butts et al. (2004) point out that the main sources of uncertainty or sensitivity in flow modelling are:

  1. model structure,

  2. uncertainties due to sub-optimal parameter values,

  3. uncertainty in model output data used in the process of calibration and verification,

  4. uncertainty in input data (e.g., uncertainties in measured data used to feed the models).

It could be argued that similar sources of uncertainty would occur in a coupled hydrologic-pathogen model.

Determining the sensitivity of the modelled results to the model parameters is usually the first step in assessing model uncertainties. The classical method used for these analyses is to plot the surface of the objective function as a function of the model parameters, hoping that it is well behaved and slopes smoothly to a global minimum that clearly points to the optimal parameter set (Beven, 2001). The analogy is that the objective function surface resembles a single depression on a flat plain. However, in many areas of hydrologic modelling this surface is highly irregular and more like a mountain range with a series of peaks and numerous valleys.

Kuczera (1997) argues that many models are poorly formulated or ‘ill-posed’, making it difficult to identify the global optimum. Beven (2001) also discussed the notion of an optimal parameter set and concluded that the classical techniques of statistical inference do not apply well to hydrologic modelling and that a more flexible approach is needed. He argues that ‘… the concept of the optimum parameter set may be ill-founded in hydrological modelling’. Therefore, throughout this paper the parameter set that gives the minimum of a chosen objective function is referred to as the ‘reference parameter set’ rather than the optimal parameter set.
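The classical surface scan described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `toy_model` is a hypothetical two-parameter function (not the EG equations), the objective is a plain sum of squared errors, and the "observations" are synthetic. The sketch evaluates the objective on a regular grid and reports the grid point with the lowest value as the reference parameter set.

```python
import math

def toy_model(p1, p2, x):
    """Hypothetical two-parameter model used only for illustration."""
    return p1 * x * math.exp(-p2 * x)

def sum_squared_error(observed, modelled):
    """Objective function: sum of squared differences."""
    return sum((o - m) ** 2 for o, m in zip(observed, modelled))

def scan_objective_surface(xs, observed, p1_values, p2_values):
    """Evaluate the objective on a parameter grid.

    Returns the surface as a dict keyed by (p1, p2) and the grid point
    with the smallest objective value (the 'reference parameter set').
    """
    surface = {}
    for p1 in p1_values:
        for p2 in p2_values:
            modelled = [toy_model(p1, p2, x) for x in xs]
            surface[(p1, p2)] = sum_squared_error(observed, modelled)
    reference_set = min(surface, key=surface.get)
    return surface, reference_set

# Synthetic observations generated with p1 = 2.0 and p2 = 0.5 (assumed values).
xs = [0.1 * i for i in range(51)]
observed = [toy_model(2.0, 0.5, x) for x in xs]

p1_values = [0.1 * i for i in range(41)]     # 0.0 to 4.0
p2_values = [0.025 * i for i in range(41)]   # 0.0 to 1.0
surface, reference_set = scan_objective_surface(xs, observed, p1_values, p2_values)
print("reference parameter set:", reference_set)
```

With noise-free synthetic data the surface has a single clear minimum. With real catchment data the same scan may reveal several comparable minima, which is the situation the authors cited above warn about.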

The importance of the calibration data set has been shown in the literature on pollution catchment modelling. Mourad et al. (2004) showed that urban stormwater model validation performance improved as the number of events used for calibration increased, but only up to a certain level. At some point extra events do not improve a model’s predictive performance. Assessing the optimal number of events needed for calibration in the case of pathogens or their indicators will have major implications, since pathogen data collection is highly expensive.

Within the literature there is no clear definition of what constitutes sensitivity analysis and what constitutes uncertainty analysis, and the terms are often used interchangeably. For example, in Saltelli et al. (2004) the entire research field is regarded as sensitivity analysis, while Amirghassemi and Modarres (1990) regard it all as uncertainty analysis. Cacuci (2003) separates the two, using sensitivity analysis to examine the effects of parameter variations on model output and uncertainty analysis to examine the effect of parameter uncertainties on model output uncertainty. Amirghassemi and Modarres (1990) made the point that uncertainty occurs both in the process description (the model) and in the state of the process (the behaviour of the model), implying that the model’s structure is also a source of uncertainty, an area of research for others (Butts et al., 2004).

To avoid further confusion, we differentiate sensitivity analysis from uncertainty analysis on the following basis (in some way similar to Cacuci, 2003):

  • Sensitivity analysis is the term used for the study of the influence that non-measurable factors have on the modelled results. The impact of the model calibration parameters will be the first factor examined. The importance of the different parts (equations) of the model, known as the sensitivity of the model structure, also belongs to this category. The determination of a model’s parameter set is inherently linked to the calibration process and the amount of data used for calibration. Therefore the number of events or length of data used for calibration is also a sensitivity issue.

  • Uncertainty analysis is the term used to describe the uncertainty in the model results due to uncertainty in measurable inputs that are fed into the model. For example input data such as rainfall data, evapotranspiration, or catchment area are measured with some uncertainty and that will affect the reliability of the model results (e.g., Molini et al., 2001). In a similar way, uncertainty in the measurable data used for model calibration may impact the model results.

In other words, when you measure something you can evaluate its uncertainty, and then propagate it through a model. But when you vary non-measurable factors, such as a model calibration parameter or use a different length of input data record, there is no uncertainty about that, even though the output of the model may be largely affected by the given changes. However, it should be noted that some researchers and modellers would use different terms for the above analyses.

This paper presents results from the sensitivity analysis of the EG E. coli model. The sensitivity of the model results to the calibration parameters was studied and conclusions were drawn on the important components of the model’s structure. In the second part, the sensitivity of the model results to the amount of data used for calibration was assessed. This is regarded as important since collection of pathogen data is very expensive. The model structure was refined, and some interesting findings were drawn on the appropriate application domain of the EG model. However, it should be noted that some readers may regard parts of this work as dealing with model structure uncertainty.

The EG model can predict concentrations and loads of the pathogen indicator E. coli in discharges from large catchments. The model is fully explained in another paper (Haydon and Deletic, 2006) and therefore only a brief description is presented here. As shown in Fig. 1, EG is a simple conceptual E. coli model coupled with the hydrologic model SimHyd (Chiew et al., 2002) that is able to predict concentrations of E. coli during both dry and wet weather on a continuous basis. There are two model components, namely the surface and sub-surface components, with respective stores in each. Loss and discharge of E. coli are modelled for each store. The governing equations are detailed in Table 1 (flow rates and volumes are modelled using SimHyd). As the model is a mass-balance model, exports and losses are constrained to ensure that they do not exceed the total available amount of E. coli. A store can therefore run empty, or a concentration can fall to zero, but neither is allowed to go into deficit. The EG model does not explicitly incorporate resuspension of E. coli in streams or on river banks, as this is a hydraulic process rather than a hydrologic one. However, the two stores will provide an equivalent function, although to a lesser extent. Later research is planned to incorporate this process in a suitably modified hydraulic model linked to the EG model. Even more importantly, the EG model has a broad conceptual basis that could be adopted for modelling of other microbes, and it should also be tested for a number of pathogens.

The EG model coefficients a1, a2, b1, and b2 (Table 1 and Fig. 1), which must be positive or zero, should be calibrated for each catchment. They are briefly described below.

  • Parameter a1 is attached to the surface loss function, and is essentially a surface E. coli decay parameter that sets the rate of loss from the surface store of E. coli.

  • Parameter a2 is attached to the infiltration export function and effectively determines how much of the E. coli in the surface store can be transported to the sub-surface store.

  • Parameter b1 is attached to both the interflow and baseflow export functions and determines the concentration of the E. coli in the baseflow and the interflow. It essentially sets the total amount exported from the sub-surface store.

  • Parameter b2 is attached to the sub-surface loss equation and is similar to a1 in that it determines the loss or decay rate of the sub-surface store.
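The mass-balance constraint on each store can be sketched as a single timestep update. This is only an illustration under stated assumptions: the governing equations of Table 1 are not reproduced here, so the loss and export demands are passed in as already-computed numbers, the `inflow` term and the function name `update_store` are hypothetical, and the proportional scaling used when demand exceeds supply is one possible way of applying the constraint rather than the authors’ documented method.

```python
def update_store(store, inflow, loss_demand, export_demand):
    """Advance one E. coli store by one timestep without allowing a deficit.

    store, inflow, loss_demand and export_demand are amounts of E. coli
    (any consistent unit). The demands would come from the model's loss
    and export functions; here they are simply given.
    """
    available = store + inflow
    requested = loss_demand + export_demand

    if requested > available and requested > 0.0:
        # Assumption: scale both fluxes by the same factor so that together
        # they remove exactly what is available.
        scale = available / requested
        loss = loss_demand * scale
        export = export_demand * scale
    else:
        loss, export = loss_demand, export_demand

    # Guard against tiny negative values caused by floating-point rounding.
    new_store = max(available - loss - export, 0.0)
    return new_store, loss, export


# Unconstrained step: demands are smaller than the available amount.
print(update_store(100.0, 10.0, 20.0, 30.0))

# Constrained step: a hypothetical first-order loss a1 * store plus a large
# export demand together exceed the store, so both are scaled back.
a1 = 0.3  # hypothetical value, not a calibrated coefficient
print(update_store(100.0, 0.0, a1 * 100.0, 90.0))
```

In the constrained case the store is run empty but never becomes negative, which matches the behaviour described for the EG stores.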

It should be noted that the SimHyd hydrologic model (to which the EG model is coupled) produces a notable proportion of the flows as interflow or excess from its Soil Moisture store, rather than as overland flow; SimHyd is effectively modelling discharge by putting the water through its Soil Moisture store. In a similar way, the EG model restricts the supply of E. coli to the pathogen sub-store as a function of the inflow into the Soil Moisture store. However, surface flow behaviour is catered for by the Soil Moisture store’s excess and interflow processes. In conclusion, these are all artefacts of the conceptualisation of the split of the discharge into overland flow and interflow.

Sensitivity analysis

As stated in the Introduction, the main tasks were to determine how sensitive the EG model is to its parameters (and eventually learn more about the model structure) and the amount of data used for calibration. The model was run on an hourly timestep (in the previous work it was found that this is the minimum required for the catchments studied, Haydon and Deletic, 2006), and the following was carried out for one catchment:

  1. manual calibration of the model,

  2. automatic calibration of the four EG

Model calibration

The calibration of the SimHyd models was quite straightforward due to the large volume of available rainfall and flow data. The agreement between measured and modelled flow rates was assessed. The correlation coefficient between measured and modelled flows ranged from r² = 0.55 to 0.74, and the Nash–Sutcliffe Coefficient of Model Efficiency ranged from E = 0.5 to 0.74.
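For readers unfamiliar with the two goodness-of-fit measures, the following sketch shows how they are commonly computed. It assumes the standard definitions: r² as the squared Pearson correlation between measured and modelled series, and the Nash–Sutcliffe efficiency as E = 1 − Σ(obs − mod)² / Σ(obs − mean(obs))². The function names and the sample numbers are illustrative and are not taken from the study.

```python
def nash_sutcliffe(observed, modelled):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than
    predicting the mean of the observations, negative is worse than that."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - m) ** 2 for o, m in zip(observed, modelled))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance

def r_squared(observed, modelled):
    """Squared Pearson correlation between observed and modelled values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_mod = sum(modelled) / n
    cov = sum((o - mean_obs) * (m - mean_mod) for o, m in zip(observed, modelled))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_mod = sum((m - mean_mod) ** 2 for m in modelled)
    return cov ** 2 / (var_obs * var_mod)

# Illustrative series: the modelled values are exactly twice the observed ones.
observed = [1.0, 2.0, 3.0, 4.0]
modelled = [2.0, 4.0, 6.0, 8.0]
print("r2 =", r_squared(observed, modelled))
print("E  =", nash_sutcliffe(observed, modelled))
```

The example is chosen to show why both measures are worth reporting: a model that is perfectly correlated with the observations but biased by a constant factor still scores r² = 1, while E penalises the bias.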

All the sites had fewer than one hundred E. coli data points, grouped into baseflow and 3–7 events (depending on the catchment).

Conclusion

The performance of the E. coli model is more sensitive to two of the model’s four parameters. These parameters are associated with the transport (i.e., hydrologic) aspects of the model rather than with its E. coli decay aspects, indicating that it is important to focus on good calibration of the attached hydrologic model.

Given the paucity of E. coli data, calibration with a low number of events has been attempted. For large rural catchments few events are required for representative

Acknowledgements

The authors wish to acknowledge the support of both Melbourne Water and the Co-operative Research Centre for Water Quality & Treatment who are joint sponsors of this research.
