Model uncertainty analysis by variance decomposition

https://doi.org/10.1016/j.pce.2011.07.003

Abstract

Errors and uncertainties in hydrological, hydraulic and environmental models are often substantial. In good modelling practice, they are quantified in order to supply decision-makers with important additional information on model limitations and sources of uncertainty. Several uncertainty analysis methods exist, often with various underlying assumptions. One of these methods is based on variance decomposition. The method allows splitting the variance of the total error in the model results (as estimated after comparing model results with observations) into its major contributing uncertainty sources. This paper discusses an advanced version of that method, in which error distributions for rainfall, other inputs and parameters are propagated through the model and the “rest” uncertainties are considered as model structural errors for different parts of the model. Expert knowledge is used to address upfront the iid assumption that is often made in model error analysis. The method also addresses the problems of heteroscedasticity and serial dependence of the errors involved. The method has been applied by the author to modelling applications of sewer water quantity and quality, river water quality and river flooding.

Highlights

  • Model uncertainty sources are compared by splitting the variance of the total error in the model results into its major contributing uncertainty sources.

  • Model structural errors are estimated as “rest uncertainties” after subtracting contributions of input and parameter uncertainties from total model errors.

  • Correlations and non-linear interactions between different model inputs and parameters are addressed, as well as heteroscedasticity and serial dependence of the errors involved.

  • Results of three applications are shown: sewer water quantity and quality, river water quality and river flood modelling.

Introduction

Errors and uncertainties in the results of hydrological, hydraulic and environmental models are often substantial. Consideration of these errors is therefore crucially important. When water decision making is based on model results, which is becoming increasingly common practice in modern water management and engineering, uncertainties in model outcomes affect the decisions. Hence, good modelling practice provides not only the model results but also the uncertainties in these results. It supplies decision-makers with important additional information in support of their decision making process. After quantification of the uncertainties in the model results, water policies can be set up for which the efficiency can be guaranteed up to specified acceptable probability or risk levels. Model sensitivity analysis combined with uncertainty analysis provides the modeller and decision maker with information about the importance of various types of model limitations and sources of uncertainty, and hence on how to reduce uncertainties in cost-efficient ways.

Total uncertainty in model results is typically assessed after comparing model outputs with observations, thus from goodness-of-fit statistics, such as the mean error, (root) mean square error, standard deviation or variance of model residuals. It is worth mentioning that such methods only consider “quantifiable uncertainty”. The real uncertainty may, however, be larger than this quantifiable uncertainty. It indeed might occur that the water system has specific influences, which were not observed in the past, but which may occur in an unpredictable manner in the future. This is a specific and special type of ‘lack of knowledge’, which we cannot include in our uncertainty predictions as it is not quantifiable. Such additional non-quantifiable uncertainty is usually referred to as ‘ignorance’. Ignorance occurs when we are missing relevant knowledge. Two types of ignorance can be considered (see also Harremoës, 2003):

  • Recognized ignorance or accepted ignorance: we realize and accept that we are ignorant, and communicate about this.

  • Total ignorance: we do not realize this lack of knowledge and are in complete ignorance (or: “ignorance about the fact that we are ignorant”).

Our confidence in a model may range from being certain (called ‘determinism’ by Harremoës, see Fig. 1) to accepting that we are ignorant (zero confidence, ‘indeterminacy’ in Fig. 1). Regardless of our state of confidence, we may be correct or wrong as a result of ignorance. Non-quantifiable uncertainties thus may also contribute to the model prediction uncertainty, and may in some cases be more serious than the quantifiable uncertainties. Therefore, there is a need to communicate about these uncertainties as well in the decision support (van der Sluijs et al., 2003). When essential processes are not incorporated in the structure of the model due to insufficient knowledge about the processes, it would be wrong to forget about this recognisable ignorance and not to communicate this potential source of uncertainty. The communication can still be supported by quantitative information, by running some scenarios on the unknown processes and reporting on the potential differences in model results (e.g. epistemic uncertainty, comparing a set of competing models based on different assumptions for the unknown processes, as is done in ensemble modelling approaches).

This paper focuses on the estimation of the quantifiable part of the uncertainty in the outputs of hydrological, hydraulic and water quality models. Many uncertainty analysis approaches take the variance of model residuals as basis for the uncertainty analysis. Model residuals are defined as E_{Y-Yo}(i) = Y(i) - Yo(i), where Y(i) is the model output and Yo(i) the observation at time moments i (i = 1, ..., n). Frequently applied methods for uncertainty estimation are classical Bayesian approaches (e.g. Kuczera et al., 2006, Thorndahl et al., 2008, Haydon and Deletic, 2009), pseudo Bayesian methods such as GLUE (e.g. Beven and Binley, 1992, Freni et al., 2009), recursive model and parameter identification techniques (e.g. Thiemann et al., 2001, Wagener et al., 2003, Jonsdottira et al., 2006), and methods based on frequentist statistical inference (e.g. Montanari, 2007). See also Montanari (2007) for an overview.

Under the frequentist statistical inference paradigm, the following concept exists: Yr = Y(X, P) + EY, where Yr are the “real” values of the model output variable(s), Y is the model output depending on model inputs X and parameters P, and EY the model output error. The model output error is the result of several sources and types of errors, including the errors on X, on P, and on the model structure. The model error EY is assumed to follow an iid error process (assumed to be normal with mean 0 and variance σ², e.g. after transformation). This paper deals with an approach to decompose the total model output variance σ² into its major contributing uncertainty sources, classified in model input, model parameter and model structure errors. The approach accounts for the fact that the iid assumption for EY is often not valid (Vrugt et al., 2005, Neumann and Gujer, 2008). Model residuals indeed often have a temporal correlation structure and are often non-stationary (Mantovan and Todini, 2006). This is due to model-structure simplifications and the influence that these simplifications, together with the input and parameter uncertainties, have on the model outputs.
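As a minimal illustration of the residual definition and the goodness-of-fit statistics mentioned above, the following sketch computes the mean error, root mean square error and residual variance from paired model outputs and observations. The function name `residual_statistics` and the numerical values are hypothetical and serve only as an example; they are not taken from the paper.

```python
import numpy as np


def residual_statistics(y_model, y_obs):
    """Goodness-of-fit statistics of model residuals E(i) = Y(i) - Yo(i)."""
    y_model = np.asarray(y_model, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    if y_model.shape != y_obs.shape:
        raise ValueError("model outputs and observations must have the same shape")

    residuals = y_model - y_obs                 # model output minus observation
    return {
        "mean_error": residuals.mean(),         # indicates bias
        "rmse": np.sqrt(np.mean(residuals ** 2)),
        "variance": residuals.var(ddof=1),      # sample variance of the residuals
        "std": residuals.std(ddof=1),
    }


# Hypothetical model outputs and observations (illustrative values only).
y_model = [2.0, 3.5, 5.0, 4.0]
y_obs = [1.5, 3.0, 5.5, 4.5]
stats = residual_statistics(y_model, y_obs)
print(stats)
```

In this sketch the residual variance plays the role of the total model output variance σ², which the approach described in the paper then attributes to input, parameter and model structure errors.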

When statistical properties such as the temporal correlation structure and non-stationarities are not properly accounted for, biased parameter and uncertainty estimates are obtained (Vrugt et al., 2005, Mantovan and Todini, 2006, Neumann and Gujer, 2008). Moreover, model input, model parameter and model structure errors each have different stochastic error properties. Model parameters are, per the definition used in this paper, constant in time, while model inputs (e.g. rainfall) are strongly time variable. Due to these strong differences in autocorrelation structure, model inputs, parameters and structure affect model output errors and their stochastic properties in (often highly) different ways. When model output errors are assumed to be mainly produced by model parameter uncertainties (as is assumed by several authors), biased parameter uncertainty estimates are obtained because the model input and model structure uncertainties are neglected, and biased uncertainty estimates are also obtained by wrongly making the iid assumption for the model errors (see e.g. Vrugt et al., 2005, Mantovan and Todini, 2006, Neumann and Gujer, 2008). This comment holds not only for the methods based on frequentist statistical inference, but also for other approaches.
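The effect of wrongly assuming independent errors can be illustrated with a small calculation. The sketch below assumes, purely for illustration, stationary errors with autocorrelation ρ^k at lag k (a first-order autoregressive structure, which is an assumption made here and not a model taken from the paper) and compares the variance of the mean error with the value obtained under the iid assumption. The function name `variance_of_mean` is hypothetical.

```python
import numpy as np


def variance_of_mean(sigma2, n, rho):
    """Variance of the mean of n stationary errors with variance sigma2
    and autocorrelation rho**k at lag k (assumed AR(1)-type structure)."""
    lags = np.arange(1, n)
    # Standard result: Var(mean) = sigma2 / n * (1 + 2 * sum((1 - k/n) * rho**k))
    inflation = 1.0 + 2.0 * np.sum((1.0 - lags / n) * rho ** lags)
    return sigma2 / n * inflation


var_iid = variance_of_mean(sigma2=1.0, n=100, rho=0.0)         # independent errors
var_correlated = variance_of_mean(sigma2=1.0, n=100, rho=0.8)  # serially dependent errors
print(var_iid, var_correlated)
```

For positively correlated errors the second value is larger than the first, so an uncertainty estimate derived under the iid assumption would be too narrow in this simplified setting.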

Recent developments in the uncertainty analysis of hydrological and other environmental models have addressed the difficult task of quantifying the uncertainty in the model outputs that arises from several error sources. Among these recent developments is the sequential data assimilation method proposed by Salamon and Feyen (2010). The method makes use of a multiplicative error model. These authors also warned against making overly simplified assumptions on the error structures. Reichert and Mieleitner (2009) explained how insights into the model error structure and its underlying error sources can help to obtain more realistic uncertainty bounds, as well as to learn about the causes of model output bias and to search for corrections to model input biases or deficits in the model structure.

This paper contributes to obtaining realistic uncertainty bounds and to splitting the total model output uncertainty into its major uncertainty sources based on a variance decomposition approach, while accounting for the model error structure. Error distributions from model inputs and more upstream submodels are estimated (together with their autocorrelation structure) and propagated to outputs and more downstream submodels. If this is done carefully and uncertainty sources are lumped, the approach may overcome the classical problems of dependence and non-linear interactions between different inputs and parameters. Problems related to heteroscedasticity and serial dependence of the errors involved are also addressed. The latter two aspects are discussed first in Sections 2 (Heteroscedasticity) and 3 (Serial dependence). Section 4 thereafter explains the variance decomposition method, followed by the main results after application of the method to three case study models (sewer water quality model, river water quality model and river flood model) in Section 5, and conclusions and discussion in Section 6.
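The basic idea of estimating the model structure contribution as a “rest” term can be sketched as follows. The sketch assumes that the input, parameter and structure contributions are statistically independent, so that their variances add up to the total residual variance; this is a simplification made here for illustration, and the function name `decompose_variance` and all numbers are hypothetical rather than results from the paper.

```python
def decompose_variance(total_var, input_var, parameter_var):
    """Split a total residual variance into input, parameter and
    structure ("rest") contributions, assuming independent sources."""
    if total_var <= 0:
        raise ValueError("total variance must be positive")

    structure_var = total_var - input_var - parameter_var  # "rest" uncertainty
    if structure_var < 0:
        raise ValueError(
            "input and parameter variances exceed the total variance; "
            "the independence assumption or the estimates need revisiting"
        )

    contributions = {
        "input": input_var,
        "parameter": parameter_var,
        "structure": structure_var,
    }
    # Express each contribution as a fraction of the total variance.
    shares = {name: value / total_var for name, value in contributions.items()}
    return contributions, shares


# Hypothetical variances (illustrative values only).
contributions, shares = decompose_variance(total_var=4.0, input_var=1.5, parameter_var=0.5)
print(contributions)
print(shares)
```

The explicit check on a negative rest term reflects that such an outcome would signal a problem with the propagated input and parameter variances or with the additivity assumption, rather than a meaningful structure variance.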

Section snippets

Heteroscedasticity

In water system models (i.e. hydrological, hydrodynamic, physico-chemical water quality models), model residual variances are typically heteroscedastic; they increase with increasing model output value (discharge, water level, concentration, etc.). They can often be transformed into variances that become approximately constant or independent of the model output value (homoscedastic residuals). Transformation methods found in the literature are based on logarithmic or other

Serial dependence

When time series of model results and corresponding observations with a small time step are considered (smaller than the concentration time of the system, or the characteristic time scale of model responses; e.g. hourly for a river catchment rainfall-runoff model, 5-min for a sewer model), model residuals have a serial dependence. This has to be accounted for in the uncertainty analysis. For runoff discharges, the serial dependence is often higher for dry periods than for wet periods.
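One common way to check a residual series for serial dependence is to compute its sample autocorrelation at increasing lags. The sketch below is a generic illustration of that check, not the procedure used in the paper; the function name `autocorrelation` and the residual values are hypothetical.

```python
import numpy as np


def autocorrelation(residuals, max_lag):
    """Sample autocorrelation of a residual series for lags 0..max_lag."""
    r = np.asarray(residuals, dtype=float)
    if max_lag >= len(r):
        raise ValueError("max_lag must be smaller than the series length")

    r = r - r.mean()                 # remove the mean (bias) first
    denominator = np.sum(r ** 2)
    if denominator == 0:
        raise ValueError("residual series has zero variance")

    n = len(r)
    return np.array(
        [np.sum(r[: n - lag] * r[lag:]) / denominator for lag in range(max_lag + 1)]
    )


# Hypothetical residual series with persistence (illustrative values only).
residuals = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
acf = autocorrelation(residuals, max_lag=2)
print(acf)
```

Values well above zero at small lags indicate that consecutive residuals are not independent, so the effective amount of information in the series is smaller than the number of time steps suggests.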

Variance decomposition

In addition to knowledge of the model bias and uncertainty, represented by e.g. the model residual variance, it would be useful to decompose this variance into its contributing uncertainty sources. The decomposition could be done for different components (submodels) of the system under study, or per model in order to separate uncertainties related to the model inputs, the model parameters and the model structure. In case model residual errors are calculated based on (direct or indirect) observations,

Case study results

The variance decomposition method outlined above requires the following sequential steps to be followed:

  1. Split of the water system under study in submodels (based on the available measurements and expert knowledge, to select submodel output variables for which the uncertainty sources are statistically independent).

  2. For each (sub)model input variable: Assessment of model input uncertainties through a separate investigation (see Section 4.2.1; stochastic terms EX in Fig. 4 will be obtained).

  3. For

Conclusions and discussion

This paper discussed a method for uncertainty analysis on river and sewer models based on variance decomposition, referring to the results for three different applications of sewer and river water quality and flood modelling. The method addresses the problems of heteroscedasticity and serial dependence of the model errors. It allows the total model output uncertainty to be split into its major contributing uncertainty sources, classified in model inputs, parameters and submodel-structures. The

References (42)

  • A. Saltelli et al. How to avoid a perfunctory sensitivity analysis. Environ. Modell. Softw. (2010)

  • B. Sevruk. Adjustment of tipping-bucket precipitation gauge measurements. Atmos. Res. (1996)

  • S. Thorndahl et al. Probabilistic modeling of overflow, surcharge, and flooding in urban drainage using the First Order Reliability Method and parameterization of local rain series. Water Res. (2008)

  • S. Thorndahl et al. Event based uncertainty assessment in urban drainage modelling, applying the GLUE methodology. J. Hydrol. (2008)

  • J.J. Warmink et al. Identification and classification of uncertainties in the application of environmental models. Environ. Modell. Softw. (2010)

  • P. Willems. Quantification and relative comparison of different types of uncertainties in sewer water quality modelling. Water Res. (2008)

  • P. Willems. A time series tool to support the multi-criteria performance evaluation of rainfall-runoff models. Environ. Modell. Softw. (2009)

  • K.J. Beven. Uniqueness of place and process representations in hydrological modelling. Hydrol. Earth Syst. Sci. (2000)

  • K.J. Beven et al. The future of distributed models–model calibration and uncertainty prediction. Hydrol. Process. (1992)

  • G.E.P. Box et al. An analysis of transformations. J. Royal Stat. Soc. Series B (1964)

  • G. Coccia et al. Recent developments in predictive uncertainty assessment based on the model conditional processor approach. Hydrol. Earth Syst. Sci. Discuss. (2010)