A Bayesian state space modelling approach to probabilistic quantitative precipitation forecasting
Introduction
The provision of reliable forecasts of precipitation over the 0–6 h period at high spatial resolution, often described as nowcasting, is regarded as essential for flood forecasting by the Environment Agency (Golding, 2000). To adequately describe the spatial distribution and timing of precipitation over catchments with areas of the order of 10 km2, prediction over a spatial scale of the order of 1 km is required, with a temporal scale of the order of 10 min. In the foreseeable future forecasts of precipitation at such high resolutions will require the application of probabilistic methods since precipitation generating processes are very sensitive to the correct specification of initial and boundary conditions, which is unlikely to be resolved in operational meso-scale models for some time, largely due to problems of data assimilation and parameterisations.
Operationally, precipitation forecasts are produced over a variety of time and space scales, using a variety of methods We shall assume that nowcasting refers to the production of forecasts with lead times of 0–6 h (Golding, 2000), while short-term forecasting refers to lead times of 6–24 h (Collier, 2000). Beyond 24 h lead times it is widely accepted that NWP based approaches are optimal for providing precipitation forecasts (Golding, 2000). In the range of 0–24 h a variety of methods are used to provide quantitative precipitation forecasts including:
- •
NWP based approaches (Kuligowski and Barros, 1999);
- •
model output statistics techniques (Antolik, 2000, Fox and Collier, 2000);
- •
purely statistical extrapolation methods, generally using radar data as a first guess (Grecu and Krajewski, 2000, Toth et al., 2000, Mellor et al., 2000);
- •
expert system based approaches which model precipitation cell evolution (Pierce et al., 2000).
These different approaches have strengths and weaknesses for a range of forecast lead times, time scales and space scales. In the nowcasting range of 0–6 h NWP based forecasts are less often used because the assimilation and initialisation cycle of these models is of the order of 3 h (Golding, 2000) and there are many unresolved issues surrounding the assimilation of precipitation data into these models (Zou and Kuo, 1996). Thus, to produce nowcasts of precipitation, various statistical methods have been developed, which take advantage of the availability of high resolution radar. In the UK the radar system is only capable of providing information of instantaneous rainfall rates, so this is what is assumed to be available.
Wheater et al. (2000) review a series of elegant statistical spatio-temporal models for describing precipitation fields, based on a hierarchical decomposition of the precipitation into precipitation fields, bands and cells, each of which are modelled as point processes in space (precipitation fields), time (precipitation bands) or space and time (cells). Thus the models are defined as being continuous in space and time, and the actual precipitation field is described by a series of precipitation cells. The models can be fit to radar data, but this fitting is designed to estimate the hyper-parameters of the models, such as the rates of the corresponding Poisson processes. Thus the models provide a statistical characterisation of the precipitation field but are not directly suitable for use in operational forecasting.
In a similar vein Mellor et al. (2000) describe a statistical model for the production of a space-time precipitation field, based on their so called Modified Turning Bands method. This uses triangular prisms to define the location of the regions with potential to generate precipitation, and these are modulated by a paired sinusoidal function to produce temporal pulsing in the resulting precipitation cell generation potential field. Cells are generated from this potential as a Poisson process (the potential defines the rate of the process). Each precipitation cell is described by an inverted parabola, and has its own characteristic lifetime, width and maximal intensity. The model requires manual intervention to be fit to data and is tuned to frontal systems with a linear structure. By generating realisations of the Poisson processes ensembles can be generated, but the rather ad-hoc nature of the fitting of the model to data means that the operational use of this system is currently limited.
In the GANDOLF system, used operationally in the Met Office (Pierce et al., 2000), convection is represented using an object-oriented approach whereby each convective cell is identified and evolved through a life-cycle model, which includes initiation of daughter cells. The system uses data from a variety of sources including the Met Office mesoscale model, METEOSAT images and the radar network. While the GANDOLF system has been shown to produce reasonable convective precipitation forecasts (Pierce et al., 2000), there are several areas which could be improved. One of the most significant problems is that one single deterministic forecast is produced which, for such a non-linear system, is unlikely to characterise the true distribution of precipitation at a forecast lead time of say 3 h.
Georgakakos (2000) developed a hybrid model, which has aspects of the model output statistics approach, in that NWP derived fields are used extensively, for example to predict advection vectors, but also directly uses radar data (both near ground precipitation rates and vertically integrated water content) to produce a nowcast. The paper is novel in that a method is developed which also incorporates model uncertainty, through the specification of both observation and model errors. In that respect is it similar to the work described in this paper, however Georgakakos (2000) assumes a discrete state space (that is a grid based model, with numerical integration in the standard way) and thus can only represent the joint covariance matrix of the precipitation field in neighbouring cells, which is rather restrictive.
While there have been some interesting research developments in statistical modelling for precipitation forecasting, there remain many unanswered questions, in terms of what framework is most appropriate and how model uncertainty and observation uncertainty can be combined to capture the full probability distribution function of precipitation through space and time. In Section 1.2, one possible approach is reviewed.
To progress to the development of probabilistic models for precipitation, which can then be fed into a Bayesian hydrologic forecasting system such as presented in Krzysztofowicz (1999), we essentially have three options:
- 1.
use model output statistics techniques to post-process deterministic NWP (Antolik, 2000);
- 2.
use Monte Carlo (ensemble) methods with statistical or NWP models (Mellor et al., 2000);
- 3.
directly represent and evolve the probability density function of the precipitation process (this paper, and to some extent (Georgakakos, 2000)).
The first method, using model output statistics has a disadvantage, that the probability density function (pdf) of the precipitation process will be complex and state dependent, which may make parameterisation very difficult. In addition, many users (for instance those running distributed hydrological models) will require the joint pdf for precipitation over some spatial domain, so that the spatial structure of the precipitation is preserved, which will be very difficult to achieve.
The second option, based on sampling methods uses a large number of samples to characterise the pdf. The problem with these methods is that to correctly characterise the joint pdf of precipitation over a spatial domain may require many thousands of samples. Current operational ensemble prediction systems use around 100 samples, and are computationally very expensive, since each member must be integrated separately (Molteni et al., 1996). However ensembles maintain an advantage in that any model can readily be run as an ensemble, if the computational expense can be coped with. In addition it may be possible to post-process output to produce better approximations to the true forecast pdf (although this will again be very difficult for joint pdfs) (Mylne et al., 2002).
The third option is pursued in this paper, that is a model will be constructed which retains a probabilistic representation of the evolution of the precipitation field. Thus the model must propagate the full joint pdf of the precipitation field, and update this as observations become available. State space models provide a suitable framework, these being most readily presented in their natural Bayesian context. Any dynamical system can be represented in the state space modelling framework, which provides formalism for the updating and propagation of the pdf of the state of a system. In the linear, Gaussian case this is generally referred to as the Kalman filter. For non-linear and non-Gaussian systems there are a large variety of extensions to the Kalman filter (Chatfield, 1996), such as the extended Kalman filter, the ensemble Kalman filter (Evensen, 2001) and particle filter (Doucet et al., 2001) based methods.
State space models assume that the state of the system at time t, denoted is not directly observed (it is a latent variable). The state is taken to evolve in time according to the (Markovian) state evolution equation:where is the state evolution (or system) model which maps the state at time t to the new value at time t+1 and is the system noise. This system noise represents model error, and can often be difficult to determine. The observations at time t, are related to the state by the observation equation:where is the observation (or forward/sensor) model which maps the state variable to the observables. is the observation error and reflects the uncertainty in the observations which may arise from several sources, including incomplete knowledge of the observation process, such as arises when using radar images, as well as the intrinsic measurement uncertainty due to the observation system.
This framework is completely general, since there have been no restrictions imposed on f, h, or In the standard Kalman filter f and h are no longer functions but linear operators (matrices) and and are assumed Gaussian, as is the initial distribution of the of the state Thus for all future times the state will remain Gaussian distributed.
In this work the aim is to track the pdf of the state given all the previous observations, denoted by
The observation process is written as the conditional probability of the observations at time t given our estimate of the state at time t and the observation model, h. This reflects the assumption that the observation process in often not completely understood and that it has errors associated with it. The evolution of the system is given by the conditional probability of the state at time t+1 given our estimate of the state at time t and the system model, f. In Section 2 these are shown for the precipitation model.
In the Bayesian interpretation the state evolution (forecast) step becomes:The state update (assimilation) step can then be written as:which can be seen as the application of Bayes theorem to the inverse problem of estimating the state given a set of observations, where the prior distribution, comes from the state evolution step. These equations are then applied sequentially, to propagate the pdf of the state forward in time and update it given some observations, as illustrated in Fig. 1.
This general framework is very flexible and can adapted to almost every modelling situation. In Section 2 the state space modelling framework is extended to the probabilistic quantitative precipitation forecast model. In Section 3 the prior models are discussed and it is shown how these can be used generatively, to simulate from the model, even when no data has been seen. Section 4 discusses the results of the application of the model to both simulated (where we know the true parameters) and real data and conclusions are drawn in Section 5.
Section snippets
Model framework
The state space for the precipitation model must consist of the key variables necessary to forecast precipitation over the 0–3 h range. At a minimum there must be a representation of the instantaneous precipitation rate field, which we will denote To account for advection, the state space must be extended to include a vector field to represent the continuous spatial behaviour of the advection of the precipitation, which we denote The model dynamics are very simple, the evolution of
Generative model
The model has been completely specified, although there remain some (hyper)parameters (the system noise (co)variances) which must be defined. Ideally these parameters would be estimated from data, using a method such as Kalman smoothing. Time constraints meant this was not possible for this prototype model, thus the values for these priors were set on the basis of expert judgement. This is consistent with the Bayesian framework adopted.
The system noise covariances were estimated using arguments
Results and discussion
The model was first tested on synthetic data generated from the model itself. This has the advantage that the true (generating) parameters are known and thus we can check the inference within the model is operating correctly. An example of a time series generated from the model, together with the fit to that data (the assimilated fields at the same time) is shown in Fig. 4. While this sort of testing cannot be used to infer the ability of the model to forecast real precipitation events, it does
Conclusions
This paper has described a fully probabilistic model for precipitation nowcasting. Wherever possible the assumptions which underlie the model have been explicitly stated, so that it is possible to appreciate the drawbacks and advantages of the model. A methodology is proposed for placing ad-hoc extrapolation based nowcasting models into a consistent probabilistic framework. There is much that could be criticised about the model, and the model deficiencies might be ranked in the following order:
- •
Acknowledgements
Thanks are due to Clive Pierce of the UK Met Office and other members of the Stochastic Precipitation Nowcasting Panel for their comments and discussion about this work. Thanks are also due to Chris Collier for his helpful comments on radar errors at the review stage. I would like to acknowledge the help of MSc student Emmanuel Batail, who assisted in getting the model running on real data, and Lehel Csato who was always happy to discuss mathematical problems.
References (25)
An overview of the National Weather Service's centralised statistical quantitative precipitation forecasts
Journal of Hydrology
(2000)The impact of wind drift on the utility of very high spatial resolution radar data over unrban areas
Physics and Chemistry of the Earth
(1999)Quantitative precipitation forecasting
Journal of Hydrology
(2000)Covariance propagation and updating in the context of real-time radar data assimilation by quantitative precipitation forecast models
Journal of Hydrology
(2000)Quantitative precipitation forecasting in the UK
Journal of Hydrology
(2000)- et al.
A large-sample investigation of statistical procedures for radar-based short-term quantitative precipitation forecasting
Journal of Hydrology
(2000) - et al.
Bayesian inference for wind field retrieval
Neurocomputing Letters
(2000) - et al.
Comparision of short-term rainfall prediction models for real-time flood forecasting
Journal of Hydrology
(2000) Neural Networks for Pattern Recognition
(1995)The Analysis of Time Series: An Introduction
(1996)
Developments in radar and remote-sensing methods for measuring and forecasting rain
Philosophical Transactions of the Royal Society of London A
Sequential Monte Carlo Methods in Practice
Cited by (13)
Data assimilation for precipitation nowcasting using Bayesian inference
2009, Advances in Water ResourcesCitation Excerpt :Section 3 described the general concepts of Bayesian data assimilation, and the need for some numerical method in order to evaluate the posterior distribution of the variable of interest. The initial model developed by [9] used a maximum a posteriori method to determine the optimal state x∗, and a Laplace approximation about x∗ to estimate the posterior uncertainty via the covariance matrix. The maximum a posteriori method consists of finding the value x∗ of x which maximises the posterior probability (or, as is usually done in practice, the negative logarithm of the posterior probability).
An empirical ensemble rainfall nowcasting model using multi-scaled analogues
2020, Natural HazardsCASA prediction system over dallas–fort worth urban network: Blending of nowcasting and high-resolution numerical weather prediction model
2020, Journal of Atmospheric and Oceanic TechnologyKalman filtering-based probabilistic nowcasting of object-oriented tracked convective storms
2015, Journal of Atmospheric and Oceanic Technology