A neural network forecast for daily average PM10 concentrations in Belgium
Introduction
The adverse effects of airborne ambient particulate matter (PM) have become a well-recognised problem in environmental sciences. Besides the reduction of visibility and the deposition of trace elements, the direct impact on human health via inhalation is an important issue. Several studies have found a significant relation between health effects and elevated concentrations of atmospheric PM10 or PM2.5 (PM with an aerodynamic diameter below 10 or 2.5 μm), e.g. Dockery et al. (1993) and Pope et al. (1995, 2002). Although the health impact is most pronounced for PM2.5 and long-term exposure, an increased PM10 concentration has been found to result in increased mortality on the following day (e.g. Samet et al., 2000).
Furthermore, PM has been an important European policy topic for several years. In order to reduce the health effects of PM10, the EU issued Council Directive 1999/30/EC on 22 April 1999 (European Community, 1999). It sets limits on the yearly and 24-h averaged PM10 concentrations for 2005 and 2010.
This paper concerns the ground-level atmospheric PM10 concentrations in the central West European country of Belgium. These have been measured since 1996 in the telemetric air quality networks of the three Belgian regions. Currently 41 PM-monitoring sites are operated, using both β-attenuation instruments and tapered element oscillating microbalances. In the Brussels-Capital Region, the concentration level for informing the public in case of increasing exposure to PM10 is set at a daily average of 50 μg m−3. The concentration level at which the public is to be alerted is set at 100 μg m−3. In case of a foreseen exceedance, a special warning bulletin is to be issued by the Belgian Interregional Cell for the Environment (IRCEL-CELINE). One of the tools currently used for this forecast is the model described in this paper, which is based on a neural network methodology.
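The two Brussels-Capital Region levels mentioned above can be written as a small decision rule. The following is only a sketch: the function name, the level labels and the choice of a strict "greater than" comparison for an exceedance are assumptions made for illustration, not part of the operational procedure.

```python
# Thresholds quoted in the text, in micrograms per cubic metre (daily average).
INFO_THRESHOLD = 50.0
ALARM_THRESHOLD = 100.0


def warning_level(forecast_daily_mean):
    """Map a forecast daily average PM10 value to a hypothetical warning label.

    Assumption: a threshold counts as exceeded only when the forecast is
    strictly greater than it.
    """
    if forecast_daily_mean > ALARM_THRESHOLD:
        return "alarm"
    if forecast_daily_mean > INFO_THRESHOLD:
        return "inform"
    return "none"


if __name__ == "__main__":
    for value in (42.0, 75.0, 120.0):
        print(value, warning_level(value))
```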
(Artificial) neural networks (NN) form a group of machine learning techniques inspired by biological neurons. Their history goes back more than 50 years, but with the availability of modern computers from the 1980s onwards they have grown into a competitive tool that has been applied widely since the mid 1990s. One of the reasons for their success is their capability to make regressive approximations of non-linear functions in high-dimensional spaces, something that is missing in classical statistics. The flexibility of NN has led to their use in many scientific branches. An overview of some applications in the atmospheric sciences during the 1990s can be found in Gardner and Dorling (1998). This article reviews the successful work of different authors on NN forecasting of air pollutants like ozone, sulphur dioxide and carbon monoxide. The main advantages of a NN forecasting tool, compared to deterministic atmospheric modelling systems, are the limited need for input data and computer power (in operational mode; training can of course be computer intensive). Compared to traditional statistical techniques, a NN excels by its flexibility. The main drawbacks are that a NN trained on data from a given measuring location can only forecast for that specific location and that it cannot give insight into the physics behind the data: a NN merely learns from examples and is not suited to generalise to other situations.
Recently, several researchers have started to use NN techniques to forecast airborne PM concentrations, e.g. Perez and Reyes (2002), Lu et al. (2003), Kukkonen et al. (2003) and Ordieres et al. (2005). They conclude that a NN can be a useful tool to predict PM, although the accuracy they could reach is limited (e.g. lower than that for NO2: Lu et al., 2003; Kukkonen et al., 2003). None of these authors yet refers to the use of such a model in an operational PM forecasting system.
In this paper, we describe the design of a NN forecasting tool for the ambient PM10 concentrations in Belgium. In the following section we first state our objectives and describe the available resources. In Section 3 the methodology of our research is outlined, and in Section 4 the results are analysed. In the final section we summarise and state our conclusions.
Section snippets
Objectives and resources
To state our objectives clearly we first define some abbreviations that will be used:
day0: the day on which the forecast is made
dayN: days relative to day0 (N=…,−1,0,1,…)
〈…〉dayN: daily average of a quantity on dayN.
〈…〉dayN,1−9h: average of first 9 h of dayN.
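The notation above can be made concrete with a short sketch that computes 〈…〉dayN and 〈…〉dayN,1−9h from hourly values. The data layout (a list of timestamp/value pairs), the function names and the reading of "first 9 h" as the hours starting at 00:00 to 08:00 are assumptions for illustration; the values are synthetic.

```python
from datetime import date, datetime, timedelta
from statistics import mean


def daily_average(hourly, day):
    """Mean of all hourly values whose timestamp falls on `day`."""
    return mean(v for t, v in hourly if t.date() == day)


def first_9h_average(hourly, day):
    """Mean of the hourly values of `day` with start hour 0..8 (assumed reading)."""
    return mean(v for t, v in hourly if t.date() == day and t.hour < 9)


def day_n(day0, n):
    """Calendar date of dayN, counted relative to day0 (n may be negative)."""
    return day0 + timedelta(days=n)


# Synthetic hourly series for one day: the value equals the hour index.
day0 = date(2005, 1, 10)
hourly = [(datetime(2005, 1, 10) + timedelta(hours=h), float(h)) for h in range(24)]

if __name__ == "__main__":
    print(daily_average(hourly, day0))     # mean of 0..23
    print(first_9h_average(hourly, day0))  # mean of 0..8
    print(day_n(day0, 1))                  # day1
```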
The goal is to develop a forecasting model for the daily average PM10: at noon on day0 the model should predict the ground-level values 〈PM10〉dayN for N = 0, 1, 2 that will be measured at the different monitoring sites. The emphasis is on
Neural network approach
The problem we are faced with can be called a regression problem. On the basis of a set of known input variables (Eq. (1)) we have to produce an output variable that is, on average, a good estimate of the target 〈PM10〉day1. This implies the design of a model that can fit the relation between the input and target parameters on the basis of a historical dataset. Since the input space is multidimensional and the functional relation with the target is a priori unknown and most likely non-linear,
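As a rough illustration of this kind of regression setting, the sketch below fits a one-hidden-layer network to a non-linear target by stochastic gradient descent. Everything in it is an assumption made for illustration: the synthetic two-input data, the four tanh hidden units, the learning rate and the number of epochs. It is not the architecture, input set or training procedure of the model described in the paper.

```python
import math
import random


def init_network(n_inputs, n_hidden, rng):
    """Small random weights; the last element of each weight list is a bias."""
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
          for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return w1, w2


def forward(x, w1, w2):
    """Return (hidden activations, scalar output) for one input vector."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + row[-1])
              for row in w1]
    output = sum(w * h for w, h in zip(w2, hidden)) + w2[-1]
    return hidden, output


def mse(data, w1, w2):
    """Mean squared error of the network over a list of (inputs, target)."""
    return sum((forward(x, w1, w2)[1] - t) ** 2 for x, t in data) / len(data)


def train(data, w1, w2, rate=0.05, epochs=200):
    """Stochastic gradient descent on the squared error, one sample at a time."""
    for _ in range(epochs):
        for x, t in data:
            hidden, y = forward(x, w1, w2)
            err = y - t  # derivative of 0.5 * (y - t)^2 with respect to y
            for j, h in enumerate(hidden):
                # Back-propagate through tanh: d tanh(a)/da = 1 - tanh(a)^2.
                delta = err * w2[j] * (1.0 - h * h)
                for i, xi in enumerate(x):
                    w1[j][i] -= rate * delta * xi
                w1[j][-1] -= rate * delta
                w2[j] -= rate * err * h
            w2[-1] -= rate * err


# Synthetic stand-in for a historical dataset: two inputs, non-linear target.
rng = random.Random(0)
inputs = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(50)]
data = [(x, math.sin(3 * x[0]) + 0.5 * x[1]) for x in inputs]

w1, w2 = init_network(n_inputs=2, n_hidden=4, rng=rng)
error_before = mse(data, w1, w2)
train(data, w1, w2)
error_after = mse(data, w1, w2)

if __name__ == "__main__":
    print("MSE before training:", error_before)
    print("MSE after training: ", error_after)
```

In practice the error would be judged on data held out from training rather than on the training set itself; the sketch only shows that the fit improves.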
Quality indicators
Before we construct and compare different models, we have to specify how the forecast errors will be quantified. In our case the aim is twofold. In the first place we want an accurate prediction for the whole range of observable 〈PM10〉day1 concentrations. Hence, as a first quality indicator we use the root-mean-square error RMSE, which gives a global and absolute error in units of μg m−3. It is also sensible to weight this error by the standard deviation of the observed values. This is done
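The first quality indicator can be sketched as follows. The RMSE is standard; the second function divides it by the standard deviation of the observed values, which is one plausible reading of "weighting" the error, since the exact definition is cut off in the text above. The function names, the use of the population standard deviation and the sample values are assumptions.

```python
import math


def rmse(observed, forecast):
    """Root-mean-square error, in the units of the data (e.g. ug m-3)."""
    if not observed or len(observed) != len(forecast):
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(f - o) ** 2 for o, f in zip(observed, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_over_std(observed, forecast):
    """RMSE divided by the population standard deviation of the observations.

    Assumption: this is how the error is weighted by the spread of the
    observed values; the result is dimensionless.
    """
    mean_obs = sum(observed) / len(observed)
    variance = sum((o - mean_obs) ** 2 for o in observed) / len(observed)
    return rmse(observed, forecast) / math.sqrt(variance)


if __name__ == "__main__":
    obs = [10.0, 20.0, 30.0, 40.0]    # made-up observed daily averages
    pred = [12.0, 18.0, 33.0, 39.0]   # made-up forecasts
    print(rmse(obs, pred), rmse_over_std(obs, pred))
```

A ratio well below 1 means the forecast error is small compared with the natural variability of the observations, while a value near 1 is no better than always predicting the observed mean.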
The PM10 phenomenon in Belgium
Now that we have selected and compared the relevant features for the prediction of ambient PM10, we can try to use this knowledge to further analyse the PM10 behaviour in Belgium. We have to be careful, however. When a certain input parameter increases the accuracy of the neural network forecast, it is right to consider this a relevant feature (even though this does not imply any causal relation), but the opposite is not always justified. When a new input does not increase the forecast
References (16)
- Gardner and Dorling (1998). Artificial neural networks (the multilayer perceptron) - a review of applications in the atmospheric sciences. Atmospheric Environment.
- Kukkonen et al. (2003). Extensive evaluation of neural network models for the prediction of NO2 and PM10 concentrations, compared with a deterministic modelling system and measurements in central Helsinki. Atmospheric Environment.
- Ordieres et al. (2005). Neural network prediction model for fine particulate matter (PM2.5) on the US–Mexico border in El Paso (Texas) and Ciudad Juárez (Chihuahua). Environmental Modelling & Software.
- Perez and Reyes (2002). Prediction of maximum of 24-h average of PM10 concentrations 30 h in advance in Santiago, Chile. Atmospheric Environment.
- Bishop (1995). Neural Networks for Pattern Recognition.
- Dockery et al. (1993). An association between air pollution and mortality in six US cities. The New England Journal of Medicine.
- European Community (1999). Council Directive 1999/30/EC of 22 April 1999 relating to limit values for sulphur dioxide,...
- et al.