Elsevier

Atmospheric Environment

Volume 37, Issue 32, October 2003, Pages 4539-4550

Extensive evaluation of neural network models for the prediction of NO2 and PM10 concentrations, compared with a deterministic modelling system and measurements in central Helsinki

https://doi.org/10.1016/S1352-2310(03)00583-1

Abstract

Five neural network (NN) models, a linear statistical model and a deterministic modelling system (DET) were evaluated for the prediction of urban NO2 and PM10 concentrations. The model evaluation work considered the sequential hourly concentration time series of NO2 and PM10, which were measured at two stations in central Helsinki, from 1996 to 1999. The models utilised selected traffic flow and pre-processed meteorological variables as input data. An imputed concentration dataset was also created, in which the missing values were replaced, in order to obtain a harmonised database that is well suited for the inter-comparison of models. Three statistical criteria were adopted: the index of agreement (IA), the squared correlation coefficient (R2) and the fractional bias. The results obtained with various non-linear NN models show a good agreement with the measured concentration data for NO2; for instance, the annual mean of the IA values and their standard deviations range from 0.86±0.02 to 0.91±0.01. In the case of NO2, the non-linear NN models produce a range of model performance values that are slightly better than those of the DET. NN models generally perform better than the statistical linear model, for predicting both NO2 and PM10 concentrations. In the case of PM10, the model performance statistics of the NN models were not as good as those for NO2 over the entire range of models considered. However, the currently available NN models are applicable neither for predicting spatial concentration distributions in urban areas, nor for evaluating air pollution abatement scenarios for future years.
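As a rough illustration of the three criteria named in the abstract, the sketch below computes IA, R2 and the fractional bias for a pair of observed and predicted series. The formulas are assumptions, since the abstract names the criteria without defining them: IA follows Willmott's commonly used definition, R2 is the squared Pearson correlation, and the fractional bias is taken as the difference of the means divided by their average, so that positive values indicate over-prediction. The function names and the sample values are hypothetical.

```python
def _mean(values):
    return sum(values) / len(values)


def index_of_agreement(observed, predicted):
    """Willmott-style index of agreement (assumed definition); 1.0 is a perfect match."""
    o_mean = _mean(observed)
    squared_error = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    potential_error = sum(
        (abs(p - o_mean) + abs(o - o_mean)) ** 2
        for o, p in zip(observed, predicted)
    )
    # Undefined (division by zero) if every value equals the observed mean.
    return 1.0 - squared_error / potential_error


def squared_correlation(observed, predicted):
    """Squared Pearson correlation coefficient between the two series."""
    o_mean, p_mean = _mean(observed), _mean(predicted)
    cov = sum((o - o_mean) * (p - p_mean) for o, p in zip(observed, predicted))
    var_o = sum((o - o_mean) ** 2 for o in observed)
    var_p = sum((p - p_mean) ** 2 for p in predicted)
    return (cov * cov) / (var_o * var_p)


def fractional_bias(observed, predicted):
    """Assumed sign convention: positive when the model over-predicts on average."""
    o_mean, p_mean = _mean(observed), _mean(predicted)
    return 2.0 * (p_mean - o_mean) / (p_mean + o_mean)


# Hypothetical hourly concentrations in ug/m3, for illustration only.
observed = [20.0, 35.0, 50.0, 40.0, 25.0]
predicted = [22.0, 30.0, 55.0, 41.0, 27.0]

print("IA:", index_of_agreement(observed, predicted))
print("R2:", squared_correlation(observed, predicted))
print("FB:", fractional_bias(observed, predicted))
```

IA and R2 both reward models that track the temporal variation of the measurements, whereas the fractional bias only reports systematic over- or under-prediction, which is why the three are usually read together.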

Introduction

In recent years, statistical models, including those based on artificial neural networks (NN), have been increasingly applied and evaluated for the regression analysis and forecasting of air quality. In their overview of applications of NN in the atmospheric sciences, Gardner and Dorling (1998) concluded that NN generally give results that are as good as or better than those of linear statistical methods, especially where the problem being analysed includes non-linear behaviour. The NN methods can also be used in combination with traditional deterministic modelling techniques.

Gardner and Dorling (1999) tested the benefits of using a multi-layer perceptron (MLP) NN approach to model NO2 concentrations in London, relative to other statistical modelling approaches. They found that the temporal variation of emissions could be represented by using the input variables of time of day and day of week. In addition, simple meteorological input variables were used, providing some indication of atmospheric stability, without the need for processing of the measured meteorological data. The MLP models consistently outperformed a linear regression approach.
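One possible way to supply time of day and day of week to a regression model is sketched below. The text above does not say how these inputs were encoded, so the sine and cosine encoding used here is an assumption, chosen because it keeps hour 23 close to hour 0 and Sunday close to Monday. The function name time_features is hypothetical.

```python
import math
from datetime import datetime


def time_features(timestamp):
    """Encode hour of day and day of week as points on two unit circles."""
    hour_angle = 2.0 * math.pi * timestamp.hour / 24.0
    # datetime.weekday() returns 0 for Monday through 6 for Sunday.
    day_angle = 2.0 * math.pi * timestamp.weekday() / 7.0
    return [
        math.sin(hour_angle),
        math.cos(hour_angle),
        math.sin(day_angle),
        math.cos(day_angle),
    ]


# Example: a Monday at 08:00, a typical morning traffic peak.
print(time_features(datetime(2003, 10, 6, 8, 0)))
```

With a plain integer hour, a model would see 23:00 and 00:00 as far apart; the circular encoding avoids that artificial jump at midnight and at the week boundary.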

Kolehmainen et al. (2001) evaluated various computational models using hourly concentration time series of NO2 and basic meteorological variables collected for the city of Stockholm in 1994–1998. They concluded that the MLP NN yielded more accurate regression analysis and forecasting of air quality, compared with the results obtained using the self-organising map (SOM) or a linear time series method. Gardner and Dorling (1999) obtained a similar result using a different set of data.

The review by Gardner and Dorling (1998) does not include any applications of NN to the modelling of particulate matter. Perez et al. (2000), however, compared the forecasting of air quality for fine particulate matter produced by three different methods: a multi-layer NN, linear regression and persistence (the last of these sets each hourly value for the following day equal to the corresponding value for the current day). The three methods were applied to the hourly averaged PM2.5 data for the years 1994–1995 measured at one location in the downtown area of Santiago, Chile. The NN gave the best results overall in the forecasting of the hourly concentrations of PM2.5.
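The persistence method described above can be sketched in a few lines: the forecast for each hour is simply the observation made 24 hours earlier. This is an illustrative baseline only, not the code used by Perez et al. (2000); the function names and the synthetic series are hypothetical, and the mean absolute error is used here only as a convenient score.

```python
def persistence_forecast(series, lag=24):
    """Forecast for hour t is the observation at hour t - lag.

    The first `lag` hours have no earlier observation, so they are None.
    """
    return [series[t - lag] if t >= lag else None for t in range(len(series))]


def mean_absolute_error(observed, forecast):
    """Average absolute error over the hours that have a forecast."""
    pairs = [(o, f) for o, f in zip(observed, forecast) if f is not None]
    return sum(abs(o - f) for o, f in pairs) / len(pairs)


# Two days of synthetic hourly values; the second day is 5 units higher.
day_one = [10.0 + hour for hour in range(24)]
day_two = [value + 5.0 for value in day_one]
hourly = day_one + day_two

forecast = persistence_forecast(hourly)
print(mean_absolute_error(hourly, forecast))
```

Any statistical or deterministic model is expected to beat this baseline; if it does not, the additional inputs are adding little beyond the diurnal cycle already present in the measurements.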

Gardner (1999) undertook a model inter-comparison using linear regression, MLP and classification and regression tree (CART) approaches for hourly PM10 modelling in Christchurch, New Zealand, for the period 1989–1992. The MLP method outperformed CART and linear regression across the range of performance measures employed. The most important predictor variables in the MLP approach were time of day, temperature, vertical temperature gradient and wind speed.

The study presented here is part of an EU-funded research project “Air Pollution Episodes: Modelling Tools for Improved Smog Management-APPETISE” (2000–2002; http://www.uea.ac.uk/env/appetise/), reviewed by Greig et al. (2000). The APPETISE project has represented the first concerted attempt to undertake a model inter-comparison exercise between advanced statistical and present day deterministic air quality modelling approaches. The final aim of the project was to produce recommendations on the suitability of various models, or various classes of models, for specific applications. The APPETISE project concentrated on four pollutants: nitrogen oxides, particulate matter, tropospheric ozone and sulphur dioxide.

Schlink et al. (2003) have performed a model inter-comparison exercise within the APPETISE project for the statistical regression of tropospheric ozone, using 14 different statistical modelling techniques and a deterministic Lagrangian trajectory model including chemistry. Ten measurement sites were selected, located in Germany, Italy, United Kingdom and the Czech Republic. The authors recommended those methods that are able to model static non-linearities; these include NN and generalised additive models. The best predictions were obtained for multi-variate approaches using observed meteorological data.

This paper focuses on the model evaluation and inter-comparison for NO2 and PM10 concentrations in urban areas. We address only the “now-casting” of air quality; air quality forecasting using numerical weather forecasting models is outside the scope of the study. The main motivation of this study was practical: to compare the numerical performance of various statistical methods and the deterministic modelling system (DET). Both kinds of methods are currently used in regulatory air pollution forecasting by local authorities worldwide. However, there have been no rigorous inter-comparisons in the previous scientific literature of the methods based on artificial NN and the deterministic modelling methodologies for this purpose.

The term “prediction” is used in this paper to mean establishing the relationship between observed independent variables (predictors, such as meteorological variables) and an observed dependent variable (predictand; in this case concentration). When the predictors are forecast by some method, we can “forecast” or “predict” the predictand. Computations using statistical methods could be interpreted as fitting the data with predictors.

Section snippets

Emission inventory and deterministic models

We have used an emission inventory of NOx in the Helsinki Metropolitan Area (Karppinen et al., 2000a) that has been updated to include data for the years 1996 and 1997. The inventory includes the emissions from various mobile sources (road traffic, harbours and marine traffic, and aviation) and stationary sources (power plants, other point sources and residential heating). The vehicular emission factors are based on the LIISA modelling system (Mäkelä et al., 1996; Laurikko, 1998). The

Compliance of measured concentrations with air quality guidelines and limit values

The national guidelines are defined on a monthly basis, as the 99th percentile of the hourly values for NO2 and the second highest daily mean value, both for NO2 and PM10. The national guideline value for both pollutants is 70 μg/m3. During the period 1996–2001 at the stations of Töölö and Vallila, there have been nine exceedences of the guideline threshold levels both for NO2 and PM10 (during 72 months). For NO2, most of these have occurred in winter, early spring or late summer, and for PM10,

Conclusions

This investigation presents the most extensive evaluation of NN models currently available for the prediction of urban NO2 and PM10 concentrations, both regarding the variety of models evaluated, and the amount of experimental data included. Besides non-linear NN models, the evaluation also included a linear statistical model (LIN) and a DET. Previous studies have not rigorously analysed the performance of NN models relative to that of DETs. The aim was to produce information on the suitability of

Acknowledgements

We wish to acknowledge the financial support of the European Commission for the APPETISE and FUMAPEX projects, and that of the Academy of Finland for the FORECAST project. We also wish to thank the Helsinki Metropolitan Area Council (YTV) for help in the utilisation of the air quality monitoring data, and W. Burrows, U. Schlink and G. Nunnari for their valuable comments.
