Deep learning-based downscaling of tropospheric nitrogen dioxide using ground-level and satellite observations
Introduction
Air pollution remains a widespread problem despite government efforts since the 1970s to reduce it. Urban areas are especially at risk for poor air quality and the associated health effects. According to the American Lung Association, approximately 45.8% of Americans live in counties with unhealthy air (American Lung Association, 2020). Timely and precise production and dissemination of air quality information (e.g., PM2.5, PM10, and ozone) at high spatiotemporal resolution would help urban residents make daily activity decisions that protect their health. Nitrogen oxides are a category of gases regulated by the United States Environmental Protection Agency (US EPA), and nitrogen dioxide (NO2) is among the most important of them. NO2 is mainly produced by the combustion of fossil fuels. By far, the leading contributors to NO2 emissions are power plants, cars and trucks, and non-road equipment. Breathing in high levels of NO2 can lead to respiratory problems: NO2 irritates the lining of the lungs, causing coughing and wheezing, and impairs the body's ability to defend against pulmonary infections. Due to its high sensitivity, NO2 is also an essential indicator of industrial production and can be utilized in the assessment of economic conditions (Duncan et al., 2016).
Reliable and comprehensive NO2 emission estimates are needed to evaluate air quality mitigation strategies, to estimate industrial production, and as input to models for simulating and forecasting air pollution. Ground-level NO2 is routinely measured at stations equipped with air quality sensors, such as those of AirNOW, Purple Air, and IQAir. However, these discrete observations provide no information at locations without a station. Satellite instruments, including the Global Ozone Monitoring Experiment (GOME), the Ozone Monitoring Instrument (OMI), and the TROPOspheric Monitoring Instrument (TROPOMI), retrieve atmospheric trace gas concentrations using spectroscopy. The NO2 column density is determined by measuring backscattered light, and the tropospheric and stratospheric NO2 columns are separated using a data assimilation system (Veefkind et al., 2012). The advantage of satellite NO2 observation is that it provides a comprehensive view of the spatial distribution of global emissions. However, TROPOMI's daily overpass limits the benefit of satellite NO2 observation in the temporal dimension, whereas NO2 values show high daily variability (Blond et al., 2007). Accurate emission estimates are therefore still needed at the sub-urban scale on an hourly basis. Furthermore, high-resolution satellite observations of NO2 column densities have relatively short historical records; TROPOMI, for example, has only been available since 2018. Climatological analysis is usually done at lower spatial resolutions using sensors such as OMI, which has a resolution of 0.25 degree (Liu et al., 2018). A reliable method for estimating NO2 emissions from a dataset with a longer availability period is crucial for environmental analysis.
High-resolution NO2 emission forecasts can be produced by numerical simulations, such as the Community Multiscale Air Quality Modeling System (CMAQ, Uno et al., 2007) and the Weather Research and Forecasting (WRF) model coupled with Chemistry (WRF-CHEM, Ghude et al., 2013). Although the simulated NO2 emissions agree well with satellite observations, high-resolution numerical simulations require time- and memory-consuming computations (Fuhrer et al., 2018). In addition, high-resolution numerical weather prediction (NWP) data might not be available to all public users (Baklanov et al., 2002). Spatiotemporal downscaling based on heterogeneous observations offers an alternative approach in which data sources with different spatiotemporal resolutions complement one another. Existing downscaling methods include dynamical downscaling and statistical downscaling. Dynamical downscaling runs high-resolution physical local-area models driven by low-resolution boundary conditions; however, it is computationally demanding (Hong et al., 2017; Yahya et al., 2017; Wang et al., 2020). Statistical downscaling trains linear or nonlinear statistical models to estimate high-resolution information, but the downscaled variable is generally the same as the low-resolution original (Zhu et al., 2016; Ahmed et al., 2018; Oteros et al., 2019; Khan et al., 2019). In addition, most existing downscaling applications for climate and meteorological data are based on structured grids, while few have explored unstructured inputs, such as generating high-resolution information from observations at discrete weather stations.
To fill the aforementioned gaps and produce a high-spatiotemporal-resolution NO2 tropospheric column density product, this research proposes and compares two deep learning methods that learn the relationship between the ground-level NO2 observations from AirNOW and the tropospheric NO2 column density from TROPOMI. The input predictors include the locations of AirNOW stations, AirNOW NO2 observations, boundary layer height, other meteorological variables, elevation, major roads, and power plants. The learned relationship can be used to produce NO2 emission estimates at the sub-urban scale on an hourly basis. The two methods are 1) an integration of inverse distance weighting with a feedforward neural network (IDW + DNN), and 2) a deep matrix network (DMN) that maps the discrete AirNOW observations directly to the distribution of TROPOMI observations. We compared the accuracy of both models in estimating tropospheric NO2 in the greater Los Angeles area, analyzed the feature importance of the input predictors, and examined the spatial distribution of prediction errors. The proposed methods and results can also be used for long-term climatic and environmental analysis at high spatiotemporal resolution by supplying historical records of the model predictors.
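As an illustration of the interpolation stage of the IDW + DNN method, the following minimal sketch spreads discrete station values onto target grid points using inverse distance weighting. It is not the authors' implementation: the function name idw_interpolate, the power parameter of 2, and the use of planar coordinates are all assumptions made for the example, and the neural network stage is not shown.

```python
import math


def idw_interpolate(stations, targets, power=2.0, eps=1e-12):
    """Estimate a value at each target point from station observations.

    stations: list of (x, y, value) tuples in planar coordinates (assumed).
    targets:  list of (x, y) tuples, e.g. grid cell centers.
    power:    distance exponent; 2.0 is a common default, not a value
              taken from the paper.
    """
    estimates = []
    for tx, ty in targets:
        weighted_sum = 0.0
        weight_total = 0.0
        exact = None
        for sx, sy, value in stations:
            distance = math.hypot(tx - sx, ty - sy)
            if distance < eps:
                # Target coincides with a station: use its value directly.
                exact = value
                break
            weight = 1.0 / distance ** power
            weighted_sum += weight * value
            weight_total += weight
        if exact is not None:
            estimates.append(exact)
        else:
            estimates.append(weighted_sum / weight_total)
    return estimates


# Hypothetical stations and grid points, for illustration only.
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
grid_points = [(1.0, 0.0), (0.5, 0.0), (0.0, 0.0)]
interpolated = idw_interpolate(stations, grid_points)
```

In a pipeline of this kind, the interpolated field would presumably be stacked with the other predictors (boundary layer height, meteorology, elevation, and so on) before being passed to the network; that arrangement is an assumption here.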
Spatial interpolation of airborne pollutants
Spatial interpolation is one of the most widely used methods to estimate the air pollution distribution where no observed measurements are available. A variety of spatial interpolation methods utilize nonlocal geometric similarities to construct high-resolution images (Zhu et al., 2016), analyze spatiotemporal variograms to conduct spatiotemporal kriging-based interpolation (Ahmed et al., 2018; Oteros et al., 2019), or examine the adjacent slope to perform the interpolation (Khan et al.,
Data
For model training and evaluation, the predictors are NO2 observed by ground-level stations, station location (longitude, latitude), boundary layer height, surface pressure, surface net solar radiation, 2 m temperature, and 10 m U and V wind components; the predictand is the 5 km tropospheric NO2. Specifically, the ground-level NO2 observations are from EPA AirNOW, the surface meteorological variables are from ERA-Interim, and the tropospheric NO2 is from TROPOMI's daily overpass from May 2018
Method
Fig. 2 illustrates the workflow of our downscaling methods. Before training, the TROPOMI tropospheric NO2 is preprocessed into regular 5 km grids (Section 4.1). In addition, a preliminary statistical analysis is conducted to examine the correlation between TROPOMI tropospheric NO2 and AirNOW NO2 based on spatiotemporal collocation of the two observations (Section 4.2). The two models that we developed and compared are introduced in Sections 4.3 (IDW + DNN) and 4.4 (Deep Matrix Networks, DMN). And
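The two preparatory steps named above, regridding and a collocation-based correlation check, can be sketched as follows. The actual procedures of Sections 4.1 and 4.2 are not visible in this excerpt, so simple cell averaging and the Pearson coefficient are assumptions chosen for illustration, as are the function names grid_mean and pearson_r and the planar coordinate system.

```python
import math
from collections import defaultdict


def grid_mean(pixels, x0, y0, cell):
    """Average irregular pixel values that fall into each regular grid cell.

    pixels: iterable of (x, y, value) in the same planar units as `cell`
            (e.g. kilometers, with cell=5 for a 5 km grid).
    x0, y0: lower-left corner of the grid.
    Returns a dict mapping (column, row) to the mean value in that cell.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y, value in pixels:
        key = (math.floor((x - x0) / cell), math.floor((y - y0) / cell))
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def pearson_r(a, b):
    """Pearson correlation between two equally long, collocated series."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return covariance / math.sqrt(var_a * var_b)


# Hypothetical satellite pixels (x km, y km, column density).
pixels = [(1.0, 1.0, 2.0), (3.0, 4.0, 4.0), (6.0, 1.0, 10.0)]
gridded = grid_mean(pixels, x0=0.0, y0=0.0, cell=5.0)

# Hypothetical collocated pairs: station value vs. gridded satellite value.
r = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Here collocation is reduced to pairing each station reading with the grid cell that contains it at the overpass time; any temporal matching window would be an additional assumption.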
Experiments
To compare different model architectures with respect to downscaling performance, we consider sample-wise deviations between target predictands and model predictions and investigate the extent to which the predictions depend on particular predictors. To examine the importance of different types of predictors, the models are trained with four different predictor configurations, including the station-based NO2 and location only; providing boundary layer height predictor; providing more
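One way to quantify sample-wise deviations is the root mean square error (RMSE), which the Discussion reports for these models. The sketch below computes one RMSE per sample and then averages them; the function names and the flattening of each grid into a list of values are assumptions for illustration, not the authors' evaluation code.

```python
import math


def rmse(targets, predictions):
    """Root mean square error between two equally long value sequences."""
    if len(targets) != len(predictions) or not targets:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((t - p) ** 2 for t, p in zip(targets, predictions))
    return math.sqrt(squared_error / len(targets))


def samplewise_rmse(target_grids, predicted_grids):
    """RMSE for each sample, where a sample is one flattened grid."""
    return [rmse(t, p) for t, p in zip(target_grids, predicted_grids)]


# Hypothetical flattened grids for two samples (e.g. two overpass days).
target_grids = [[1.0, 2.0, 3.0], [0.0, 0.0]]
predicted_grids = [[1.0, 2.0, 3.0], [3.0, 4.0]]
per_sample = samplewise_rmse(target_grids, predicted_grids)
average_rmse = sum(per_sample) / len(per_sample)
```

Repeating this evaluation for each predictor configuration would allow the configurations to be compared on the same metric.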
Discussion
The meteorological variables used in this study are from ERA-Interim with 0.125° spatial resolution. ERA-Interim is only available until August 31, 2019 and has been superseded by the ERA5 reanalysis. Currently, ERA5 has a resolution of 0.25°, and no higher-resolution re-gridded datasets are available. We explored replacing ERA-Interim with ERA5 over the same temporal range to test the model performance, and the average RMSE increased from 1–2 to 3–4 (unit: 10^15 molec. cm−2). The decreased
Conclusions
In this study, we proposed, compared, and evaluated two deep learning methods for downscaling discrete ground-level NO2 observations to estimate tropospheric NO2 column density. The two methods are 1) an integration of inverse distance weighting with a feedforward neural network (IDW + DNN), and 2) a deep matrix network (DMN) that maps the discrete AirNOW observations directly to the distribution of TROPOMI observations. We investigated the network performance using the
CRediT authorship contribution statement
Manzhu Yu: Data curation, Conceptualization, Methodology, Code implementation, Experiment, Result analysis, Paper Writing. Qian Liu: Conceptualization, Result analysis, Paper review and editing.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
The authors would like to thank the anonymous reviewers for their insightful comments. The authors also thank Jeremy Diaz and Taylor Blackman for recommending the inclusion of land use and road networks among the predictors.
References
- et al. Spatiotemporal interpolation of air pollutants in the Greater Cairo and the Delta, Egypt. Environ. Res. (2018)
- et al. Spatiotemporal imputation of MAIAC AOD using deep learning with downscaling. Remote Sens. Environ. (2020)
- et al. Improve ground-level PM2.5 concentration mapping using a random forests-based geostatistical approach. Environ. Pollut. (2018)
- et al. Low-cost NO2 Monitoring and Predictions of Urban Exposure Using Universal Kriging and Land-Use Regression Modelling in Mysore, India. (2020)
- et al. Spatial interpolation of current airborne pollen concentrations where no monitoring exists. Atmos. Environ. (2019)
- et al. TROPOMI on the ESA Sentinel-5 Precursor: a GMES mission for global observations of the atmospheric composition for climate, air quality and ozone layer applications. Remote Sens. Environ. (2012)
- et al. Full-coverage high-resolution daily PM2.5 estimation using MAIAC AOD in the Yangtze River Delta of China. Remote Sens. Environ. (2017)
- et al. High-performance computing for the simulation of dust storms. Comput. Environ. Urban. Syst. (2010)
- et al. Decadal application of WRF/Chem for regional air quality and climate modeling over the US under the representative concentration pathways scenarios. Part 1: model evaluation and impact of downscaling. Atmos. Environ. (2017)
- et al. Tropospheric emissions: monitoring of pollution (TEMPO). J. Quant. Spectrosc. Radiat. Transf. (2017)
- Potential and shortcomings of numerical weather prediction models in providing meteorological data for urban air pollution forecasting. Water, Air and Soil Pollution: Focus
- Atmospheric conservation properties in ERA-Interim. Q. J. R. Meteorol. Soc.
- An improved tropospheric NO2 column retrieval algorithm for the ozone monitoring instrument. Atmospheric Measurement Techniques