1 Introduction

At the time of writing, much of the world remains in the grip of the Covid-19 pandemic. The full social and economic consequences of the pandemic and its restrictions on our day-to-day activities will be far-reaching and will take time to fully identify and quantify. The primary method of slowing the infection rate has been to impose strict social distancing regulations known as ‘lockdowns’ where people are restricted to their own homes and all but essential economic activity ceases. Interestingly, one consequence of these lockdowns that became apparent from an early stage was a perceived improvement in air quality. As many shops and businesses closed, industrial activity and vehicle use in cities fell dramatically and reports emerged of pollution levels being considerably below those experienced in normal conditions. These reports first emerged in China but have since appeared in many other countries (New York Times 2020; Guardian 2020; Independent 2020; Space 2020). Such improvements in air quality and the likely associated health benefits have raised the prospect of an unlikely silver lining to the otherwise overwhelmingly negative impacts of the pandemic.

These apparent improvements in air quality have, however, raised a number of questions. First, which pollutants have actually fallen? Most media reports refer only to a reduction in nitrogen dioxide (NO\(_{2}\)) with no discussion of changes in other pollutants. Second, what benchmark is being used to measure any reduction? If pollution levels are changing year-on-year, comparisons with previous years may be misleading. Similarly, concentrations of many pollutants are significantly influenced by local weather conditions, again making it difficult to compare emission levels with previous years or contemporaneously with other cities. Finally, what are the likely health benefits of any reductions in pollution? Using the example of the first Covid-19 lockdown, which took place in Wuhan, China, this paper addresses each of these questions in turn.

There are a number of reasons why it is important to understand how the response to the Covid-19 pandemic has affected pollution and pollution-related mortality. This pandemic and society’s response to it has been unprecedented in modern times and the social and economic impacts will be diverse and long-lasting. In order to hasten the recovery and to learn lessons for future pandemics it is vital that we understand every aspect of the economic, social and health impacts of Covid-19 and the policies used to tackle it. More specifically, it is important to understand how pollution (and health) responds to changes in social and economic activity. A city-level lockdown is an extreme case but, non-linearities aside, the pollution response to it informs us how different pollutants may respond to milder forms of restrictions on human activities such as congestion charging, pedestrianised zones and urban planning more generally. Furthermore, calculating the pollution and health benefits of the lockdown provides a unique opportunity to identify the costs incurred by society in going about its day-to-day business during “normal” times.

It is also important to understand how improved air quality as a result of the lockdown has lessened the strain on health services within cities such as Wuhan by reducing morbidity and mortality. Air pollution in China regularly exceeds World Health Organisation (WHO) guidelines and, in the absence of these pollution reductions, hospital admissions would almost certainly have been even higher given the clear links previously found between pollution, hospital admissions and mortality (e.g. Maddison 2005; Shang et al. 2013; Lagravinese et al. 2014; Cheung et al. 2020; Deryugina et al. 2019).Footnote 1 Relatedly, there have been reports that exposure to pollution may increase Covid-19 mortality, raising the possibility that Covid-19 death rates in cities such as Wuhan may have been even higher if the lockdown had not improved air quality (see for example Wu et al. 2020). Taken together, these latter two points suggest that the cleaner air resulting from the lockdown may have increased the ability of hospitals to accommodate Covid-19 patients and directly reduced the number of Covid-19 deaths.

While it would therefore be useful if we could identify the impact of the Wuhan lockdown on pollution and health, isolating the effect of a policy intervention on pollutant concentrations is challenging since such concentrations are jointly influenced by meteorological conditions and emission levels. The influence of wind speed, wind direction, and temperature on pollution concentrations will often be greater than the effect of any policy intervention thereby significantly complicating attempts to isolate policy effects (Anh et al. 1997). Traditional attempts to address the impacts of weather on pollution trends have been econometric in nature and tend to struggle with the fact that the effects of weather on observed pollution trends tend to be non-linear, subject to interaction effects and not independent of each other. Attention has therefore turned to the use of predictive machine learning methods as a means of more effectively capturing the influence of meteorological variables on pollution. Grange et al. (2018) for instance develop a weather normalisation technique based on the random forest machine learning model and remove the effect of weather conditions from Swiss PM10 concentrations data. They argue that this technique performs better than more traditional techniques and benefits from the fact that it does not need to conform to strict parametric assumptions. Grange and Carslaw (2019) use the same technique to examine NO\(_{2}\) and NOx concentrations in London to isolate the effect of the Central London congestion charge. They identify clear features in the pollution data that were not detectable prior to weather normalisation. Finally, Vu et al. (2019) utilise a similar random forest machine learning approach to weather normalise six key pollutants in Beijing from 2013 to 2017. Their analysis reveals the extent to which meteorological variables influence observed pollution data and allows them to identify the effect of the 2013 Beijing Clean Air Action plan.

The purpose of this paper therefore is to quantify the causal impact of the Wuhan lockdown on local air pollution levels. To do this we apply a combination of state-of-the-art machine learning and synthetic control methods to a number of different air pollutants in Wuhan, China. Wuhan is a city of approximately 11.1 million people and was the largest of the 17 cities in Hubei province to be locked down at 10:00 am on 23rd January 2020.Footnote 2 Other large cities in China did not lock down for at least another two weeks, providing us with a unique natural experiment to investigate how air pollution levels respond to an abrupt decrease in economic activity. Our contribution is three-fold. First, we apply the latest weather normalisation techniques developed by Grange et al. (2018) and Vu et al. (2019) to remove the effect of local weather conditions on pollution concentrations. To do so we utilise hourly city-level concentrations of four pollutants: sulphur dioxide (SO\(_{2}\)); nitrogen dioxide (NO\(_{2}\)); carbon monoxide (CO); and particulate matter (PM10) between January 2013 and February 2020, for thirty Chinese cities. Second, taking the weather normalised concentrations data, we apply the recently developed “ridge augmented synthetic control approach” (ridge ASCM) of Ben-Michael et al. (2019) to investigate the causal impact of the Wuhan lockdown on local air pollution levels. Ben-Michael et al. (2019) improve upon the standard synthetic control method by removing the bias that can result from imbalance in pre-treatment outcomes. Third, using a selection of mortality estimates from the existing literature we calculate the potential deaths prevented in Wuhan city, Hubei province and China as a whole, due to improved air quality.

To briefly summarise our results, we find that Wuhan experienced a significant reduction in concentrations of NO\(_{2}\) and PM10 as a result of the Covid-19 lockdown. Concentrations of NO\(_{2}\) fell by as much as 24 \(\upmu\)g/m\(^3\) during our analysis period in January/February 2020 (a reduction of 63% from the pre-lockdown level of 38 \(\upmu\)g/m\(^3\)), while PM10 concentrations fell by approximately 22 \(\upmu\)g/m\(^3\), albeit for a shorter period (a reduction of 35% from the pre-lockdown level of 62 \(\upmu\)g/m\(^3\)). It is notable that these reductions brought NO\(_{2}\) concentrations from a level very close to the WHO safe limit (40 \(\upmu\)g/m\(^3\)) to well within the limit, while PM10 fell from a level well beyond the safe limit (20 \(\upmu\)g/m\(^3\)) to a level that was still in excess of the safe limit. Perhaps surprisingly, we find no significant reductions in concentrations of SO\(_{2}\) or CO. Our analysis of the mortality effects associated with the reduced NO\(_{2}\) concentrations suggests that the lockdowns may have prevented up to 496 deaths in Wuhan, 3,368 in Hubei province and 10,822 in China as a whole.

The remainder of the paper proceeds as follows. Section 2 describes our data and methodology. Section 3 provides our results and Sect. 4 examines the health implications of our findings. Section 5 concludes.

2 Data and Methodology

City-level hourly concentrations of four pollutants (SO\(_{2}\), NO\(_{2}\), CO, PM10) for thirty Chinese cities were collected from ‘Qingyue Open Environmental Data Center’ between 18th January 2013 and 29th February 2020.Footnote 3 City-level hourly pollution concentrations were calculated by averaging across all of the monitoring stations for each city. Similar data from the same source has been used in a number of other studies including He et al. (2016) and Qin and Zhu (2018). The meteorological data is from the “worldmet” R package (NOAA 2016) developed by Carslaw (2017) and includes information on temperature, relative humidity, wind direction, wind speed and air pressure. We then match the city-level hourly pollution data with the city-level hourly meteorological data to generate our data sample.Footnote 4 Table 1 provides the sources and health impacts of each pollutant and shows they are produced by differing combinations of electricity generation, industrial processes and road traffic. Available evidence suggests that road traffic is likely to be the largest source of CO and NO\(_{2}\) concentrations in Chinese cities while electricity generation and coal burning will be the largest sources of SO\(_{2}\). Sources of PM10 are difficult to quantify but will include all of those previously mentioned (Zhao et al. 2013; US EPA 2020).
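The two data-preparation steps described above (averaging across monitoring stations, then matching with hourly weather) can be illustrated with a minimal sketch. The record layout, station names, variable names and numbers below are hypothetical, and the treatment of missing readings (simply skipping them) is an assumption rather than something stated in the paper.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical station-level records: (city, station, hour, NO2 in ug/m3).
station_obs = [
    ("Wuhan", "S1", "2020-01-20 09:00", 40.0),
    ("Wuhan", "S2", "2020-01-20 09:00", 44.0),
    ("Wuhan", "S1", "2020-01-20 10:00", 38.0),
    ("Wuhan", "S2", "2020-01-20 10:00", None),  # missing reading
]

# Hypothetical city-level hourly weather: (city, hour) -> variables.
weather = {
    ("Wuhan", "2020-01-20 09:00"): {"temp_c": 5.1, "wind_speed": 2.3},
    ("Wuhan", "2020-01-20 10:00"): {"temp_c": 6.0, "wind_speed": 2.9},
}


def city_hourly_mean(records):
    """Average all valid station readings for each (city, hour)."""
    grouped = defaultdict(list)
    for city, _station, hour, value in records:
        if value is not None:  # assumption: missing readings are skipped
            grouped[(city, hour)].append(value)
    return {key: mean(vals) for key, vals in grouped.items()}


def match_with_weather(pollution, met):
    """Keep only the (city, hour) keys present in both data sets."""
    return {
        key: {"no2": pollution[key], **met[key]}
        for key in pollution.keys() & met.keys()
    }


city_no2 = city_hourly_mean(station_obs)
sample = match_with_weather(city_no2, weather)
```

Matching on the intersection of keys means that hours lacking either a pollution or a weather observation drop out of the sample, which is one possible design choice among several.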

Table 1 Sources and health effects of our four pollutants.

This paper uses two steps to identify the causal impact of the Wuhan lockdown on local air pollution levels. The first step is to conduct a random forest-based weather normalisation on four pollutants separately for thirty Chinese cities to obtain both hourly-observed and weather normalised pollution concentrations. The second step is to aggregate the hourly weather normalised pollution numbers into daily values, and then to take these daily observations for the thirty cities and use a (ridge) augmented synthetic control method on this data to estimate how concentration levels in Wuhan have changed relative to the synthetic control. The weather normalisation procedure is conducted on the observed hourly pollution data before running the synthetic control method for two main reasons. First, for policy evaluation analysis within environmental economics, it is difficult to evaluate the efficacy of policy on air pollutant concentration levels since changes in pollutant concentration levels are jointly influenced by both meteorological conditions and emission levels. This means that it is difficult to clearly identify whether an improvement in air quality is due to a true fall in emissions or is simply a result of weather conditions that give the appearance of lower concentrations at the measurement stations (Grange et al. 2018; Grange and Carslaw 2019). As these studies have shown, the best way of accounting for local meteorological conditions is to remove their impact from the pollution concentration observations. Stripping out the local weather effects allows policy makers and social planners to make better informed decisions on the efficacy of previous air pollution interventions which, in turn, will help guide future policy decisions (Wise and Comrie 2005).
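As a small illustration of the aggregation in the second step, the sketch below collapses hourly weather normalised values into daily means. The timestamp format and the numbers are hypothetical, and a simple arithmetic mean over the available hours is assumed.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_means(hourly):
    """Collapse {timestamp string: value} into {date: mean of that day's hours}."""
    by_day = defaultdict(list)
    for stamp, value in hourly.items():
        day = datetime.strptime(stamp, "%Y-%m-%d %H:%M").date()
        by_day[day].append(value)
    # Sort by date so the daily series is in chronological order.
    return {day: mean(vals) for day, vals in sorted(by_day.items())}


# Hypothetical weather normalised NO2 values (ug/m3) for two days.
hourly_normalised = {
    "2020-01-20 00:00": 36.0,
    "2020-01-20 12:00": 40.0,
    "2020-01-21 00:00": 30.0,
    "2020-01-21 12:00": 34.0,
}
daily = daily_means(hourly_normalised)
```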

The second reason for weather normalising relates to the design of the synthetic control method itself. According to Abadie (2019), a key principle behind the synthetic control method is the comparative case study, where the impact of a policy intervention can be estimated by comparing the movement of the outcome variable of interest between a single treatment unit and a number of control units. Ideally, the control units should be as similar as possible to the single treatment unit but not exposed to the policy intervention. Abadie (2019) emphasises that, if the outcome variable is highly volatile, researchers will not be able to detect the effect of the policy intervention, no matter what the size of the ‘real’ intervention effect might be. In our case, daily air pollution concentration levels in Wuhan are extremely volatile, which would lead to potential problems of overfitting. If there exists substantial volatility in the outcome variable, Abadie (2019) advises that it is removed in both the treatment and control units prior to applying the synthetic control method. Therefore, weather noise is removed from the observed air pollution concentrations for all thirty cities using the machine learning algorithm. The result is a more reliable estimation of the Wuhan lockdown effect on local air pollution levels, i.e. the pure human activity induced pollution with the natural variability due to weather conditions removed.

2.1 Machine Learning

In recent years the use of machine learning (ML) techniques has grown rapidly due, in part, to the availability of ‘big data’ and improved computational power. Supervised ML focuses on prediction problems, given a set of data that contains the outcome variable (the variable of interest) and predictors (a set of independent variables that are used to predict the outcome variable). The whole dataset is split into a training set (used to build up the prediction model) and a test dataset (used to test the prediction accuracy/performance of the model). It is referred to as supervised ML because the outcome variable is available to guide and oversee the process of building the prediction model.

Machine learning is a powerful tool as it provides a way to analyse both linear and nonlinear relationships within the data (Varian 2014). Increasingly, economists use ML methods in combination with other data analytical approaches. For example, Mullainathan and Spiess (2017) demonstrate the importance of using supervised ML methods in regression analysis while Athey and Imbens (2019) discuss the differences between econometrics and machine learning in terms of goals, methods and settings, and demonstrate the gains from interacting ML and econometrics.

2.1.1 Decision Trees and Random Forests

The meteorological normalisation technique applied in this paper is based on the random forest algorithm. A regression tree is obtained by recursive binary partitioning: at each step the data are split on a single predictor at a threshold, and splitting continues until node purity is reached (i.e., the node cannot be split further) (Breiman et al. 1984). A decision tree model is easy to train and is highly interpretable. However, decision trees can be prone to overfitting, i.e. their predictions can be inaccurate out of sample (Hastie et al. 2009). Hence, predictions obtained from decision trees alone are optimal for a given (training) dataset, but could result in low prediction accuracy for a new dataset (Athey and Imbens 2019).

To overcome the inherent disadvantage with decision trees, Breiman (2001) introduced the random forest algorithm. The performance of an algorithm is improved if it can be used on a larger number of datasets. One solution, when there is just one dataset, is to add randomness to the data by use of the bootstrap and bagging method (Varian 2014). Bootstrapping refers to randomly sampling (with replacement) observations from the original dataset, and bagging refers to the process of obtaining an estimation by averaging results from a large number of bootstrapped samples. The random forest algorithm essentially consists of a large number of individual decision trees (grown from different bootstrap samples), and is obtained by averaging the estimations from the whole forest. Compared with a single decision tree, the random forest approach can greatly increase the performance of the prediction.
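The bootstrap and bagging steps described above can be sketched in a few lines. This is a deliberately simplified illustration and not the implementation used in the paper: each "tree" is a one-split regression stump on a single predictor, and the random selection of predictors at each node that a full random forest adds is omitted. All function names and the toy data are hypothetical.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree (a 'stump') on a single predictor."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        # Sum of squared errors around each side's mean.
        sse = sum((y - left_mean) ** 2 for y in left) + sum(
            (y - right_mean) ** 2 for y in right
        )
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all x values identical: predict the overall mean
        overall = mean(ys)
        return lambda x: overall
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def bagged_stumps(xs, ys, n_trees=50, seed=0):
    """Bagging: fit one stump per bootstrap sample and average their predictions."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: mean(tree(x) for tree in trees)


# Toy data: concentrations are higher when wind speed is low.
wind = [0.5, 1.0, 1.5, 2.0, 4.0, 4.5, 5.0, 5.5]
no2 = [60.0, 58.0, 62.0, 59.0, 30.0, 28.0, 33.0, 31.0]
forest = bagged_stumps(wind, no2)
```

Calling `forest(1.0)` and `forest(5.0)` returns the averaged prediction for a calm and a windy hour respectively; averaging over many bootstrap samples is what smooths out the instability of any single tree.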

The random forest approach is relatively simple and fast to train and performs well even when using high dimensionality data (i.e. a large number of predictors/features/independent variables). Random forests also allow for a more flexible relationship than that allowed by a simple linear model, as they relax the critical assumptions on data that are typically required by conventional regression methods (e.g., sample normality, homoscedasticity, independence, etc.). In addition, interactions and correlations between the predictors are not restricted. More importantly, a random forest approach provides a measure of the importance of different variables and predictor selections (Varian 2014; Ziegler and König 2014).

2.1.2 Weather Normalisation

Grange et al. (2018) were the first to introduce machine learning techniques to weather normalise trends in air pollution data. Their approach was to apply a random forest algorithm to predict concentrations of different pollutants at a specific time using a ‘re-sampled predictor data set’ (i.e. re-sampled time and meteorological variables). Take March 15, 2013 as an example. The time and meteorological variables are randomly selected from any given day in the original predictor dataset. This selection is repeated 1000 times and the different sets of predictors are fed into the random forest model, which in turn predicts the concentrations of the different pollutants on March 15, 2013. The final weather-normalised concentrations on this day are obtained by averaging these 1000 predictions for each pollutant. Note that Grange et al. (2018) not only de-weathered the observed concentration levels, but also removed time trends from the data. The disadvantage is that this approach of de-weathering and removing the time trends can lead to an inability to detect the seasonal variation in the weather normalised concentrations data. This also makes it harder to compare the same time period in different years (which we utilize later in our sensitivity checks).

The solution to the seasonal/time trend problem discussed above is to extend the weather normalizing procedure by de-weathering using only the pollution concentration observations (Vu et al. 2019). The Vu et al. (2019) algorithm includes a new predictor data set that is generated by randomly selecting only the weather variables from the original dataset. For example, for 09:00, 15 March 2013, only the weather variables were randomly selected from the original data set within a 4-week range to construct the new predictors data (i.e. at 09:00 on any date between 1 and 29 March in any year between 2013 and 2018). This process is conducted 1000 times, and the results are fed into the random forest model to give 1000 predicted concentrations for that specific hour of 09:00, 15 March 2013 using 1000 columns of randomly sampled weather predictors. The final weather normalised concentration level at 09:00, 15 March 2013 is calculated by averaging the 1000 predicted concentrations. This means it is still possible to detect seasonal variation within the weather normalized concentrations (Vu et al. 2019).
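The re-sampling logic of the Vu et al. (2019) procedure can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the record layout, the `predict` callable standing in for a trained random forest, and the function name `weather_normalise` are all hypothetical. The 4-week range is interpreted as plus or minus 14 days around the target's calendar date, which reproduces the 1 to 29 March window in the example above.

```python
import random
from datetime import datetime, timedelta
from statistics import mean


def weather_normalise(target, records, predict, n_samples=1000, seed=0):
    """Average model predictions for `target` over re-sampled weather.

    records: list of dicts with keys "time" (datetime) and "weather" (dict).
    predict: callable(time, weather) -> predicted concentration.
    """
    rng = random.Random(seed)
    # Candidate weather: same hour of day, within +/- 14 days of the target's
    # calendar date, in any year of the record. (Targets on 29 February and
    # windows crossing a year boundary would need extra handling.)
    pool = []
    for rec in records:
        t = rec["time"]
        if t.hour != target.hour:
            continue
        anchor = target.replace(year=t.year)
        if abs(t - anchor) <= timedelta(days=14):
            pool.append(rec["weather"])
    # Time variables stay fixed at `target`; only the weather is re-sampled.
    draws = [predict(target, rng.choice(pool)) for _ in range(n_samples)]
    return mean(draws)


# Toy record: one 09:00 observation per day in March of 2013 and 2014.
records = [
    {"time": datetime(year, 3, day, 9), "weather": {"wind_speed": float(day % 5)}}
    for year in (2013, 2014)
    for day in range(1, 32)
]


def toy_predict(time, weather):
    """Hypothetical stand-in for a trained random forest."""
    return 50.0 - 4.0 * weather["wind_speed"]


normalised = weather_normalise(datetime(2013, 3, 15, 9), records, toy_predict)
```

Because the time variables are held at the target hour while only the weather is redrawn, the averaged prediction retains any seasonal or trend component learned by the model, which is the property the paragraph above emphasises.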

In this paper we apply the weather normalisation procedure of Vu et al. (2019) using the ‘rmweather’ R package developed by Grange et al. (2018). A decision-tree-based random forest model is grown for each of our four air pollutants in each of our thirty cities, with the pollutant concentration as the dependent (output) variable and the time and meteorological variables as predictors (input variables). Each variable is shown in Table 2. For an illustration of the process for building a random forest model and how the weather normalisation process is conducted see Vu et al. (2019). The whole observed dataset was randomly split into a training set (80%) and a test set (20%). The training set was used to train the random forest model and the test set to test model performance.
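A random 80/20 split of the kind described above can be sketched as follows. The helper names are hypothetical, and the root mean squared error is shown only as one common way of measuring test-set performance; the text above does not state which metric was used.

```python
import math
import random


def train_test_split(rows, train_share=0.8, seed=0):
    """Randomly split rows into a training set and a test set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_share * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def rmse(predicted, observed):
    """Root mean squared error between predictions and observations."""
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(observed))


rows = list(range(1000))  # stand-in for hourly observations
train, test = train_test_split(rows)
```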

Table 2 A list of input and output variables used in this study

Following Vu et al. (2019), a forest of 300 trees (\(n\_trees = 300\)) is used and the number of times we sample the whole data and then predict is 300 (\(n\_samples = 300\)). The number of variables that may be considered for splitting at each node is three (\(mtry = 3\)) and the minimum size of terminal nodes for the model is three (\(min\_node\_size = 3\)). For the weather normalisation procedure, the meteorological variables are randomly selected (within a 4-week range) from the observed meteorological dataset (between January 2013 and February 2020). The selection is repeated 300 times and each draw is fed into the random forest model to predict the concentration levels. The final weather normalised concentrations are found by averaging the 300 predicted values for each hour.Footnote 5

2.2 The Augmented Synthetic Control Method

The synthetic control method (SCM) was first developed by Abadie and Gardeazabal (2003) and has since been used to investigate a number of different questions particularly in labour, development and health economics (see e.g. Cavallo et al. 2013; Kleven et al. 2013; Kreif et al. 2016; Dustmann et al. 2017; Mohen 2017; Xu 2017; Johnston and Mas 2018). Athey and Imbens (2017) argue that the synthetic control method is “arguably the most important innovation in the policy evaluation literature in the last 15 years”.

The design of the SCM is similar to that of the traditional difference-in-difference setting where the goal is to find an appropriate control unit that is comparable to the treatment unit (the city or country that is exposed to an intervention). In this paper, as we are interested in testing the effect of the Wuhan lockdown on local air pollution levels, the ideal solution would be to find a city in China that did not experience a lockdown but is very similar to Wuhan across a range of different characteristics (e.g., the level of economic development, industrial structure, population, current pollution levels, etc.). However, in reality no one city is likely to match Wuhan that closely. By taking a SCM approach we employ a data-driven procedure that uses a weighted average of a group of control cities to simulate or construct an artificial or ‘synthetic’ Wuhan. The goal of the synthetic Wuhan is to reproduce the trajectory of the air pollution levels in real Wuhan before the lockdown. Then, after the lockdown, the difference in the trajectories between the synthetic and real Wuhan can be summarised as the causal impact of the lockdown. In a sense, the synthetic Wuhan is the counterfactual air pollution evolution that Wuhan would have experienced had it not been locked down (Abadie et al. 2015).
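To make the idea of a weighted average of control cities concrete, the sketch below finds non-negative weights that sum to one and reproduce a treated unit's pre-period path, then reads off the post-period gap. It is a toy illustration with hypothetical numbers: the optimiser is a simple Frank-Wolfe routine with exact line search chosen for brevity, and it is not the estimation routine used in the paper.

```python
def synthetic_weights(treated_pre, controls_pre, n_iter=5000):
    """Frank-Wolfe with exact line search over the weight simplex."""
    n = len(controls_pre)
    periods = range(len(treated_pre))
    weights = [1.0 / n] * n
    for _ in range(n_iter):
        synth = [sum(w * c[t] for w, c in zip(weights, controls_pre)) for t in periods]
        resid = [s - y for s, y in zip(synth, treated_pre)]
        # Gradient of the squared pre-period error with respect to each weight.
        grad = [2.0 * sum(resid[t] * c[t] for t in periods) for c in controls_pre]
        best = min(range(n), key=grad.__getitem__)
        # Move towards the control unit that most reduces the error.
        direction = [controls_pre[best][t] - synth[t] for t in periods]
        denom = sum(d * d for d in direction)
        if denom == 0.0:
            break
        step = -sum(r * d for r, d in zip(resid, direction)) / denom
        step = min(1.0, max(0.0, step))  # stay inside the simplex
        weights = [(1.0 - step) * w for w in weights]
        weights[best] += step
    return weights


# Hypothetical daily concentrations: three control cities, three pre-period days.
controls_pre = [[30.0, 32.0, 31.0], [40.0, 42.0, 41.0], [28.0, 35.0, 38.0]]
treated_pre = [35.0, 37.0, 36.0]
# Two post-period days.
controls_post = [[31.0, 30.0], [41.0, 40.0], [36.0, 37.0]]
treated_post = [26.0, 24.0]

weights = synthetic_weights(treated_pre, controls_pre)
synth_post = [sum(w * c[t] for w, c in zip(weights, controls_post)) for t in range(2)]
gap = [y - s for y, s in zip(treated_post, synth_post)]  # estimated effect per day
```

In this toy example the treated path lies halfway between the first two controls, so the weights concentrate on those two units, and the negative `gap` values play the role of the post-lockdown difference between real and synthetic Wuhan.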

There are a number of advantages to taking a SCM approach. For example, no extrapolation is needed and the synthetic weights are calculated and chosen without using the post-intervention data, which rules out the risk of specification cherry picking or p-hacking. Moreover, the contribution of each control unit to the overall synthetic unit is explicitly presented so the transparency of the counterfactual allows one to validate the weights using expert knowledge (Abadie 2019). However, Abadie et al. (2015) caution that the SCM may not provide meaningful estimations if the outcome trajectory of the synthetic unit does not closely match the outcome trajectory of the treatment unit before the intervention.

One solution to concerns about outcome trajectories is the augmented synthetic control method (ASCM) proposed by Ben-Michael et al. (2019). The ASCM extends the SCM to those cases where a good pre-intervention match between treatment and synthetic unit is not achievable. ASCM uses an outcome model to estimate the bias due to the poor pre-intervention match and then corrects for the bias in the original SCM estimate. The Ben-Michael et al. (2019) approach is to use a ridge-regularized linear regression model that relaxes the non-negative weights restriction of the original SCM and allows for negative weights within the Ridge ASCM.

In this paper we follow the conventional panel data setting used by Ben-Michael et al. (2019) given by:

$$\begin{aligned} Y_{it} = {\left\{ \begin{array}{ll} Y_{it}(0)\quad {if}\quad W_{i} = 0\quad {or}\quad t\le T_{0} \\ Y_{it}(1)\quad {if}\quad W_{i} = 1\quad {and}\quad t > T_{0} \end{array}\right. } \end{aligned}$$
(1)

where \(Y_{it}\) is the outcome variable of interest, the weather normalised air pollutant concentration level for each of our four pollutants, for city i on date t (where i = 1,...,N and t = 1,..., T). \(W_{i}\) is an indicator that city i received the order to lock down at time \(T_{0}\) \(\le T\), with \(W_{i}=0\) denoting that there never was a lockdown intervention. \(T_{0}\) refers to the date of lockdown. \(Y_{it}\)(0) and \(Y_{it}\)(1) refer to the outcome variable of city i on date t without and with the lockdown intervention, respectively (only Wuhan is treated in our case).

The estimated treatment effect of interest, the effect of the Wuhan lockdown on local air pollution levels, is given by: \(Y_{1}(1) - Y_{1}(0) = Y_{1} - Y_{1}(0)\). The SCM imputes \(Y_{1}\)(0) as a weighted average of the outcome variable within the control group, \(Y_{0}^{'}\gamma\). Ben-Michael et al. (2019) explain that the weights are chosen as the solution to a constrained optimization problem. In the special case where the working outcome model is a ridge-regularized linear model, the bias-corrected estimator for \(Y_{1}\)(0) can be written as:

$$\begin{aligned} {\hat{Y}}_{1}^{aug}(0) = \sum _{W_i=0}{\hat{\gamma }}_{i}^{scm} Y_i + \left(X_1 - \sum _{W_i=0}{\hat{\gamma }}_{i}^{scm} X_i\right){\hat{\eta }}^r \end{aligned}$$
(2)

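In Eq. (2), following Ben-Michael et al. (2019), \(X_i\) is the vector of pre-intervention outcomes for city i, \({\hat{\gamma }}^{scm}\) denotes the SCM weights and \({\hat{\eta }}^r\) denotes the coefficients of a ridge regression of the control cities' post-intervention outcomes on their pre-intervention outcomes. The ridge ASCM can enhance the pre-intervention fit between the synthetic and treatment units compared to the SCM alone by allowing for negative weights. It can also directly penalize the potential extrapolation. Within the ridge ASCM, the ridge penalty hyper-parameter \(\lambda\) plays a significant role in determining the trade-off between a better pre-intervention match and a larger approximation error.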
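The bias correction in Eq. (2) can be traced through a small numerical sketch. To keep it dependency-free, the pre-intervention information \(X_i\) is reduced to a single number per city (for example a pre-period mean), which is a simplifying assumption; the function names, the penalty argument `lam` and all numbers are hypothetical.

```python
def ridge_slope(x, y, lam):
    """Ridge coefficient from regressing centred y on centred scalar x."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    den = sum((xi - x_bar) ** 2 for xi in x) + lam
    return num / den


def augmented_counterfactual(x_treated, x_controls, y_controls, weights, lam):
    """SCM estimate plus the ridge correction for remaining pre-period imbalance."""
    scm_y = sum(w * y for w, y in zip(weights, y_controls))
    scm_x = sum(w * x for w, x in zip(weights, x_controls))
    eta = ridge_slope(x_controls, y_controls, lam)
    return scm_y + (x_treated - scm_x) * eta


# Toy case: the treated unit's pre-period level (40) lies above every control,
# so non-negative SCM weights can do no better than the closest control.
x_controls = [30.0, 34.0, 38.0]   # pre-period levels of the controls
y_controls = [28.0, 32.0, 36.0]   # post-period levels of the controls
scm_weights = [0.0, 0.0, 1.0]

no_penalty = augmented_counterfactual(40.0, x_controls, y_controls, scm_weights, lam=0.0)
some_penalty = augmented_counterfactual(40.0, x_controls, y_controls, scm_weights, lam=32.0)
```

Here the plain SCM counterfactual is 36; the correction extrapolates it to 38 with no penalty and to 37 with `lam=32.0`, and a very large penalty would shrink the estimate back towards the SCM value. This mirrors the trade-off between pre-intervention fit and extrapolation described above.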

Our target city, Wuhan, was given the order to lock down on January 23rd, 2020. The other 29 cities in our sample did not lock down on this date.Footnote 6 However, although they did not lock down immediately, the majority of the cities in the control group entered a lockdown period between the 3rd and 5th of February 2020. Therefore, in the analysis we examine data up to the 3rd February. This means our analysis is limited to a twelve-day post-lockdown period.Footnote 7 We use a 1-month (thirty days to be precise) pre-period to construct our synthetic Wuhan. Ordinarily, we would set the Wuhan lockdown date as January 23rd 2020 to match the official government announcement that Wuhan would be locked down at 10:00 am on that day. However, following Abadie (2019), if there is an anticipation effect, the researcher should backdate the intervention date so that the full extent of the policy intervention can be estimated. We therefore test a number of different starting dates and reassuringly our results are not sensitive to the choice of date.

Nevertheless, we set January 21st 2020 as the intervention starting date, since human activity that might affect local air pollution levels may already have been adjusted before the official announcement. More importantly, it is reasonable to believe that some lockdown measures and regulations were being adopted by local government officials prior to the official announcement as it is likely that local officials would have known some time in advance despite things moving so fast during this difficult period. Our choice was also influenced by the trend in NO\(_{2}\) concentrations that showed a clear reaction on that date.
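The lengths of the post-intervention windows implied by these dates can be checked with a short calculation. Counting both end dates inclusively is an assumption made here; it reproduces the twelve-day period stated above for the official lockdown date. The variable names are hypothetical.

```python
from datetime import date

official_lockdown = date(2020, 1, 23)  # official announcement date
chosen_start = date(2020, 1, 21)       # backdated intervention date
last_day = date(2020, 2, 3)            # final day of data used


def days_inclusive(start, end):
    """Number of calendar days from start to end, counting both end points."""
    return (end - start).days + 1


post_official = days_inclusive(official_lockdown, last_day)
post_backdated = days_inclusive(chosen_start, last_day)
```

Under this counting convention `post_official` is 12 and `post_backdated` is 14, so backdating the intervention by two days adds two days to the post-intervention window.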

Finally, before we show the results it is worth putting the Wuhan lockdown in context for those less familiar with the economy of Wuhan and how it relates to our group of control cities. Table 7 in the “Appendix” includes summary statistics for Wuhan and the averages for the other 29 cities in the control group while Fig. 13 presents a map of China showing Wuhan and the control cities. The other cities are fairly evenly distributed. Table 7 indicates that Wuhan is only slightly larger than the control cities in terms of population but has a higher population growth rate. However, it is geographically smaller than the average control city (less than half the size) so has a higher population density. It is richer than the average of the control cities and is ranked around fifth in terms of per capita gross regional product. Wuhan is a city of 11 million people and a major industrial hub. The dominant industries include automobiles, manufacturing of electronic and optical communication equipment, pharmaceutical and chemical manufacturing, and iron and steel manufacturing. The automobile sector is particularly important and includes, for example, the $9.4 billion Chinese automotive company Dongfeng Motor Corp, which has joint ventures with Nissan and Honda.

In terms of air pollution Wuhan is not particularly out of line with other cities of comparable size. Figure 1 plots the annual average observed concentrations for the four pollutants that we use in the paper across the 30 cities in our data sample for the period 2013 to 2019. Wuhan is denoted by the red line. As can be seen, Wuhan is towards the lower end in terms of SO\(_{2}\), but has relatively higher levels of NO\(_{2}\). In terms of CO and PM10, Wuhan is around the average. This figure shows that Wuhan is fairly typical.

Fig. 1 The annual average observed concentrations of SO\(_{2}\), NO\(_{2}\), CO and PM10 in Wuhan and 29 control cities between 2013 and 2019. Note: Wuhan is denoted by the red line

In terms of the lockdown itself, all transport in and out of Wuhan was shut down, including the closure of public transit, trains, airports, and major highways. In addition, in a now familiar story across the world, all shops were closed except those selling essentials, all private vehicles were banned (except for those with a special permit), all public transport was banned (except for a small number of taxis), public gatherings were prohibited, and there was a policy of enclosed community management. However, key producers of steel, chemicals and semiconductors remained in operation as well as electric utilities.

In addition, it is possible to get some idea of the reduced movement of people within Wuhan. Using Baidu Migration data (provided by Baidu, which is the dominant search engine in China), based on the Location Based Service platform of Baidu Maps, we can observe real-time population movements including a “daily out-flow migration index of a city”, a “daily in-flow migration index of a city” and a “daily within-city migration index of a city” (Fang et al. 2020). For this paper we looked at the “within city migration index” to give an indication of the intensity of the within city traffic movement before the Wuhan lockdown (22 Jan 2020) and after the lockdown. The results show that movement levels fell to very low levels in Wuhan compared to the other 29 cities. This indicates how effective the lockdown was in terms of restricting movement and how such restrictions were certainly not in place in the other 29 cities for this period. The reduced movement of people also helps to explain the reduction of NO\(_2\) which is a result of the “traffic lockdown”, i.e. the restriction of traffic mobility or the reduction in traffic-related emissions.

3 Results

3.1 Machine Learning Results

Figure 2 presents a plot of daily pollution concentrations to show the overall trends in the observed data (grey line) and the weather normalised data (red line) for SO\(_{2}\) (\(\upmu\)g/m\(^3\)), NO\(_{2}\) (\(\upmu\)g/m\(^3\)), CO (mg/m\(^3\)) and PM10 (\(\upmu\)g/m\(^3\)) respectively between January 2013 and February 2020 (Footnote 8). It can be seen that both observed and weather normalised concentration levels have generally fallen over time, particularly so for SO\(_{2}\). This reduction has been driven in part by strict government regulations and a desire to reduce local air pollutants. More importantly, Fig. 2 illustrates the impact of our weather normalisation process: there are clear differences between the observed concentrations and the weather normalised concentrations, with the latter being a much smoother data series.
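The smoothing effect of weather normalisation can be illustrated with a minimal sketch of the resample-and-average idea behind this family of methods (Grange et al. 2018; Vu et al. 2019). It is an illustration under stated assumptions, not the code used for this paper: the `predict` callable stands in for a fitted random forest, and the function name `weather_normalise`, the toy model, the wind speeds, the number of draws and the seed are all hypothetical.

```python
import random
from statistics import mean, pstdev


def weather_normalise(predict, time_features, weather_pool, n_samples=500, seed=0):
    """Average model predictions over randomly resampled weather conditions.

    predict       -- fitted model called as predict(time_feature, weather);
                     a random forest in the paper, any callable here (assumption)
    time_features -- one time/trend feature per day
    weather_pool  -- historical weather observations to resample from
    """
    rng = random.Random(seed)
    normalised = []
    for t in time_features:
        # Hold the time feature fixed and swap in weather drawn at random.
        draws = [predict(t, rng.choice(weather_pool)) for _ in range(n_samples)]
        normalised.append(mean(draws))
    return normalised


# Toy stand-in for a fitted model: a slow downward trend plus a wind effect.
def toy_predict(day, wind_speed):
    return 50.0 - 0.01 * day - 3.0 * wind_speed


days = list(range(0, 100, 10))
wind_pool = [0.0, 1.0, 2.0, 3.0, 4.0]  # hypothetical wind speeds (m/s)
# Deterministic "actual" weather that cycles through the pool.
observed_wind = [wind_pool[(3 * i) % 5] for i in range(len(days))]

observed = [toy_predict(d, w) for d, w in zip(days, observed_wind)]
normalised = weather_normalise(toy_predict, days, wind_pool)

# Deviations from the underlying trend under average weather.
trend = [50.0 - 0.01 * d - 3.0 * mean(wind_pool) for d in days]
observed_noise = pstdev([o - t for o, t in zip(observed, trend)])
normalised_noise = pstdev([n - t for n, t in zip(normalised, trend)])
```

In this toy setting `observed_noise` is much larger than `normalised_noise`, which mirrors the contrast between the grey and red lines: averaging over many weather draws removes most of the day-to-day weather signal while leaving the trend in place.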

Fig. 2 Daily averages of observed and weather normalised concentrations of SO\(_{2}\), NO\(_{2}\), CO and PM10 in Wuhan between January 2013 and February 2020

Concentrating on the more recent period, Fig. 3 presents the daily plots of observed and weather normalised trends for SO\(_{2}\), NO\(_{2}\), CO and PM10 in Wuhan between 21st December 2019 and the 3rd of February 2020. Again, it is clear that the trends in the weather normalised pollutants are less volatile and noisy than the observed values, which shows the extent to which weather conditions influence the pollution levels recorded at monitoring stations. Figure 3 also illustrates how difficult and potentially misleading it would be to identify the effect of the lockdown (the dotted vertical line) on pollution concentrations using observed values of pollution only.

Fig. 3 The comparison of daily observed and weather normalised concentrations of SO\(_{2}\), NO\(_{2}\), CO and PM10 in Wuhan between 21st December 2019 and 3rd February 2020

3.2 The Impact of the Wuhan Lockdown on Local Air Pollution Using Ridge ASCM

To present the results we consider each of our four pollutants in turn. The plots in Figs. 4, 5, 6, 7, 8 and 9 all show point estimates plus or minus one standard error. Figure 4 (left) plots the difference in the weather normalised NO\(_{2}\) (NO\(_{2}\)wn) levels between synthetic Wuhan and Wuhan. Figure 4 (right) plots the trend in the weather normalised NO\(_{2}\) level of both synthetic and real Wuhan between December 21st 2019 and 3rd February 2020. The blue vertical line marks the intervention date. For NO\(_{2}\) we chose January 21st 2020 as the intervention start date because we found a significant anticipation effect for NO\(_{2}\)wn, i.e., the NO\(_{2}\)wn in Wuhan began to fall significantly and substantially below that of synthetic Wuhan from January 21st, 2020. As can be seen, the synthetic Wuhan does a good job in simulating the NO\(_{2}\)wn trend in Wuhan before the lockdown. Both trends were around 45–52 \(\upmu\)g/m\(^3\) between December 21st and the 27th (notably above the WHO safe limit of 40 \(\upmu\)g/m\(^3\)) before they began to fall in January 2020 to around 35–40 \(\upmu\)g/m\(^3\). The fall coincides with the Spring break in China, when economic activity usually drops considerably. After the 21st January 2020, a large and significant gap opens up between NO\(_{2}\)wn concentrations in Wuhan and synthetic Wuhan, with a peak difference of around 24 \(\upmu\)g/m\(^3\), equivalent to a reduction of 63% relative to the level of NO\(_{2}\) concentrations (38 \(\upmu\)g/m\(^3\)) immediately prior to the lockdown. As time goes on the gap between the series closes a little but is still more than 15 \(\upmu\)g/m\(^3\) at the end of the twelve-day period. Notably, NO\(_{2}\) has fallen to a level that is now comfortably below the WHO safe limit. In the right-hand figure, the synthetic Wuhan weather normalised NO\(_{2}\) levels were consistently between 33 and 40 \(\upmu\)g/m\(^3\), whereas the level in Wuhan dropped substantially to around 20 \(\upmu\)g/m\(^3\) 3–4 days after lockdown, and remained below 20 \(\upmu\)g/m\(^3\) until the end of the study period. The results show that the lockdown led to a large reduction in the weather normalised NO\(_{2}\) level in Wuhan.

Fig. 4 Ridge ASCM results on weather normalised NO\(_{2}\) concentrations in Wuhan. Note: Left hand figure shows point estimate ± one standard error of the ATT

We now consider SO\(_{2}\). Figure 5 (left) plots the difference in the weather normalised SO\(_{2}\) (SO\(_{2}\)wn) level between synthetic Wuhan and real Wuhan while Fig. 5 (right) plots the evolution of trends in weather normalised SO\(_{2}\) levels for both synthetic and real Wuhan. The vertical line is drawn on the Wuhan lockdown date of January 22, 2020. As shown, pre-lockdown differences in SO\(_{2}\)wn between the synthetic and real Wuhan are negligible, suggesting that the other 29 cities did a good job in simulating the trajectory of pollution concentrations in Wuhan. After the lockdown, the SO\(_{2}\)wn level in Wuhan was around 1.7 \(\upmu\)g/m\(^3\) lower than if Wuhan had not been locked down. However, the reduction disappears 3–4 days after lockdown and Wuhan returns to the same trend that the other 29 cities were following. It is worth noting that even at the 3–4-day mark, which is equivalent to a 1.7 \(\upmu\)g/m\(^3\) reduction in SO\(_{2}\) in Wuhan, the reduction is only marginally significant.

Fig. 5 Ridge ASCM results on weather normalised SO\(_{2}\) concentrations in Wuhan. Note: Left hand figure shows point estimate ± one standard error of the ATT

Moving on to CO, Fig. 6 plots the results for weather normalised CO levels. In this case, synthetic Wuhan is not a good match with the pre-policy real Wuhan. This means we cannot confidently draw conclusions on the impact of the Wuhan lockdown on local CO levels.

Fig. 6 Ridge ASCM results on weather normalised CO concentrations in Wuhan. Note: Left hand figure shows point estimate ± one standard error of the ATT

Finally, Fig. 7 (left) plots the difference in the weather normalised PM10 (PM10wn) level between synthetic Wuhan and real Wuhan. Figure 7 (right) plots the trend of the weather normalised PM10 level of both synthetic and real Wuhan. The vertical line coincides with a Wuhan lockdown date of January 22nd, 2020. The trajectories of synthetic Wuhan and real Wuhan were closely matched prior to the lockdown. After the lockdown the trends begin to diverge, with the difference increasing to 22 \(\upmu\)g/m\(^3\) four to five days after lockdown (a reduction of 35% from the pre-lockdown level of 62 \(\upmu\)g/m\(^3\)). The fall in Wuhan became significant on the third or fourth day. Notice that after seven to eight days the difference in the trends became insignificant. Thus, the lockdown of Wuhan led to a significant but short-lived reduction in PM10 levels, and did so from levels that were well above the WHO safe limit of 20 \(\upmu\)g/m\(^3\) to levels that were still above this limit.

Fig. 7 Ridge ASCM results on weather normalised PM10 concentrations in Wuhan. Note: Left hand figure shows point estimate ± one standard error of the ATT

To summarise, the baseline results demonstrate that, relative to the control, the Wuhan lockdown led to a large and significant reduction in NO\(_{2}\) concentrations, a smaller and more short-term reduction in PM10, but no significant fall in SO\(_{2}\) levels. For CO the pre-policy fit was not considered strong enough for us to draw any firm conclusions.

3.3 Placebo Tests

To validate our baseline results we follow Abadie et al. (2015) and conduct a series of placebo tests. We begin with an in-time placebo test, then estimate an in-place placebo test, and finally rerun our estimations using alternative control groups for NO\(_{2}\) and PM10. The placebo test results give us confidence that our main findings are not due to chance.

3.3.1 In-time Placebo Test

For the in-time placebo test we assume that the Wuhan shutdown happened on the same date but one or two years earlier, in either 2019 or 2018. Figure 8 shows the results of the in-time placebo tests for NO\(_{2}\)wn. Apart from the date of the lockdown we use the exact same settings and run the exact same code for the placebo test. On the left-hand side of Fig. 8 we focus on the data period between 21st December 2018 and 3rd February 2019 and set the fake lockdown to be 21st January 2019. The right-hand side of Fig. 8 presents the results for the period between 21st December 2017 and 3rd February 2018 with a fake lockdown set at 21st January 2018. We did not find any significant reduction in NO\(_{2}\) for either of these two fake lockdown dates. Figure 9 presents the results for PM10wn. Again, for both 2018 and 2019 there was no significant difference between synthetic Wuhan and real Wuhan.
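The construction of the placebo windows amounts to shifting the estimation window and the intervention date back by whole years while leaving everything else unchanged. A minimal sketch, using the NO\(_{2}\) dates stated above; the helper names `shift_back` and `placebo_windows` are hypothetical and are not taken from the code used for this paper.

```python
from datetime import date


def shift_back(d, years):
    """Return the same calendar day `years` earlier."""
    return d.replace(year=d.year - years)


def placebo_windows(start, intervention, end, years_back=(1, 2)):
    """Build (start, fake intervention, end) windows for in-time placebo runs."""
    return [
        tuple(shift_back(d, k) for d in (start, intervention, end))
        for k in years_back
    ]


# Real estimation window and intervention date for NO2, as given in the text.
real_window = (date(2019, 12, 21), date(2020, 1, 21), date(2020, 2, 3))
windows = placebo_windows(*real_window)

for start, fake_lockdown, end in windows:
    print(start.isoformat(), fake_lockdown.isoformat(), end.isoformat())
```

Each window would then be passed to the same estimation routine as the real one, so that any difference in the results can only come from the dates.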

Fig. 8 The in-time placebo test results of NO\(_{2}\)wn using 21st January 2019 (left) and 21st January 2018 (right) as Wuhan lockdown date. Note: Both figures show point estimate ± one standard error of the ATT

Fig. 9 The in-time placebo test results of PM10wn using 22nd January 2019 (left) and 22nd January 2018 (right) as Wuhan lockdown date. Note: Both figures show point estimate ± one standard error of the ATT

3.3.2 In-place Placebo Test

Our second placebo test is an in-place test. We randomly assign the lockdown policy to one of the other 29 control cities. Given there was no lockdown in any other city on that date we would not expect to find any sizable reduction effect. Our approach is to treat each of the other cities in turn as if it were the locked-down city and again use the exact same settings and code to run the ridge ASCM model. Figure 10 plots the 29 resulting differences between each placebo city and its synthetic counterpart, plus our main findings on NO\(_{2}\) for the real Wuhan lockdown (the red line). The real Wuhan stands out from the other 29 lines, none of which showed a similar reduction (over 20 \(\upmu\)g/m\(^3\)).
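The visual comparison in the in-place test can also be summarised numerically. The sketch below ranks the treated city's average post-intervention gap against the placebo gaps and reports the share of units with a reduction at least as large. This rank-based summary is a common convention in the synthetic control literature and is an assumption here, not a statistic reported in this paper; the function name `placebo_rank`, the placebo city labels and all gap values are hypothetical.

```python
from statistics import mean


def placebo_rank(gaps_by_city, treated="Wuhan"):
    """Compare the treated unit's mean post-period gap with placebo gaps.

    gaps_by_city -- dict mapping city name to a list of post-intervention
                    gaps (actual minus synthetic), one value per day
    Returns the treated effect, the placebo cities whose reduction is at
    least as large, and the share of all units that are at least as extreme.
    """
    effects = {city: mean(gaps) for city, gaps in gaps_by_city.items()}
    treated_effect = effects[treated]
    as_extreme = [
        city
        for city, effect in effects.items()
        if city != treated and effect <= treated_effect
    ]
    # The treated unit itself counts as one of the "at least as extreme" units.
    share = (len(as_extreme) + 1) / len(effects)
    return treated_effect, as_extreme, share


# Illustrative gaps only; these are not the estimates from the paper.
toy_gaps = {
    "Wuhan": [-20.0, -24.0, -16.0],
    "Placebo city A": [1.0, -2.0, 0.5],
    "Placebo city B": [-3.0, 2.0, 1.0],
}
effect, rivals, share = placebo_rank(toy_gaps)
```

With 29 placebo cities and no placebo reduction as large as Wuhan's, the share would be 1/30, which is the numerical counterpart of the red line standing apart from the grey ones.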

Fig. 10 The results of in-place placebo test on NO\(_{2}\)wn. Note: We randomly assign the lockdown policy to one of the other 29 control cities and compare with Wuhan (in red)

As shown in Fig. 11, the results for PM10wn are a little different from the NO\(_{2}\)wn results in that we found effects of a similar size for four of our synthetic Wuhan lines. However, the results for these four cities are not significant. If we drop these four lines (representing Shijiazhuang, Jinan, Hangzhou and Huhehaote) we have the right-hand figure, where the red line stands out in the early period of the lockdown. The results are consistent with our baseline findings for Wuhan weather normalised PM10 levels, where we only found a significant reduction two to seven days after lockdown, which is where the red line in the right-hand figure shows the largest reduction compared to the remaining grey lines.

Fig. 11 The results of in-place placebo test on PM10wn. Note: The left figure plots the results using all 30 cities; the right figure plots the results after dropping Shijiazhuang, Jinan, Hangzhou and Huhehaote

3.3.3 Alternative Control Groups

Our final sensitivity check is to use a range of different control groups to run the ridge ASCM model to check whether the results are sensitive to the initial choice of our 29 large cities. In addition to the full 29 city control group, we also re-estimate the results using four alternative control groups that we call synthetic control groups 1, 2, 3 and 4 (Syn_CG1, Syn_CG2, Syn_CG3 and Syn_CG4). The detailed list of each control group is provided in “Appendix” Table 6. The alternative control groups use Province capitals only; Northern cities only (that may be more similar to Wuhan); a smaller group of cities that did not experience a lock down before March 2020; and a final group that did lock down after 3rd February.

Figure 12 shows the results from creating a synthetic Wuhan from each of the four alternative control groups, alongside the full 29-city control group, for weather normalised NO\(_{2}\) and PM10, respectively. All five control groups closely match the pre-lockdown trends for NO\(_{2}\)wn and PM10wn. The five different controls also show similar post-intervention trajectories, suggesting that our findings on the causal impact of the Wuhan lockdown on local NO\(_{2}\) and PM10 levels are not sensitive to the choice of control group.

Fig. 12 The results of alternative control group tests on NO\(_{2}\)wn (left) and PM10wn (right). Note: “Appendix” Table 6 defines each control group

4 The Health Implications of China’s Falling Pollution

Having established the impact of Wuhan’s lockdown on pollution concentrations we undertake a simple back of the envelope exercise to calculate the potential lives saved as a result of the improved air quality. For simplicity we focus only on the reduction in NO\(_{2}\) concentrations.

Our results in Fig. 4 indicate that the reduction in concentrations of NO\(_{2}\) varied between 15 and 24 \(\upmu\)g/m\(^3\) during the period between the start of the lockdown and the end of our estimation period in early February. Since we do not have a usable control group beyond early February, we are unable to estimate how long the reductions in pollution continued. For the purposes of this exercise, we therefore model lives saved if concentrations fell by 20 \(\upmu\)g/m\(^3\), and by a more conservative 10 \(\upmu\)g/m\(^3\), over the full 2.5 months of the lockdown.

To begin, we draw upon a number of studies that have estimated the mortality effects of NO\(_{2}\) concentrations. Next, from the National Bureau of Statistics we calculate the monthly mortality rate in Wuhan (0.045917%) which we apply to the population of Wuhan which was 11.081m in 2019 (Footnote 9). We then calculate how much lower mortality would have been over 2.5 months as a result of our estimated reduction in pollution.
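The arithmetic of this back of the envelope exercise can be sketched as follows. The mortality rate, population and lockdown length are the figures quoted above. The effect size is a placeholder: it assumes that the literature estimates are expressed as a proportional change in mortality per 10 \(\upmu\)g/m\(^3\) of NO\(_{2}\), and the value of 1% is purely illustrative and is not one of the estimates in Table 3. The function name `lives_saved` is likewise hypothetical.

```python
def lives_saved(monthly_mortality_rate, population, months,
                effect_per_10ug, reduction_ug):
    """Deaths avoided under a linear concentration-response assumption.

    monthly_mortality_rate -- baseline deaths per person per month
    population             -- exposed population
    months                 -- duration of the pollution reduction
    effect_per_10ug        -- proportional change in mortality per 10 ug/m3
                              of NO2 (assumed form of the literature estimate)
    reduction_ug           -- reduction in NO2 concentration in ug/m3
    """
    baseline_deaths = monthly_mortality_rate * population * months
    relative_change = effect_per_10ug * reduction_ug / 10.0
    return baseline_deaths * relative_change


# Figures quoted in the text.
WUHAN_MONTHLY_MORTALITY = 0.045917 / 100   # 0.045917% per month
WUHAN_POPULATION = 11_081_000              # 2019 population
LOCKDOWN_MONTHS = 2.5

# Placeholder effect size, NOT taken from Table 3.
HYPOTHETICAL_EFFECT = 0.01                 # 1% per 10 ug/m3

baseline = WUHAN_MONTHLY_MORTALITY * WUHAN_POPULATION * LOCKDOWN_MONTHS
saved_20 = lives_saved(WUHAN_MONTHLY_MORTALITY, WUHAN_POPULATION,
                       LOCKDOWN_MONTHS, HYPOTHETICAL_EFFECT, 20)
saved_10 = lives_saved(WUHAN_MONTHLY_MORTALITY, WUHAN_POPULATION,
                       LOCKDOWN_MONTHS, HYPOTHETICAL_EFFECT, 10)

print(f"Baseline deaths over the lockdown: {baseline:.0f}")
print(f"Lives saved, 20 ug/m3 reduction:   {saved_20:.0f}")
print(f"Lives saved, 10 ug/m3 reduction:   {saved_10:.0f}")
```

Because the calculation is linear in both the reduction and the duration, halving either input halves the estimate of lives saved.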

Table 3 summarises the various studies that have estimated the mortality effects of NO\(_{2}\) concentrations and presents the range of estimated effects. Table 4 then utilises each of these effects, in the manner described above, to produce our estimates of lives saved. As can be seen, the estimated lives saved in Wuhan city as a result of the full 2.5 month lockdown range from 183 to 496 for a 20 \(\upmu\)g/m\(^3\) reduction in NO\(_{2}\) and between 92 and 248 for a 10 \(\upmu\)g/m\(^3\) reduction.

When Wuhan went into lockdown on 23rd January it did so along with 16 other cities within Hubei province, affecting a total population of 59.02 million. While our analysis of the reduction in NO\(_{2}\) concentrations is confined to Wuhan city, it does not seem unreasonable to assume all cities in Hubei province experienced a similar reduction in pollution given they were subject to an equally stringent lockdown for the same length of time. Table 4 therefore also reports lives saved as a result of reduced NO\(_{2}\) concentrations across the whole of Hubei province. These range from 1228 to 3368 for a 20 \(\upmu\)g/m\(^3\) reduction in NO\(_{2}\) and between 614 and 1684 for a 10 \(\upmu\)g/m\(^3\) reduction.

For completeness, we extend our analysis to all regions subject to lockdown within China. By early February 2020, a total population of over 233 million were subject to formal lockdown (including Hubei’s 59 million) (Footnote 10). While it is difficult to be clear about the strength and duration of all the lockdowns outside of Hubei, we here assume they resulted in the same reduction of 20 \(\upmu\)g/m\(^3\) in NO\(_{2}\) concentrations and did so over a slightly shortened lockdown period of 2 months. Table 4 provides the results and indicates that lives saved range from 3940 to 10,822 for a 20 \(\upmu\)g/m\(^3\) reduction in NO\(_{2}\) and between 1970 and 5411 for a 10 \(\upmu\)g/m\(^3\) reduction.

Table 3 Previous literature on the NO\(_{2}\) mortality association
Table 4 Previous literature on the NO\(_{2}\) mortality association

It is important to stress that these are little more than back of the envelope calculations and rely on a number of assumptions, in addition to those already pointed out regarding the stringency and duration of the lockdowns. First, we are modelling lives saved as a result of a reduction in concentrations of a single pollutant, NO\(_{2}\). A similar exercise could be undertaken for our estimated reduction in PM10 concentrations. However, there remains some uncertainty as to whether health impacts of different pollutants, particularly NO\(_{2}\) and particulate matter, are truly independent of each other given how highly correlated they tend to be. Nevertheless, some evidence of independence has been found by Faustini et al. (2014) suggesting that our focus on NO\(_{2}\) may provide an underestimate of the true mortality benefits of the reduced concentrations of these two pollutants. Second, we are assuming that the mortality response is proportionate to the reduction in pollution i.e. a 20 \(\upmu\)g/m\(^3\) reduction in concentrations has double the mortality effect of a 10 \(\upmu\)g/m\(^3\) reduction. Similarly, we are assuming that a 2-month reduction in pollution has double the mortality benefits of a 1-month reduction. Third, in predicting the possible lives saved due to a lockdown-induced reduction in pollution we are ignoring any other potential mortality effects caused by the lockdown such as increased exposure to indoor pollution, mental health effects, reduced road traffic accidents and so on. Finally, there is a possibility that those most susceptible to pollution exposure, i.e. those with underlying respiratory or other health conditions, are also those most susceptible to Covid-19. As such, if these individuals are dying from Covid-19 then we may be over-estimating the lives saved due to cleaner air.

Nevertheless, our results suggest that the lockdowns in China resulted in significant reductions in mortality as a result of improvements in air quality alone.

5 Discussion and Conclusions

Faced with a pandemic that is unprecedented in modern times, governments around the world have introduced strict lockdowns to try to control the spread of Covid-19. Inevitably, such a stringent, far-reaching policy will have wide-ranging impacts in addition to that of disease control. Using the example of Wuhan’s Covid-19 lockdown, this paper has examined one such impact, the perceived reduction in air pollution due to reductions in traffic volumes and economic activity more generally.

We adopted a two-stage approach. First, to isolate the impact of the lockdown on pollution concentrations we removed the confounding effects of weather conditions using a random forests machine learning approach (Grange et al. 2018; Vu et al. 2019). This approach overcomes the difficulties of econometrically controlling for non-independent, non-linear weather conditions. Our analysis reveals the importance of removing weather conditions from pollution patterns. Analysing observed (non-weather normalised) pollution levels, or pollution levels where weather conditions have not been fully controlled for, could provide misleading conclusions as to the impact of the lockdown. Second, we adopt a new Augmented Synthetic Control Method (Ben-Michael et al. 2019) to examine how weather normalised concentrations of four pollutants responded to the lockdown using a control of 29 other Chinese cities that were not in lockdown.

Our results indicate that the impact of the lockdown varied by pollutant, a nuance that newspaper reports of cleaner post-lockdown air have generally failed to acknowledge. We find that concentrations of NO\(_{2}\), a pollutant closely tied to traffic volumes and fossil fuel use, fell by as much as 24 \(\upmu\)g/m\(^3\) following the lockdown (a 63% fall), although this reduction declined to 16 \(\upmu\)g/m\(^3\) by the end of our twelve-day window of analysis. Prior to the lockdown NO\(_{2}\) concentrations were very close to the WHO health limit and so this reduction brought those concentrations to within safe limits. Concentrations of PM10 also fell by over 20 \(\upmu\)g/m\(^3\), although this reduction was short term and was not statistically significant throughout our twelve-day window. Interestingly, concentrations of SO\(_{2}\) and CO did not fall in a statistically significant manner following the lockdown. In the case of SO\(_{2}\) this is likely to reflect the country’s reliance on coal-fired power plants and the fact that temperatures were relatively low in Wuhan through much of this period, resulting in a need for domestic heating. It is less clear why CO, a pollutant largely emitted by transport, did not fall following the lockdown.

Finally, we employ a selection of estimates of the mortality effects associated with NO\(_{2}\) concentrations to calculate the potential lives saved as a result of the cleaner air. We find that reduced NO\(_{2}\) concentrations following lockdown may have prevented as many as 496 deaths in Wuhan city, 3368 deaths in Hubei province and 10,822 deaths in China as a whole. While these potential deaths prevented may outweigh the official Chinese death toll from Covid-19 itself, our findings should not in any way be interpreted as implying that the pandemic has yielded net benefits to China. As we have pointed out, our estimates of deaths prevented are little more than back of the envelope calculations and should be treated with a degree of caution.

A city-level lockdown may provide some clues as to how milder forms of restrictions on human activities, such as congestion charging, pedestrianised zones and urban planning more generally, might impact human health. However, because all of these restrictions were in effect at the same time during the lockdown, estimating their individual impacts would be a challenge. Nevertheless, the large NO\(_{2}\) effect does suggest that policies to reduce emissions from vehicles, such as a push for the electrification of cars and buses, would have considerable health benefits. How one would measure the costs incurred by society following a return to business as usual is also a challenge. One approach is to estimate the health costs incurred at the city level using published hospital statistics data and then to use micro-simulation to model the long term impacts (Public Health England 2020). A second approach is to elicit a value of statistical life (VSL) in an air pollution context. For example, the OECD (2012) developed a new method for calculating country-specific VSL and estimated the cost of deaths from outdoor pollution for OECD countries to be almost $1.5 trillion in 2013.

Finally, despite the inherent difficulties in estimating the cost savings from any new emission reductions, the purpose of our analysis is to show that a policy as stringent as a lockdown has far-reaching implications which extend well beyond the primary purpose of disease control. Indeed, since air pollution, Covid-19, and the health of the population more generally are inextricably linked, policy makers need to be aware of these interactions when formulating policy in the ongoing fight against Covid-19 and future pandemics.