A novel seasonal index–based machine learning approach for air pollution forecasting

Environmental Monitoring and Assessment

Abstract

Novel machine learning models (MLMs) based on a seasonal indexing approach, which captures the variation in air quality caused by meteorological changes, have been used to provide short-term, real-time forecasts of PM2.5 concentration for one of the most polluted air quality control regions (AQCR) in the capital city of Delhi. Two MLMs, multiple linear regression and random forest, have been developed using time series data of 1-h and 24-h average PM2.5 concentrations. Various model performance evaluation indices indicate satisfactory model performance. R2 values decreased from 0.95 to 0.72 for the hourly models (1st to 5th forecast hour) and from 0.76 to 0.68 for the daily models (1st to 5th forecast day). The lagged values of PM2.5 concentration (persistence) and the hourly and daily indices are the most influential variables for forecasts at immediate time steps, whereas seasonal indices become more important as the forecasting horizon lengthens. The developed models can be used to make short-term, real-time air quality forecasts and to issue warnings when pollution levels exceed acceptable limits.
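The abstract does not spell out how the indices are computed, so the following is only a minimal sketch of one common construction: a ratio-to-mean index (the mean for each hour of day divided by the overall mean), combined with lagged concentrations as predictors. The function names, the number of lags, and the synthetic series are assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict
from statistics import mean


def seasonal_index(values, keys):
    """Ratio-to-mean index: group mean divided by the overall mean."""
    overall = mean(values)
    groups = defaultdict(list)
    for key, value in zip(keys, values):
        groups[key].append(value)
    return {key: mean(group) / overall for key, group in groups.items()}


def build_features(values, hours, index, n_lags=3, horizon=1):
    """Return (rows, targets): lagged values plus the index of the target hour."""
    rows, targets = [], []
    for t in range(n_lags - 1, len(values) - horizon):
        lags = [values[t - k] for k in range(n_lags)]  # most recent value first
        rows.append(lags + [index[hours[t + horizon]]])
        targets.append(values[t + horizon])
    return rows, targets


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Synthetic data (not from the paper): three days of hourly PM2.5
# with an assumed morning peak between 06:00 and 09:00.
hours = [i % 24 for i in range(72)]
pm25 = [120.0 if 6 <= h <= 9 else 80.0 for h in hours]

hourly_index = seasonal_index(pm25, hours)
rows, targets = build_features(pm25, hours, hourly_index)
```

Each row in `rows` holds the persistence terms followed by the hourly index of the hour being forecast, and could be passed to any regressor, such as a multiple linear regression or a random forest. Increasing `horizon` gives the feature set for forecasts further ahead, and `r_squared` can then be used to compare predictions against `targets`.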

Abbreviations

ANN: Artificial neural network
AQCR: Air quality control regions
AQMP: Air quality management plan
ARIMA: Auto-regressive integrated moving average
CNG: Compressed natural gas
CPCB: Central Pollution Control Board
CV: Cross-validation
FLT: Fuzzy logic theory
GPU: Graphics processing unit
GRAP: Graded Response Action Plan
ISBT: Inter State Bus Terminal
LSTM: Long short-term memory
MAE: Mean absolute error
MLM: Machine learning models
NCR: National Capital Region
NOAA: National Oceanic and Atmospheric Administration
OOB: Out of bag
R2: Coefficient of determination
RMSE: Root mean squared error
RSPM: Respirable suspended particulate matter
WHO: World Health Organization

Author information

Corresponding author

Correspondence to Sumit Sharma.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Khan, A., Sharma, S., Chowdhury, K.R. et al. A novel seasonal index–based machine learning approach for air pollution forecasting. Environ Monit Assess 194, 429 (2022). https://doi.org/10.1007/s10661-022-10092-x

  • DOI: https://doi.org/10.1007/s10661-022-10092-x
