DOI: 10.1145/3490099.3511150
research-article

AQX: Explaining Air Quality Forecast for Verifying Domain Knowledge using Feature Importance Visualization

Published: 22 March 2022

ABSTRACT

Air pollution forecasting has become critical because of the direct impact of air pollution on human health and its increased production caused by rapid industrialization. Machine learning (ML) solutions are being extensively explored in this domain because they can potentially produce highly accurate results with access to historical data. However, experts in the environmental area are skeptical about adopting ML solutions in real-world applications and policy making due to their black-box nature. In contrast, despite sometimes having low accuracy, existing traditional simulation models (e.g., CMAQ) are widely used and follow well-defined and transparent equations. Therefore, presenting the knowledge learned by the ML model can make it transparent as well as comprehensible. In addition, validating the ML model’s learning against existing domain knowledge might aid in addressing experts’ skepticism, building appropriate trust, and better utilizing ML models. In collaboration with three experts with an average of five years of research experience in the air pollution domain, we identified the contribution of features (meteorological features such as wind) towards the final forecast as the major information to be verified with domain knowledge. In addition, the accuracy of ML models compared with traditional simulation models and raw wind trajectories are essential for domain experts to validate the feature contribution. Based on the identified information, we designed and developed AQX, a visual analytics system to help experts validate and verify the ML model’s learning with their domain knowledge. The system includes multiple coordinated views to present the contributions of input features at different levels of aggregation in both temporal and spatial dimensions. It also provides a performance comparison of ML and traditional models in terms of accuracy and spatial map, along with an animation of raw wind trajectories for the input period. We further demonstrated two case studies and conducted interviews with two domain experts to show the effectiveness and usefulness of AQX.
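
As a rough illustration of what presenting feature contributions "at different levels of aggregation in both temporal and spatial dimensions" could mean computationally, the sketch below averages absolute attribution scores over the spatial grid, over time, or over both. The tensor layout `[time][row][col][feature]`, the feature names, and the function names are all assumptions made for this example; the abstract does not say how AQX computes or stores feature contributions.

```python
from statistics import fmean

# Hypothetical feature names; the abstract only mentions wind as an example.
FEATURES = ["wind_u", "wind_v", "temperature"]


def per_timestep_importance(attr):
    """Mean |attribution| per feature for each time step (aggregated over space)."""
    result = []
    for grid in attr:  # grid layout (assumed): [row][col][feature]
        cells = [cell for row in grid for cell in row]
        result.append(
            [fmean([abs(c[f]) for c in cells]) for f in range(len(FEATURES))]
        )
    return result


def per_cell_importance(attr):
    """Mean |attribution| per feature for each grid cell (aggregated over time)."""
    n_rows, n_cols = len(attr[0]), len(attr[0][0])
    return [
        [
            [fmean([abs(grid[r][c][f]) for grid in attr]) for f in range(len(FEATURES))]
            for c in range(n_cols)
        ]
        for r in range(n_rows)
    ]


def overall_importance(attr):
    """Mean |attribution| per feature over all time steps and cells."""
    per_t = per_timestep_importance(attr)
    return {name: fmean([step[f] for step in per_t]) for f, name in enumerate(FEATURES)}


# Toy attribution tensor (made-up numbers): 2 time steps, a 1x2 grid, 3 features.
toy = [
    [[[0.5, -0.5, 0.0], [1.5, 0.5, 0.0]]],
    [[[-1.0, 0.0, 0.5], [1.0, 1.0, 0.5]]],
]

if __name__ == "__main__":
    print(per_timestep_importance(toy))
    print(per_cell_importance(toy))
    print(overall_importance(toy))
```

Taking absolute values before averaging is one possible design choice: it ranks features by the magnitude of their influence and discards the sign, so a view that needs the direction of a contribution would aggregate the signed values instead.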

        Published in

          IUI '22: Proceedings of the 27th International Conference on Intelligent User Interfaces
          March 2022
          888 pages
          ISBN: 9781450391443
          DOI: 10.1145/3490099

          Copyright © 2022 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Qualifiers

          • research-article
          • Research
          • Refereed limited

          Acceptance Rates

          Overall Acceptance Rate: 746 of 2,811 submissions, 27%
