research-article

AQX: Explaining Air Quality Forecast for Verifying Domain Knowledge using Feature Importance Visualization

Authors:
Reshika Palaniyappan Velumani
Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong

Meng Xia
School of Computing, Korea Advanced Institute of Science and Technology, Republic of Korea and Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong

Jun Han
Computer Science and Engineering, University of Notre Dame, United States

Chaoli Wang
Computer Science and Engineering, University of Notre Dame, United States

Alexis K Lau
Division of Environment and Sustainability, The Hong Kong University of Science and Technology, Hong Kong

Huamin Qu
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, China
IUI '22: Proceedings of the 27th International Conference on Intelligent User Interfaces, March 2022, Pages 720–733
https://doi.org/10.1145/3490099.3511150

Published: 22 March 2022

ABSTRACT

Air pollution forecasting has become critical because of the direct impact of air pollution on human health and the rise in emissions caused by rapid industrialization. Machine learning (ML) solutions are being extensively explored in this domain because, given access to historical data, they can potentially produce highly accurate results. However, experts in the environmental domain are skeptical about adopting ML solutions in real-world applications and policy making because of their black-box nature. In contrast, despite sometimes having low accuracy, existing traditional simulation models (e.g., CMAQ) are widely used and follow well-defined and transparent equations. Therefore, presenting the knowledge learned by the ML model can make it transparent as well as comprehensible. In addition, validating the ML model’s learning against existing domain knowledge might help address the experts' skepticism, build appropriate trust, and make better use of ML models. In collaboration with three experts with an average of five years of research experience in the air pollution domain, we identified the contribution of features (meteorological features such as wind) towards the final forecast as the major information to be verified with domain knowledge. In addition, the accuracy of ML models compared with traditional simulation models and the raw wind trajectories are essential for domain experts to validate the feature contribution. Based on the identified information, we designed and developed AQX, a visual analytics system that helps experts validate and verify the ML model’s learning with their domain knowledge. The system includes multiple coordinated views that present the contributions of input features at different levels of aggregation in both the temporal and spatial dimensions. It also provides a performance comparison of ML and traditional models in terms of accuracy and spatial maps, along with an animation of raw wind trajectories for the input period. We further present two case studies and expert interviews with two domain experts to show the effectiveness and usefulness of AQX.
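
To make "contributions of input features at different levels of aggregation" concrete, the following is a minimal sketch and not the AQX implementation. It assumes that some attribution method has already produced a signed contribution score for every time step, grid cell, and input feature; the array layout `(time, rows, cols, features)`, the function name `aggregate_contributions`, and the feature names are all hypothetical.

```python
import numpy as np


def aggregate_contributions(attr, feature_names):
    """Aggregate per-cell feature contributions at three levels.

    attr: array of shape (time, rows, cols, features) holding signed
    contribution scores (assumed layout, produced by some attribution method).
    """
    magnitude = np.abs(attr)  # compare features by strength, ignoring sign

    overall = magnitude.mean(axis=(0, 1, 2))  # one score per feature
    temporal = magnitude.mean(axis=(1, 2))    # (time, features)
    spatial = magnitude.mean(axis=0)          # (rows, cols, features)

    # Features ordered from strongest to weakest overall contribution.
    order = np.argsort(overall)[::-1]
    ranking = [feature_names[i] for i in order]

    return {
        "overall": dict(zip(feature_names, overall)),
        "temporal": temporal,
        "spatial": spatial,
        "ranking": ranking,
    }


# Toy data only: random scores, with the first feature scaled up so that
# it dominates. These numbers are synthetic, not results from the paper.
rng = np.random.default_rng(0)
attr = rng.normal(size=(6, 4, 5, 3))
attr[..., 0] *= 5.0
features = ["wind", "temperature", "humidity"]

result = aggregate_contributions(attr, features)
print(result["ranking"])
```

The temporal array could feed a per-time-step chart and the spatial array a per-feature map, which is one plausible way to back coordinated views; the choice of mean absolute value as the aggregate is an assumption made for this sketch.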

Published in

IUI '22: Proceedings of the 27th International Conference on Intelligent User Interfaces
March 2022
888 pages
ISBN: 9781450391443
DOI: 10.1145/3490099

Copyright © 2022 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Air pollution
Explainable AI
Machine Learning
Spatio-Temporal Data
Validation
Visual Analytics
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 746 of 2,811 submissions, 27%