
STEEX: Steering Counterfactual Explanations with Semantics

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13672)

Abstract

As deep learning models are increasingly used in safety-critical applications, explainability and trustworthiness become major concerns. For simple images, such as low-resolution face portraits, synthesizing visual counterfactual explanations has recently been proposed as a way to uncover the decision mechanisms of a trained classification model. In this work, we address the problem of producing counterfactual explanations for high-quality images and complex scenes. Leveraging recent semantic-to-image models, we propose a new generative counterfactual explanation framework that produces plausible and sparse modifications which preserve the overall scene structure. Furthermore, we introduce the concept of “region-targeted counterfactual explanations”, and a corresponding framework, where users can guide the generation of counterfactuals by specifying a set of semantic regions of the query image the explanation must be about. Extensive experiments are conducted on challenging datasets including high-quality portraits (CelebAMask-HQ) and driving scenes (BDD100k). Code is available at: https://github.com/valeoai/STEEX.
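
The abstract describes the approach only at a high level. The sketch below illustrates, under stated assumptions, how region-targeted counterfactual generation could be set up with a semantic-to-image generator that conditions on per-region style codes: only the codes of user-selected regions are optimized to flip the classifier's decision, while the semantic layout and all other regions stay fixed. The `generator`, `classifier`, and style-code layout here are hypothetical placeholders, not the released implementation (see the repository linked above for the authors' code).

```python
# Hedged sketch of region-targeted counterfactual generation in the spirit of STEEX.
# All module interfaces are assumptions for illustration, not the authors' actual API.
import torch
import torch.nn.functional as F


def region_targeted_counterfactual(
    generator,          # assumed: maps (semantic_mask, style_codes) -> image
    classifier,         # the trained decision model to explain: image -> logits
    semantic_mask,      # (1, C, H, W) one-hot semantic layout of the query image
    style_codes,        # (1, C, D) per-region style codes encoding the query image
    target_class,       # decision the counterfactual should obtain
    editable_regions,   # indices of the semantic regions the explanation may change
    steps=100, lr=0.01, dist_weight=0.3,
):
    codes = style_codes.detach().clone()

    # Only the style codes of the user-selected regions are optimized; all other
    # regions and the semantic layout stay fixed, preserving scene structure.
    region_mask = torch.zeros_like(codes)
    region_mask[:, editable_regions, :] = 1.0

    delta = torch.zeros_like(codes, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    target = torch.tensor([target_class])

    for _ in range(steps):
        edited_codes = codes + region_mask * delta
        image = generator(semantic_mask, edited_codes)
        logits = classifier(image)
        # Push the decision toward the target class while staying close to the
        # original style codes, which encourages sparse, plausible edits.
        loss = F.cross_entropy(logits, target) + dist_weight * delta.pow(2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()

    with torch.no_grad():
        return generator(semantic_mask, codes + region_mask * delta)
```

Restricting the optimized variables to the style codes of selected regions is one plausible reading of how the sparsity and structure-preservation claimed in the abstract could be enforced; the exact losses and generator architecture are given in the paper itself.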

Author information

Corresponding author

Correspondence to Éloi Zablocki.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 16713 KB)

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Jacob, P., Zablocki, É., Ben-Younes, H., Chen, M., Pérez, P., Cord, M. (2022). STEEX: Steering Counterfactual Explanations with Semantics. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13672. Springer, Cham. https://doi.org/10.1007/978-3-031-19775-8_23

  • DOI: https://doi.org/10.1007/978-3-031-19775-8_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19774-1

  • Online ISBN: 978-3-031-19775-8

  • eBook Packages: Computer Science, Computer Science (R0)
