
Understanding deep learning defenses against adversarial examples through visualizations for dynamic risk assessment

  • S.I.: Cybersecurity Applications of Computational Intelligence
  • Published in: Neural Computing and Applications

Abstract

In recent years, deep neural network models have been developed in many different fields, where they have brought numerous advances. However, they have also begun to be used in tasks where risk is critical, and a wrong prediction from such a model can lead to serious accidents or even death. This concern has led researchers to study possible attacks against these models, uncovering a long list of vulnerabilities against which every model should be defended. The adversarial example attack is one of the most widely known, and several defenses have been developed to counter this threat. However, these defenses are as opaque as the deep neural network models they protect: how they work is still not well understood. Visualizing how a defense changes the behavior of the target model is therefore of interest, as it allows a more precise understanding of how the performance of the defended model is modified. In this work, three defense strategies against adversarial example attacks were selected in order to visualize the behavior change each of them induces in the defended model: adversarial training, dimensionality reduction, and prediction similarity. All of them were implemented on a model composed of convolutional and dense neural network layers. For each defense, the behavior of the original model was compared with that of the defended model by representing the target model as a graph in a visualization. This visualization makes it possible to identify the vulnerabilities of the model and shows how each defense attempts to mitigate them.
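As a rough illustration of the first of these defenses, the sketch below shows one epoch of adversarial training, in which the model is trained on both clean and adversarially perturbed batches. It is a minimal sketch only, assuming a PyTorch image classifier ("model"), a "train_loader" yielding image/label batches with pixel values in [0, 1], and FGSM with eps = 0.03 as the illustrative attack; the actual attack, architecture, and hyperparameters used in the paper may differ.

    # Minimal sketch of adversarial training, under the assumptions stated above.
    import torch
    import torch.nn.functional as F

    def fgsm(model, x, y, eps=0.03):
        # Craft FGSM adversarial examples: one signed-gradient step on the
        # cross-entropy loss, clipped back to the valid pixel range.
        x_adv = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        return (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()

    def adversarial_training_epoch(model, train_loader, optimizer, eps=0.03):
        model.train()
        for x, y in train_loader:
            x_adv = fgsm(model, x, y, eps)   # adversarial version of the batch
            optimizer.zero_grad()            # clear gradients left by fgsm()
            # Train on clean and adversarial examples so the model learns to
            # classify both correctly.
            loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y)
            loss.backward()
            optimizer.step()

Running adversarial_training_epoch for several epochs yields the adversarially trained model, whose behavior can then be compared against that of the original model in the graph-based visualization. The other two defenses (dimensionality reduction and prediction similarity) modify the defended model in different ways and are not covered by this sketch.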

Notes

  1. http://2020.cisisconference.eu/.

  2. https://www.kaggle.com/paultimothymooney/breast-histopathology-images.

  3. https://github.com/bethgelab/foolbox.

  4. https://www.kaggle.com/paultimothymooney/breast-histopathology-images.

Acknowledgements

This work is funded under the SPARTA project, which has received funding from the European Union Horizon 2020 research and innovation programme under grant agreement No 830892.

Author information

Corresponding author

Correspondence to Xabier Echeberria-Barrio.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Echeberria-Barrio, X., Gil-Lerchundi, A., Egana-Zubia, J. et al. Understanding deep learning defenses against adversarial examples through visualizations for dynamic risk assessment. Neural Comput & Applic 34, 20477–20490 (2022). https://doi.org/10.1007/s00521-021-06812-y
