DOI: 10.1145/3559613.3563198
Research Article · Open Access

SplitGuard: Detecting and Mitigating Training-Hijacking Attacks in Split Learning

Published: 07 November 2022

ABSTRACT

Distributed deep learning frameworks such as split learning reduce the computational cost of training deep neural networks while enabling privacy-aware use of the collective data of a group of data holders. Split learning, in particular, achieves this by dividing a neural network between a client and a server so that the client computes the initial set of layers and the server computes the rest. However, this method introduces a unique attack vector for a malicious server attempting to steal the client's private data: the server can direct the client model towards learning any task of its choice, e.g., towards outputting easily invertible values. With a concrete example already proposed (Pasquini et al., CCS '21), such training-hijacking attacks pose a significant risk to the data privacy of split learning clients. In this paper, we propose SplitGuard, a method by which a split learning client can detect whether it is being targeted by a training-hijacking attack. We experimentally evaluate our method's effectiveness, compare it with potential alternatives, and discuss in detail various practical aspects of its use. We conclude that SplitGuard can effectively detect training-hijacking attacks while minimizing the amount of information recovered by the adversaries.
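
To make the split-learning workflow concrete, the sketch below shows one training step in PyTorch with a network divided at a single cut layer. This is a minimal illustration, not the authors' implementation: the layer sizes, cut point, optimizer, and loss function are assumed for demonstration. The final comment marks the gradient message a malicious server could replace with gradients computed for its own objective, which is the attack surface SplitGuard is designed to detect.

import torch
import torch.nn as nn

# Client-side model: the initial layers, kept by the data holder.
client_model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 256),
    nn.ReLU(),
)

# Server-side model: the remaining layers, including the output head.
server_model = nn.Sequential(
    nn.Linear(256, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)

client_opt = torch.optim.Adam(client_model.parameters(), lr=1e-3)
server_opt = torch.optim.Adam(server_model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def training_step(x, y):
    """One split-learning round: the client sends only the cut-layer
    activations to the server; the server returns the gradient of its loss
    with respect to those activations."""
    client_opt.zero_grad()
    server_opt.zero_grad()

    # Client forward pass up to the cut layer.
    smashed = client_model(x)

    # Server receives the (detached) activations and finishes the forward pass.
    server_in = smashed.detach().requires_grad_(True)
    loss = loss_fn(server_model(server_in), y)

    # Server backward pass; server_in.grad is sent back to the client.
    loss.backward()
    server_opt.step()

    # Client resumes backpropagation from the returned gradient. A malicious
    # server can hijack training simply by returning gradients computed for a
    # different objective here; SplitGuard aims to detect that substitution.
    smashed.backward(server_in.grad)
    client_opt.step()
    return loss.item()

# Example usage with random stand-in data.
x = torch.randn(32, 1, 28, 28)
y = torch.randint(0, 10, (32,))
print(training_step(x, y))

Note that in this protocol the raw inputs never leave the client; only the cut-layer activations and the corresponding gradients are exchanged, which is why the returned gradients are the channel a training-hijacking server exploits.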

References

  1. George J. Annas. 2003. HIPAA Regulations - A New Era of Medical-Record Privacy? New England Journal of Medicine, Vol. 348, 15 (April 2003), 1486-1490. https://doi.org/10.1056/NEJMlim035027
  2. Keith Bonawitz, Hubert Eichner, Wolfgang Grieskamp, Dzmitry Huba, Alex Ingerman, Vladimir Ivanov, Chloe Kiddon, Jakub Konečný, Stefano Mazzocchi, H. Brendan McMahan, Timon Van Overveldt, David Petrou, Daniel Ramage, and Jason Roselander. 2019. Towards Federated Learning at Scale: System Design. arXiv preprint arXiv:1902.01046 (2019).
  3. Cynthia Dwork, Aaron Roth, et al. 2014. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science, Vol. 9, 3-4 (2014), 211-407.
  4. Ege Erdogan, Alptekin Kupcu, and A. Ercument Cicek. 2021. UnSplit: Data-Oblivious Model Inversion, Model Stealing, and Label Inference Attacks Against Split Learning. arXiv preprint arXiv:2108.09033 (2021).
  5. Grzegorz Gawron and Philip Stubbings. 2022. Feature Space Hijacking Attacks against Differentially Private Split Learning. arXiv preprint arXiv:2201.04018 (2022).
  6. Ian Goodfellow, Yoshua Bengio, and Aaron Courville. 2016. Deep Learning. MIT Press. http://www.deeplearningbook.org
  7. Otkrist Gupta and Ramesh Raskar. 2018. Distributed learning of deep neural network over multiple agents. arXiv preprint arXiv:1810.06060 (2018).
  8. Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 770-778.
  9. Dimitris Kalimeris, Gal Kaplun, Preetum Nakkiran, Benjamin Edelman, Tristan Yang, Boaz Barak, and Haofeng Zhang. 2019. SGD on neural networks learns functions of increasing complexity. Advances in Neural Information Processing Systems, Vol. 32 (2019), 3496-3506.
  10. Diederik P. Kingma and Jimmy Ba. 2017. Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:1412.6980 (2017).
  11. Jakub Konečný, H. Brendan McMahan, Daniel Ramage, and Peter Richtárik. 2016. Federated Optimization: Distributed Machine Learning for On-Device Intelligence. arXiv preprint arXiv:1610.02527 (2016).
  12. Jakub Konečný, H. Brendan McMahan, Felix X. Yu, Peter Richtárik, Ananda Theertha Suresh, and Dave Bacon. 2017. Federated Learning: Strategies for Improving Communication Efficiency. arXiv preprint arXiv:1610.05492 (2017).
  13. Alex Krizhevsky, Geoffrey Hinton, et al. 2009. Learning multiple layers of features from tiny images. (2009).
  14. Yann LeCun, Corinna Cortes, and CJ Burges. 2010. MNIST handwritten digit database. ATT Labs [Online]. Available: http://yann.lecun.com/exdb/mnist, Vol. 2 (2010).
  15. Rebecca T. Mercuri. 2004. The HIPAA-potamus in health care data security. Commun. ACM, Vol. 47, 7 (July 2004), 25-28. https://doi.org/10.1145/1005817.1005840
  16. Dario Pasquini, Giuseppe Ateniese, and Massimo Bernaschi. 2021. Unleashing the tiger: Inference attacks on split learning. In ACM CCS. 2113-2129.
  17. Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In NeurIPS. 8024-8035. http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf
  18. Sebastian Ruder. 2017. An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747 (2017).
  19. Abhishek Singh, Praneeth Vepakomma, Otkrist Gupta, and Ramesh Raskar. 2019. Detailed comparison of communication efficiency of split learning and federated learning. arXiv preprint arXiv:1909.09145 (2019).
  20. Neil C Thompson, Kristjan Greenewald, Keeheon Lee, and Gabriel F Manso. 2020. The computational limits of deep learning. arXiv preprint arXiv:2007.05558 (2020).
  21. Praneeth Vepakomma, Otkrist Gupta, Tristan Swedish, and Ramesh Raskar. 2018a. Split learning for health: Distributed deep learning without sharing raw patient data. arXiv preprint arXiv:1812.00564 (2018).
  22. Praneeth Vepakomma, Tristan Swedish, Ramesh Raskar, Otkrist Gupta, and Abhimanyu Dubey. 2018b. No peek: A survey of private distributed deep learning. arXiv preprint arXiv:1812.03288 (2018).
  23. Han Xiao, Kashif Rasul, and Roland Vollgraf. 2017. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747 (2017).
