SplitGuard: Detecting and Mitigating Training-Hijacking Attacks in Split Learning

ABSTRACT
Distributed deep learning frameworks such as split learning reduce the computational cost of training deep neural networks and enable privacy-aware use of the collective data of a group of data holders. Split learning, in particular, achieves this goal by dividing a neural network between a client and a server so that the client computes the initial set of layers and the server computes the rest. However, this method introduces a unique attack vector for a malicious server attempting to steal the client's private data: the server can direct the client model towards learning any task of its choice, e.g. towards outputting easily invertible values. With a concrete example already proposed (Pasquini et al., CCS '21), such training-hijacking attacks present a significant risk for the data privacy of split learning clients. In this paper, we propose SplitGuard, a method by which a split learning client can detect whether it is being targeted by a training-hijacking attack. We experimentally evaluate our method's effectiveness, compare it with potential alternatives, and discuss in detail various points related to its use. We conclude that SplitGuard can effectively detect training-hijacking attacks while minimizing the amount of information recovered by the adversaries.
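The split learning setup described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the layer sizes, the linear/ReLU architecture, and the class names are assumptions made for the example. The key property it shows is that only the intermediate activation at the cut layer (the "smashed data") crosses the client-server boundary, never the raw input.

```python
import numpy as np

rng = np.random.default_rng(0)

class ClientModel:
    """First part of the split network, held by the data owner."""
    def __init__(self, in_dim, cut_dim):
        self.W = rng.standard_normal((in_dim, cut_dim)) * 0.1

    def forward(self, x):
        # Only this intermediate activation ("smashed data") is sent
        # to the server; the raw input x never leaves the client.
        return np.maximum(x @ self.W, 0.0)

class ServerModel:
    """Remaining layers, held by the (possibly malicious) server."""
    def __init__(self, cut_dim, out_dim):
        self.W = rng.standard_normal((cut_dim, out_dim)) * 0.1

    def forward(self, h):
        return h @ self.W

client = ClientModel(in_dim=8, cut_dim=4)
server = ServerModel(cut_dim=4, out_dim=3)

x = rng.standard_normal((2, 8))   # private client data (batch of 2)
smashed = client.forward(x)       # crosses the network boundary
logits = server.forward(smashed)  # server completes the forward pass
```

During training, gradients flow back along the same path: the server sends the gradient with respect to the smashed data back to the client, which is exactly the channel a training-hijacking server abuses to steer the client model towards a task of its own choosing.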
REFERENCES
- George J. Annas. 2003. HIPAA Regulations - A New Era of Medical-Record Privacy? New England Journal of Medicine, Vol. 348, 15 (April 2003), 1486--1490. https://doi.org/10.1056/NEJMlim035027
- Keith Bonawitz, Hubert Eichner, Wolfgang Grieskamp, Dzmitry Huba, Alex Ingerman, Vladimir Ivanov, Chloe Kiddon, Jakub Konečný, Stefano Mazzocchi, H. Brendan McMahan, Timon Van Overveldt, David Petrou, Daniel Ramage, and Jason Roselander. 2019. Towards Federated Learning at Scale: System Design. arXiv:1902.01046 [cs, stat] (March 2019). http://arxiv.org/abs/1902.01046
- Cynthia Dwork, Aaron Roth, et al. 2014. The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci., Vol. 9, 3--4 (2014), 211--407.
- Ege Erdogan, Alptekin Kupcu, and A. Ercument Cicek. 2021. UnSplit: Data-Oblivious Model Inversion, Model Stealing, and Label Inference Attacks Against Split Learning. arXiv preprint arXiv:2108.09033 (2021).
- Grzegorz Gawron and Philip Stubbings. 2022. Feature Space Hijacking Attacks against Differentially Private Split Learning. arXiv preprint arXiv:2201.04018 (2022).
- Ian Goodfellow, Yoshua Bengio, and Aaron Courville. 2016. Deep Learning. MIT Press. http://www.deeplearningbook.org
- Otkrist Gupta and Ramesh Raskar. 2018. Distributed learning of deep neural network over multiple agents. arXiv:1810.06060 [cs, stat] (Oct. 2018). http://arxiv.org/abs/1810.06060
- Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition. 770--778.
- Dimitris Kalimeris, Gal Kaplun, Preetum Nakkiran, Benjamin Edelman, Tristan Yang, Boaz Barak, and Haofeng Zhang. 2019. SGD on neural networks learns functions of increasing complexity. Advances in Neural Information Processing Systems, Vol. 32 (2019), 3496--3506.
- Diederik P. Kingma and Jimmy Ba. 2017. Adam: A Method for Stochastic Optimization. arXiv:1412.6980 [cs] (Jan. 2017). http://arxiv.org/abs/1412.6980
- Jakub Konečný, H. Brendan McMahan, Daniel Ramage, and Peter Richtárik. 2016. Federated Optimization: Distributed Machine Learning for On-Device Intelligence. arXiv:1610.02527 [cs] (Oct. 2016). http://arxiv.org/abs/1610.02527
- Jakub Konečný, H. Brendan McMahan, Felix X. Yu, Peter Richtárik, Ananda Theertha Suresh, and Dave Bacon. 2017. Federated Learning: Strategies for Improving Communication Efficiency. arXiv:1610.05492 [cs] (Oct. 2017). http://arxiv.org/abs/1610.05492
- Alex Krizhevsky, Geoffrey Hinton, et al. 2009. Learning multiple layers of features from tiny images. (2009).
- Yann LeCun, Corinna Cortes, and CJ Burges. 2010. MNIST handwritten digit database. ATT Labs [Online]. Available: http://yann.lecun.com/exdb/mnist, Vol. 2 (2010).
- Rebecca T. Mercuri. 2004. The HIPAA-potamus in health care data security. Commun. ACM, Vol. 47, 7 (July 2004), 25--28. https://doi.org/10.1145/1005817.1005840
- Dario Pasquini, Giuseppe Ateniese, and Massimo Bernaschi. 2021. Unleashing the tiger: Inference attacks on split learning. In ACM CCS. 2113--2129.
- Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In NeurIPS. 8024--8035. http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf
- Sebastian Ruder. 2017. An overview of gradient descent optimization algorithms. arXiv:1609.04747 [cs] (June 2017). http://arxiv.org/abs/1609.04747
- Abhishek Singh, Praneeth Vepakomma, Otkrist Gupta, and Ramesh Raskar. 2019. Detailed comparison of communication efficiency of split learning and federated learning. arXiv preprint arXiv:1909.09145 (2019).
- Neil C Thompson, Kristjan Greenewald, Keeheon Lee, and Gabriel F Manso. 2020. The computational limits of deep learning. arXiv preprint arXiv:2007.05558 (2020).
- Praneeth Vepakomma, Otkrist Gupta, Tristan Swedish, and Ramesh Raskar. 2018a. Split learning for health: Distributed deep learning without sharing raw patient data. arXiv preprint arXiv:1812.00564 (2018).
- Praneeth Vepakomma, Tristan Swedish, Ramesh Raskar, Otkrist Gupta, and Abhimanyu Dubey. 2018b. No peek: A survey of private distributed deep learning. arXiv preprint arXiv:1812.03288 (2018).
- Han Xiao, Kashif Rasul, and Roland Vollgraf. 2017. Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747 (2017).