DOI: 10.1145/3556557.3557951
Research article

Efficient federated learning under non-IID conditions with attackers

Published: 22 November 2022

ABSTRACT

Federated learning (FL) has recently attracted much attention due to its advantages for data privacy. But every coin has two sides: protecting users' data (by never requiring users to upload it) also makes FL more vulnerable to certain attacks, such as targeted and untargeted attacks. Many robust FL algorithms have therefore been proposed to preserve training accuracy under such attacks. Some existing solutions assume that data is independent and identically distributed (i.i.d.) in order to simplify the problem. But restricting the data distribution to i.i.d. hinders the practical application of FL, and FL under non-i.i.d. conditions is the more general setting. Designing an efficient robust algorithm for FL under non-i.i.d. data, however, faces two additional challenges: identifying malicious clients and guaranteeing model accuracy. To tackle these challenges, we propose a new FL workflow named Cominer, which consists of a Label Cluster (LC) process and a Vertical Comparison (VC) process. LC addresses the accuracy loss caused by non-i.i.d. data diversity by classifying all clients into multiple clusters, and VC then identifies and eliminates malicious clients within each cluster. We verify the improvement in accuracy achieved by Cominer in a series of experiments, and show that under non-i.i.d. conditions Cominer not only improves the accuracy of the federated model over previous algorithms by up to 24.85%, but also remains highly resilient to different kinds of attacks while maintaining accuracy above 80%.
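The paper's full algorithm is not reproduced on this page, but the cluster-then-filter idea the abstract describes can be sketched as follows. This is an illustrative sketch only, not Cominer's actual implementation: the function names, the k-means-style clustering over per-client label histograms, and the median-distance trimming rule are all assumptions standing in for the LC and VC processes.

```python
import numpy as np

def cluster_by_labels(label_hists, n_clusters=2, iters=10):
    """Tiny k-means over per-client label histograms (farthest-point init)."""
    X = np.asarray(label_hists, dtype=float)
    # Farthest-point initialization keeps the sketch deterministic.
    centers = [X[0]]
    for _ in range(n_clusters - 1):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[int(np.argmax(d))])
    centers = np.array(centers)
    assign = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        assign = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for k in range(n_clusters):
            if np.any(assign == k):
                centers[k] = X[assign == k].mean(axis=0)
    return assign

def robust_aggregate(updates, label_hists, n_clusters=2, trim_frac=0.25):
    """Cluster clients, drop the most deviant updates per cluster, average."""
    updates = np.asarray(updates, dtype=float)
    assign = cluster_by_labels(label_hists, n_clusters)
    kept = []
    for k in range(n_clusters):
        idx = np.where(assign == k)[0]
        if len(idx) == 0:
            continue
        # "Vertical comparison" stand-in: distance to the per-cluster median.
        med = np.median(updates[idx], axis=0)
        dist = np.linalg.norm(updates[idx] - med, axis=1)
        n_drop = int(len(idx) * trim_frac)
        keep = idx[np.argsort(dist)[:len(idx) - n_drop]] if n_drop else idx
        kept.extend(keep.tolist())
    return updates[kept].mean(axis=0), sorted(kept)

# Example: five honest clients around two non-i.i.d. data modes, plus one
# attacker (client 5) whose label histogram mimics the first mode but whose
# model update is wildly off.
hists = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8], [0.15, 0.85], [0.9, 0.1]]
ups = [[1.0, 1.0], [1.1, 0.9], [-1.0, 1.0], [-0.9, 1.1], [-1.1, 0.9], [50.0, -50.0]]
agg, kept = robust_aggregate(ups, hists, n_clusters=2, trim_frac=0.34)
```

Here the attacker's update is the farthest from its cluster's median, so it is trimmed before averaging, while the two honest data modes are aggregated separately rather than being averaged against each other.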


Published in

FedEdge '22: Proceedings of the 1st ACM Workshop on Data Privacy and Federated Learning Technologies for Mobile Edge Network
October 2022, 34 pages
ISBN: 9781450395212
DOI: 10.1145/3556557
Copyright © 2022 ACM
Publisher: Association for Computing Machinery, New York, NY, United States

