DOI: 10.1145/3474123.3486760
Research article, public access

Private Hierarchical Clustering and Efficient Approximation

Published: 15 November 2021

ABSTRACT

In collaborative learning, multiple parties contribute their datasets to jointly derive global machine learning models for numerous predictive tasks. Despite its efficacy, this learning paradigm fails to encompass critical application domains that involve highly sensitive data, such as healthcare and security analytics, where privacy risks limit entities to training models individually, using only their own datasets. In this work, we target privacy-preserving collaborative hierarchical clustering. We introduce a formal security definition that aims to balance utility and privacy, and we present a two-party protocol that provably satisfies it. We then extend our protocol with (i) an optimized version for single-linkage clustering and (ii) scalable approximation variants. We implement all our schemes and experimentally evaluate their performance and accuracy on synthetic and real datasets. For example, end-to-end execution of our secure approximate protocol on over 1M 10-dimensional data samples requires 35 seconds of computation and achieves 97.09% accuracy.
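For readers unfamiliar with the clustering task the abstract refers to, the following is a minimal plaintext sketch of agglomerative hierarchical clustering with single linkage. It is not the secure two-party protocol of the paper and offers no privacy: the two parties' points are simply pooled in the clear. The function name `single_linkage`, the variables `party_a` and `party_b`, and the toy data are all illustrative assumptions.

```python
import math
from itertools import combinations


def single_linkage(points, num_clusters):
    """Naive plaintext single-linkage clustering (illustration only, no privacy)."""
    # Start with every point in its own cluster, identified by its index.
    clusters = [[i] for i in range(len(points))]
    merges = []  # history of (cluster_a, cluster_b, linkage distance)

    def linkage(a, b):
        # Single linkage: distance between the closest pair across two clusters.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > num_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merges.append((clusters[x], clusters[y], linkage(clusters[x], clusters[y])))
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return clusters, merges


# Hypothetical toy inputs: each party holds a few 2-dimensional samples.
party_a = [(0.0, 0.0), (0.0, 1.0)]
party_b = [(5.0, 5.0), (5.0, 6.0), (0.5, 0.4)]
pooled = party_a + party_b  # pooling in the clear is what a private protocol avoids

clusters, merges = single_linkage(pooled, num_clusters=2)
print(sorted(sorted(c) for c in clusters))
```

The sketch recomputes linkage distances from scratch at every step, which is far too slow for large inputs; it is meant only to show what the joint output looks like when both datasets are clustered together.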

Supplemental Material

CCSW-66-meng.mp4 (mp4, 33.5 MB)


Published in

CCSW '21: Proceedings of the 2021 on Cloud Computing Security Workshop
November 2021
161 pages
ISBN: 9781450386531
DOI: 10.1145/3474123

Copyright © 2021 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall acceptance rate: 37 of 108 submissions, 34%
