Private Hierarchical Clustering and Efficient Approximation

Authors:
Xianrui Meng, Amazon Web Services, Seattle, WA, USA
Dimitrios Papadopoulos, Hong Kong University of Science and Technology, Hong Kong, Hong Kong
Alina Oprea, Northeastern University, Boston, MA, USA
Nikos Triandopoulos, Stevens Institute of Technology, Hoboken, NJ, USA

CCSW '21: Proceedings of the 2021 on Cloud Computing Security Workshop, November 2021, Pages 3–20. https://doi.org/10.1145/3474123.3486760

Published: 15 November 2021

ABSTRACT

In collaborative learning, multiple parties contribute their datasets to jointly deduce global machine learning models for numerous predictive tasks. Despite its efficacy, this learning paradigm fails to encompass critical application domains that involve highly sensitive data, such as healthcare and security analytics, where privacy risks limit entities to training models individually, using only their own datasets. In this work, we target privacy-preserving collaborative hierarchical clustering. We introduce a formal security definition that aims to balance utility and privacy, and we present a two-party protocol that provably satisfies it. We then extend our protocol with: (i) an optimized version for single-linkage clustering, and (ii) scalable approximation variants. We implement all our schemes and experimentally evaluate their performance and accuracy on synthetic and real datasets, obtaining very encouraging results. For example, end-to-end execution of our secure approximate protocol for over 1M 10-dimensional data samples requires 35 seconds of computation and achieves 97.09% accuracy.
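
The abstract refers to single-linkage hierarchical clustering without describing it. As a plaintext point of reference only, the following is a minimal sketch of naive agglomerative clustering with single linkage. It is not the paper's secure two-party protocol and involves no cryptography; the function name `single_linkage`, its parameters, the use of Euclidean distance, and the stopping rule (a target number of clusters) are all assumptions made for illustration.

```python
from itertools import combinations
import math


def single_linkage(points, num_clusters):
    """Naive agglomerative clustering with single linkage.

    Plaintext illustration only: no cryptography or privacy protection.
    Returns (clusters, merges), where clusters is a list of sorted index
    lists and merges records (cluster_a, cluster_b, distance) per step.
    """
    # Start with every point in its own cluster (clusters hold indices).
    clusters = [[i] for i in range(len(points))]
    merges = []

    def linkage(a, b):
        # Single linkage: distance between the closest pair across clusters.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > num_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: linkage(clusters[ab[0]], clusters[ab[1]]),
        )
        d = linkage(clusters[a], clusters[b])
        merges.append((clusters[a], clusters[b], d))
        merged = sorted(clusters[a] + clusters[b])
        # combinations yields a < b, so delete b first to keep a valid.
        del clusters[b]
        del clusters[a]
        clusters.append(merged)
    return clusters, merges


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 10), (10, 11), (5, 5)]
    final_clusters, merge_log = single_linkage(pts, 2)
    print(sorted(final_clusters))
    for left, right, dist in merge_log:
        print(left, right, round(dist, 3))
```

This sketch recomputes all pairwise linkages at every step, so it is only suitable for small inputs; it is meant to show what the merge sequence of single-linkage clustering looks like, not how to compute it privately or at scale.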

Supplemental Material

CCSW-66-meng.mp4 (mp4, 33.5 MB)

Index Terms

Computing methodologies → Machine learning → Learning paradigms → Unsupervised learning
Security and privacy → Cryptography

Published in
CCSW '21: Proceedings of the 2021 on Cloud Computing Security Workshop
November 2021
161 pages
ISBN: 9781450386531
DOI: 10.1145/3474123
Program Chairs: Yinqian Zhang (Southern University of Science and Technology), Marten van Dijk (Centrum Wiskunde & Informatica)
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
private hierarchical clustering
secure approximation
secure computation
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 37 of 108 submissions, 34%