ABSTRACT
In collaborative learning, multiple parties contribute their datasets to jointly derive global machine learning models for numerous predictive tasks. Despite its efficacy, this learning paradigm does not extend to critical application domains that involve highly sensitive data, such as healthcare and security analytics, where privacy risks restrict each entity to training models on its own dataset alone. In this work, we target privacy-preserving collaborative hierarchical clustering. We introduce a formal security definition that balances utility and privacy, and present a two-party protocol that provably satisfies it. We then extend our protocol with (i) an optimized version for single-linkage clustering and (ii) scalable approximation variants. We implement all our schemes and experimentally evaluate their performance and accuracy on synthetic and real datasets, with encouraging results. For example, end-to-end execution of our secure approximate protocol on over 1M 10-dimensional data samples requires 35 seconds of computation and achieves 97.09% accuracy.
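For readers unfamiliar with the underlying task, the following is a minimal plaintext sketch of single-linkage agglomerative clustering over the union of two parties' points. It is not the secure protocol described above and involves no cryptography; it only illustrates the functionality that such a protocol would compute jointly. The function name `single_linkage`, the toy datasets `party_a` and `party_b`, and the choice of Euclidean distance are assumptions made for illustration.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def single_linkage(points, num_clusters):
    """Naive single-linkage agglomerative clustering (illustrative only).

    Returns clusters as sorted lists of indices into `points`.
    """
    # Start with every point in its own cluster.
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > num_clusters:
        best = None
        # Single linkage: cluster distance is the minimum pairwise distance.
        for a, b in combinations(range(len(clusters)), 2):
            d = min(dist(points[i], points[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        # Merge the closest pair of clusters (a < b, so deleting b is safe).
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters


# Hypothetical toy inputs: each party holds a few 2-dimensional points.
party_a = [(0.0, 0.0), (0.1, 0.2)]
party_b = [(5.0, 5.0), (5.2, 4.9), (0.2, 0.1)]

# In the clear, the joint clustering is simply computed on the union.
joint = party_a + party_b
result = single_linkage(joint, 2)
print(result)
```

In this plaintext version both datasets are pooled in one place, which is exactly what the privacy-sensitive settings named in the abstract rule out; a secure two-party protocol would aim to produce the clustering output without either party revealing its raw points.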
Index Terms
- Private Hierarchical Clustering and Efficient Approximation