Copyright © 2006 Elsevier Inc. All rights reserved.
Received 11 October 2005;
Abstract
Standard algorithms for association rule mining are based on the identification of frequent itemsets. In this paper, we study how to maintain privacy in distributed mining of frequent itemsets. That is, we study how two (or more) parties can find frequent itemsets in a distributed database without revealing each party’s portion of the data to the others. The existing solution for vertically partitioned data leaks a significant amount of information, while the existing solution for horizontally partitioned data works only for three or more parties. We design algorithms for both vertically and horizontally partitioned data, with cryptographically strong privacy. We give two algorithms for vertically partitioned data: one reveals only the support count, and the other reveals nothing. Both have computational overhead linear in the number of transactions. Our algorithm for horizontally partitioned data works for two or more parties and is more efficient than the existing solution.
Keywords: Data mining; Association rules; Distributed databases; Privacy
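To make the quantity at stake concrete, the following is a minimal plaintext sketch of a support count over vertically partitioned data, where each party holds different item columns for the same transactions. It is not one of the paper's protocols and offers no privacy; it only illustrates the value that a privacy-preserving protocol would have to compute without exchanging the vectors in the clear. The item names, the data, the threshold `min_support`, and the helper names `local_indicator` and `support_count` are all assumptions made for illustration.

```python
from typing import Dict, List


def local_indicator(columns: Dict[str, List[int]], items: List[str], n: int) -> List[int]:
    """AND together one party's 0/1 columns for the given items.

    Entry t is 1 exactly when transaction t contains every listed item
    held by this party.
    """
    vec = [1] * n
    for item in items:
        vec = [v & c for v, c in zip(vec, columns[item])]
    return vec


def support_count(vec_a: List[int], vec_b: List[int]) -> int:
    """Scalar product of two 0/1 vectors: the number of transactions
    that contain the whole itemset across both parties."""
    return sum(a * b for a, b in zip(vec_a, vec_b))


# Hypothetical data: 5 transactions, columns split between two parties.
n = 5
party_a = {"bread": [1, 1, 0, 1, 1], "milk": [1, 0, 1, 1, 1]}
party_b = {"eggs": [1, 1, 1, 0, 1]}

x = local_indicator(party_a, ["bread", "milk"], n)  # computed locally by party A
y = local_indicator(party_b, ["eggs"], n)           # computed locally by party B

# In plaintext this step would expose x and y; a private protocol must avoid that.
count = support_count(x, y)

min_support = 2  # assumed threshold, expressed as an absolute count
is_frequent = count >= min_support
print(count, is_frequent)
```

In this sketch the only cross-party step is the scalar product, which is why the abstract distinguishes an algorithm that reveals the support count from one that reveals nothing beyond whether the itemset is frequent.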
Article Outline
- 1. Introduction
- 1.1. Related work
- 1.2. Our contributions
- 1.3. Paper organization
- 2. Technical preliminaries
- 2.1. Problem formulation
- 2.1.1. Association rule and frequent itemset
- 2.1.2. Matrix representation
- 2.1.3. Vertically partitioned and horizontally partitioned data
- 2.2. Model and definitions of privacy
- 2.2.1. Semi-honest model
- 2.2.2. Defining privacy
- 3. Weakly privacy-preserving algorithm for vertically partitioned data
- 3.1. Overview
- 3.2. Algorithm summary
- 3.3. Security analysis
- 3.4. Efficiency analysis
- 3.4.1. Computational overhead
- 3.4.2. Communication overhead
- 4. Strongly privacy-preserving algorithm for vertically partitioned data
- 4.1. Overview
- 4.1.1. Homomorphic encryption
- 4.1.2. Algorithm design
- 4.2. Algorithm summary
- 4.3. Security analysis
- 4.4. Efficiency analysis
- 4.4.1. Computational overhead
- 4.4.2. Communication overhead
- 5. Algorithm for horizontally partitioned data
- 5.1. Overview
- 5.2. Algorithm summary
- 5.3. Security analysis
- 5.4. Efficiency analysis
- 5.4.1. Computational overhead
- 5.4.2. Communication overhead
- 6. Extension to multiparty distributed mining
- 7. Conclusion and open problems
- Acknowledgements
- Appendix. Goldwasser–Micali encryption scheme
- References













