Information Sciences
Volume 177, Issue 2, 15 January 2007, Pages 490-503
 
doi:10.1016/j.ins.2006.08.010
Copyright © 2006 Elsevier Inc. All rights reserved.

Privacy-preserving algorithms for distributed mining of frequent itemsets

Sheng Zhong

Computer Science and Engineering Department, State University of New York at Buffalo, Amherst, NY 14260, USA

Received 11 October 2005; revised 25 July 2006; accepted 7 August 2006. Available online 1 September 2006.


Abstract

Standard algorithms for association rule mining are based on identifying frequent itemsets. In this paper, we study how to maintain privacy in distributed mining of frequent itemsets: that is, how two or more parties can find frequent itemsets in a distributed database without revealing each party’s portion of the data to the others. The existing solution for vertically partitioned data leaks a significant amount of information, while the existing solution for horizontally partitioned data works only for three or more parties. We design algorithms for both vertically and horizontally partitioned data, with cryptographically strong privacy. We give two algorithms for vertically partitioned data: one reveals only the support count, and the other reveals nothing. Both have computational overhead linear in the number of transactions. Our algorithm for horizontally partitioned data works for two or more parties and is more efficient than the existing solution.

Keywords: Data mining; Association rules; Distributed databases; Privacy
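
To make the quantities in the abstract concrete, the following is a minimal, non-private sketch of the support count under the two partitionings. It is not the paper's protocol: it has no cryptography at all, and the toy matrix and the function names (`support_count`, `local_indicator`, `vertical_support`, `horizontal_support`) are illustrative assumptions. It only shows what the parties jointly need to compute: with vertically partitioned data the support count is the scalar product of the parties' per-transaction indicator vectors, and with horizontally partitioned data it is the sum of the parties' local counts.

```python
# Illustrative toy data, not from the paper: rows are transactions,
# columns are items, and 1 means the item occurs in the transaction.
MATRIX = [
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 0, 1],
]


def support_count(rows, items):
    """Number of transactions (rows) that contain every item in `items`."""
    return sum(1 for row in rows if all(row[i] == 1 for i in items))


def local_indicator(rows, own_items):
    """Per-transaction 0/1 flags: does the row contain all of `own_items`?

    Only the columns listed in `own_items` are read, which models a party
    that holds just those columns.
    """
    return [int(all(row[i] == 1 for i in own_items)) for row in rows]


def vertical_support(rows, items_a, items_b):
    """Vertical split: each party holds some columns of every transaction.

    The joint support count equals the scalar product of the two parties'
    local indicator vectors.
    """
    flags_a = local_indicator(rows, items_a)  # computed by party A
    flags_b = local_indicator(rows, items_b)  # computed by party B
    return sum(a * b for a, b in zip(flags_a, flags_b))


def horizontal_support(partitions, items):
    """Horizontal split: each party holds whole rows; local counts add up."""
    return sum(support_count(part, items) for part in partitions)


itemset = [0, 3]  # in the vertical split, A holds item 0 and B holds item 3
central = support_count(MATRIX, itemset)
vertical = vertical_support(MATRIX, [0], [3])
horizontal = horizontal_support([MATRIX[:3], MATRIX[3:]], itemset)
print(central, vertical, horizontal)  # prints: 4 4 4
```

An itemset is frequent when this count reaches a chosen minimum support threshold. In this sketch each party's vectors and counts are visible in the clear, which is exactly what a privacy-preserving protocol must avoid.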

Article Outline

1. Introduction
1.1. Related work
1.2. Our contributions
1.3. Paper organization
2. Technical preliminaries
2.1. Problem formulation
2.1.1. Association rule and frequent itemset
2.1.2. Matrix representation
2.1.3. Vertically partitioned and horizontally partitioned data
2.2. Model and definitions of privacy
2.2.1. Semi-honest model
2.2.2. Defining privacy
3. Weakly privacy-preserving algorithm for vertically partitioned data
3.1. Overview
3.2. Algorithm summary
3.3. Security analysis
3.4. Efficiency analysis
3.4.1. Computational overhead
3.4.2. Communication overhead
4. Strongly privacy-preserving algorithm for vertically partitioned data
4.1. Overview
4.1.1. Homomorphic encryption
4.1.2. Algorithm design
4.2. Algorithm summary
4.3. Security analysis
4.4. Efficiency analysis
4.4.1. Computational overhead
4.4.2. Communication overhead
5. Algorithm for horizontally partitioned data
5.1. Overview
5.2. Algorithm summary
5.3. Security analysis
5.4. Efficiency analysis
5.4.1. Computational overhead
5.4.2. Communication overhead
6. Extension to multiparty distributed mining
6.1. Algorithm for vertically partitioned data
6.2. Algorithm for horizontally partitioned data
7. Conclusion and open problems
Acknowledgements
Appendix. Goldwasser–Micali encryption scheme
References
