Abstract
This paper proposes an efficient, high-accuracy solution to the problem of privacy-preserving clustering. The problem has been studied mainly through two approaches: data perturbation and secure multiparty computation. We focus on the data perturbation approach and propose a linear-time algorithm, based on 1-d clustering, to perturb the data. A performance study on real datasets from the UCI Machine Learning Repository shows that our approach achieves better accuracy, and hence lower distortion of the clustering result, than previous approaches.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cui, Y., Wong, W.K., Cheung, D.W. (2009). Privacy-Preserving Clustering with High Accuracy and Low Time Complexity. In: Zhou, X., Yokota, H., Deng, K., Liu, Q. (eds) Database Systems for Advanced Applications. DASFAA 2009. Lecture Notes in Computer Science, vol 5463. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00887-0_40
DOI: https://doi.org/10.1007/978-3-642-00887-0_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00886-3
Online ISBN: 978-3-642-00887-0
eBook Packages: Computer Science, Computer Science (R0)