Privacy-Preserving Data Mining Techniques: Survey and Challenges

  • Chapter
Discrimination and Privacy in the Information Society

Part of the book series: Studies in Applied Philosophy, Epistemology and Rational Ethics (SAPERE, volume 3)

Abstract

This chapter presents a brief summary and review of privacy-preserving data mining (PPDM). The review of existing approaches is structured along a tentative taxonomy of the field. The main axes of this taxonomy specify what kind of data is being protected and how the data is owned (centralized or distributed). We comment on the relationship between PPDM and preventing the discriminatory use of data mining techniques. We round off the chapter by discussing some of the new challenges emerging for PPDM as a field.

Author information

Corresponding author

Correspondence to Stan Matwin.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Matwin, S. (2013). Privacy-Preserving Data Mining Techniques: Survey and Challenges. In: Custers, B., Calders, T., Schermer, B., Zarsky, T. (eds) Discrimination and Privacy in the Information Society. Studies in Applied Philosophy, Epistemology and Rational Ethics, vol 3. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30487-3_11

  • DOI: https://doi.org/10.1007/978-3-642-30487-3_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-30486-6

  • Online ISBN: 978-3-642-30487-3

  • eBook Packages: Engineering, Engineering (R0)
