Flexible sensitive K-anonymization on transactions

World Wide Web

Abstract

In recent years, privacy breaches in published data have become a major concern. Removing personally identifying information alone is not sufficient to protect an individual’s privacy. Privacy-preserving data publishing aims to prevent re-identification while retaining the useful information in the published data. In this work, we propose a novel algorithm that handles sensitive items and quasi-identifier items separately in transactional data. The proposed algorithm maintains a privacy level of 1/k or stronger for transactional data. In numerical experiments, the proposed algorithm achieves better running time and better data utility.
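To make the 1/k privacy level concrete, the following is a minimal sketch of a generic k-anonymity check on transactions. It is not the algorithm proposed in the article, whose details are not given in the abstract. It assumes that every item not listed as sensitive is a quasi-identifier, and that a transaction's re-identification risk is 1 divided by the number of transactions sharing the same quasi-identifier items. The function names and the sample items are hypothetical.

```python
from collections import Counter


def smallest_group_size(transactions, sensitive_items):
    """Size of the smallest group of transactions that share the same
    quasi-identifier items (assumption: every non-sensitive item is a
    quasi-identifier). Returns 0 for an empty dataset."""
    sensitive = frozenset(sensitive_items)
    groups = Counter(frozenset(t) - sensitive for t in transactions)
    return min(groups.values()) if groups else 0


def is_k_anonymous(transactions, sensitive_items, k):
    """True if every quasi-identifier group has at least k members, so the
    worst-case re-identification risk is at most 1/k."""
    if not transactions:
        return True  # nothing is published, so nothing can be re-identified
    return smallest_group_size(transactions, sensitive_items) >= k


# Hypothetical market-basket data: two sensitive items, four transactions.
sensitive = {"hiv_test", "pregnancy_test"}
data = [
    {"bread", "milk", "hiv_test"},
    {"bread", "milk"},
    {"beer", "chips"},
    {"beer", "chips", "pregnancy_test"},
]

print(is_k_anonymous(data, sensitive, k=2))
print(is_k_anonymous(data, sensitive, k=3))
```

In this toy dataset each quasi-identifier itemset is shared by two transactions, so the check passes for k = 2 and fails for k = 3. An anonymization algorithm would generalize or suppress items until such a check passes.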

Author information

Corresponding author

Correspondence to Shyue-Liang Wang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Tsai, YC., Wang, SL., Ting, IH. et al. Flexible sensitive K-anonymization on transactions. World Wide Web 23, 2391–2406 (2020). https://doi.org/10.1007/s11280-020-00798-8

  • DOI: https://doi.org/10.1007/s11280-020-00798-8
