Safe disassociation of set-valued datasets

Journal of Intelligent Information Systems

Abstract

Disassociation is a bucketization-based anonymization technique that divides a set-valued dataset into several clusters to hide the link between individuals and their complete set of items. It increases the utility of the anonymized dataset but raises privacy concerns. One in particular, the cover problem, arises when items are tightly coupled. In this paper, we present safe disassociation, a technique that relies on partial suppression to overcome this privacy breach when disassociating set-valued datasets. Safe disassociation extends the k^m-anonymity privacy constraint to a bucketized dataset and copes with the cover problem. We describe an algorithm that achieves safe disassociation and report a set of experiments that demonstrate its efficiency.
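
As a rough illustration of the k^m-anonymity constraint mentioned in the abstract, the following Python sketch checks it on a plain (non-disassociated) set-valued dataset. It assumes the usual definition for set-valued data: every combination of at most m items that occurs in some record must occur in at least k records. The function name `is_km_anonymous` and the toy records are hypothetical, and this is not the safe disassociation algorithm described in the paper.

```python
from collections import Counter
from itertools import combinations


def is_km_anonymous(records, k, m):
    """Return True if every combination of at most m items that occurs in
    some record occurs in at least k records (assumed definition)."""
    counts = Counter()
    for record in records:
        items = sorted(set(record))  # canonical order so combinations match
        for size in range(1, min(m, len(items)) + 1):
            counts.update(combinations(items, size))
    return all(support >= k for support in counts.values())


# Toy transactions, made up purely for illustration.
records = [
    {"a", "b"},
    {"a", "b"},
    {"a", "b", "c"},
    {"a", "c"},
]

print(is_km_anonymous(records, k=2, m=1))
print(is_km_anonymous(records, k=2, m=2))
```

With these toy records, every single item appears in at least two records, but the pair (b, c) appears in only one, so the check would pass for m = 1 and fail for m = 2. Enumerating all combinations is exponential in m, so this brute-force form is only practical for small m.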

Notes

  1. In what follows, we use k^m-disassociation to denote a dataset that is disassociated and satisfies k^m-anonymity.

  2. Vertical partitioning creates k^m-anonymous record chunks.

  3. www.inmobiles.net

Acknowledgments

This work is funded by the InMobiles company (see Note 3) and the Labex ACTION program (contract ANR-11-LABX-01-01). Computations have been performed on the supercomputer facilities of the Mésocentre de calcul de Franche-Comté. Special thanks to Ms. Sara Barakat for her contribution to identifying the cover problem.

Author information

Corresponding author

Correspondence to Nancy Awad.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Awad, N., Al Bouna, B., Couchot, JF. et al. Safe disassociation of set-valued datasets. J Intell Inf Syst 53, 547–562 (2019). https://doi.org/10.1007/s10844-019-00568-7

  • DOI: https://doi.org/10.1007/s10844-019-00568-7
