DOI: 10.1145/3584372.3588665
research-article
Open Access

Differentially Private Data Release over Multiple Tables

Published: 18 June 2023

ABSTRACT

We study synthetic data release for answering multiple linear queries over a set of database tables in a differentially private way. Two special cases have been considered in the literature: how to release a synthetic dataset for answering multiple linear queries over a single table, and how to release the answer to a single counting (join size) query over a set of database tables. Compared to the single-table case, the join operator makes query answering challenging, since the sensitivity (i.e., by how much an individual data record can affect the answer) can be heavily amplified by complex join relationships. We present an algorithm for the general problem, and prove a lower bound showing that our general algorithm achieves parameterized optimality (up to logarithmic factors) on some simple queries (e.g., two-table join queries) in the most commonly used privacy parameter regimes. For the case of hierarchical joins, we present a data partition procedure that exploits the concept of uniformized sensitivities to further improve the utility.
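
To make the sensitivity amplification concrete, the following is a minimal sketch on a toy two-table join. It is not the algorithm from the paper; the function names `join_size` and `max_drop_from_r1`, the `(join_key, payload)` record layout, and the example data are all assumptions made for illustration. It shows that deleting one record changes a single-table count by 1, but can change the join size by as much as the number of matching records in the other table.

```python
from collections import Counter

def join_size(r1, r2):
    """Count pairs (t1, t2) from r1 x r2 that agree on the join key."""
    deg2 = Counter(key for key, _ in r2)  # degree of each key in r2
    return sum(deg2[key] for key, _ in r1)

def max_drop_from_r1(r1, r2):
    """Largest decrease in the join size caused by deleting one record of r1."""
    deg2 = Counter(key for key, _ in r2)
    return max((deg2[key] for key, _ in r1), default=0)

# Hypothetical toy instance: records are (join_key, payload) tuples.
r1 = [("a", 1), ("b", 2)]
r2 = [("a", i) for i in range(100)] + [("b", 0)]

full = join_size(r1, r2)
without_a = join_size([t for t in r1 if t[0] != "a"], r2)

print("join size:", full)
print("join size after deleting the 'a' record from r1:", without_a)
print("max change from one deletion in r1:", max_drop_from_r1(r1, r2))
```

In this instance a single deletion from `r1` removes 100 of the 101 join results, whereas the same deletion changes the count of `r1` itself by only 1. Noise calibrated to the worst case over all instances would therefore have to scale with the largest possible key degree.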

Published in

PODS '23: Proceedings of the 42nd ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
June 2023
392 pages
ISBN: 9798400701276
DOI: 10.1145/3584372

Copyright © 2023 ACM


Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 642 of 2,707 submissions, 24%
