
Remote Data Access and the Risk of Disclosure from Linear Regression: An Empirical Study

  • Conference paper
Privacy in Statistical Databases (PSD 2010)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 6344))


Abstract

In the search for ways to give researchers outside statistical agencies easier access to data, remote data access appears to be an attractive alternative to the current standard of either altering the data substantially before release or allowing access only at designated data archives or research data centers. Researchers often do not accept data perturbation because they do not trust results obtained from altered data sets. On-site access, in turn, places a heavy burden on both the researcher and the data-providing agency in terms of time and money. Remote data access or remote analysis servers, which allow researchers to submit queries without actually seeing the microdata, have the potential to overcome both disadvantages. However, even if the microdata are not directly available to the researcher, disclosure of sensitive information about individual survey respondents is still possible.

In this paper we illustrate how an intruder could use commonly available background information to reveal sensitive information using simple linear regression. We demonstrate the real risks of this approach with an empirical evaluation based on a German establishment survey, the IAB Establishment Panel. Although this kind of attack can easily be prevented once the agency is aware of the problem, this small simulation aims to emphasize that there may be many ways to obtain sensitive information using multivariate analysis, and not all of them are obvious. Agencies considering implementing some form of remote data access should therefore consider carefully which queries the system should allow.
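The abstract does not spell out the mechanics of the attack, so the following is only a minimal sketch of one way a regression query could leak a single value, not necessarily the procedure evaluated in the paper. It assumes the intruder knows a background value that singles out one establishment and that the server lets users regress on an indicator derived from it. The function names `remote_ols` and `dummy_attack`, the variables `employees` and `turnover`, and all simulated numbers are hypothetical.

```python
import numpy as np

def remote_ols(y, X):
    """Hypothetical remote server: fits OLS and returns only the coefficients."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def dummy_attack(background, sensitive, known_value):
    """Recover the sensitive value of the one unit matching known_value.

    In a real setting the data would stay on the server and the intruder
    would only submit the model specification; both arrays are passed in
    here purely to simulate that exchange.
    """
    # Indicator that is 1 only for the unit the intruder can single out.
    indicator = (background == known_value).astype(float)
    if indicator.sum() != 1:
        raise ValueError("background value does not single out exactly one unit")
    X = np.column_stack([np.ones_like(indicator), indicator])
    intercept, shift = remote_ols(sensitive, X)
    # The intercept is the mean of all other units, so intercept + shift
    # is the fitted value for the target, which equals its true value.
    return intercept + shift

# Simulated microdata (assumption: not the IAB Establishment Panel).
rng = np.random.default_rng(0)
n = 200
employees = rng.integers(5, 500, size=n)
employees[17] = 12000  # one unusually large, publicly known establishment
turnover = 50.0 * employees + rng.normal(0, 1000, size=n)

recovered = dummy_attack(employees, turnover, 12000)
print(recovered, turnover[17])
```

Because the indicator gives the target unit a leverage of one, its fitted value reproduces the confidential response exactly. A server could block this particular query by refusing regressors that isolate a single record, which is consistent with the abstract's remark that such attacks are easy to prevent once the agency is aware of them.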

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bleninger, P., Drechsler, J., Ronning, G. (2010). Remote Data Access and the Risk of Disclosure from Linear Regression: An Empirical Study. In: Domingo-Ferrer, J., Magkos, E. (eds) Privacy in Statistical Databases. PSD 2010. Lecture Notes in Computer Science, vol 6344. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15838-4_20

  • DOI: https://doi.org/10.1007/978-3-642-15838-4_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15837-7

  • Online ISBN: 978-3-642-15838-4

  • eBook Packages: Computer Science, Computer Science (R0)
