Rank-Based Record Linkage for Re-Identification Risk Assessment

  • Conference paper
Privacy in Statistical Databases (PSD 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9867)

Abstract

Data administrators have traditionally used record linkage to assess re-identification risk before releasing anonymized microdata sets. In this paper we describe a record linkage procedure based on ranks, and we compare the performance of this rank-based record linkage (RBRL) against the more common distance-based record linkage (DBRL) in re-identifying records masked with several different masking methods. We try to elicit the reasons why RBRL performs better than DBRL for certain methods and worse for others.
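
To make the distinction between the two linkage approaches concrete, the following is a minimal sketch and not the procedure defined in the paper. It assumes that distance-based linkage assigns each masked record to the original record at the smallest Euclidean distance, and that rank-based linkage does the same after every attribute value has been replaced by its rank within its column. The function names (`ranks`, `to_ranks`, `link`, `reidentification_rate`) and the toy data are hypothetical. Attribute standardization, which a practical distance-based implementation would likely add, is omitted for brevity.

```python
from math import dist


def ranks(column):
    """Return the 1-based rank of each value in a column (ties broken by position)."""
    order = sorted(range(len(column)), key=lambda i: column[i])
    result = [0] * len(column)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def to_ranks(records):
    """Replace every attribute value by its rank within its own column."""
    columns = list(zip(*records))
    ranked_columns = [ranks(col) for col in columns]
    return [tuple(row) for row in zip(*ranked_columns)]


def link(original, masked):
    """For each masked record, return the index of the closest original record."""
    return [
        min(range(len(original)), key=lambda i: dist(original[i], m))
        for m in masked
    ]


def reidentification_rate(original, masked, use_ranks=False):
    """Share of masked records linked back to their own original record.

    Assumes masked[i] is the masked version of original[i] (an assumption
    made for this sketch so that correct links can be counted).
    """
    if use_ranks:
        original, masked = to_ranks(original), to_ranks(masked)
    links = link(original, masked)
    return sum(1 for i, j in enumerate(links) if i == j) / len(masked)


# Invented toy data: the masking changes the values drastically
# but happens to preserve the ordering within each attribute.
original = [(10, 200), (20, 100), (30, 300), (40, 400)]
masked = [(1, 5), (2, 3), (3, 7), (4, 9)]

dbrl_rate = reidentification_rate(original, masked)
rbrl_rate = reidentification_rate(original, masked, use_ranks=True)
print(dbrl_rate, rbrl_rate)
```

On this invented data the rank-based variant links every record correctly because the masking preserves ranks, while the raw-distance variant does not. A masking method that perturbs ranks could reverse that outcome, so the example illustrates the mechanism only and says nothing about the results reported in the paper.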

Acknowledgments and Disclaimer

The second author is partly supported by the European Commission (projects H2020-644024 “CLARUS” and H2020-700540 “CANVAS”), by the Government of Catalonia (ICREA-Acadèmia prize and grant 2014 SGR 537) and by the Spanish Government (projects TIN2014-57364-C2-1-R “SmartGlacis” and TIN2015-70054-REDC). The second author leads the UNESCO Chair in Data Privacy, but the views expressed in this paper are the authors’ own and are not necessarily shared by UNESCO.

Author information

Corresponding author

Correspondence to Krishnamurty Muralidhar.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Muralidhar, K., Domingo-Ferrer, J. (2016). Rank-Based Record Linkage for Re-Identification Risk Assessment. In: Domingo-Ferrer, J., Pejić-Bach, M. (eds) Privacy in Statistical Databases. PSD 2016. Lecture Notes in Computer Science, vol 9867. Springer, Cham. https://doi.org/10.1007/978-3-319-45381-1_17

  • DOI: https://doi.org/10.1007/978-3-319-45381-1_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-45380-4

  • Online ISBN: 978-3-319-45381-1

  • eBook Packages: Computer Science, Computer Science (R0)
