Streamlining Personal Data Access Requests: From Obstructive Procedures to Automated Web Workflows

  • Conference paper
Web Engineering (ICWE 2023)

Abstract

Transparency and data portability are two core principles of modern privacy legislation such as the GDPR. From the regulatory perspective, providing individuals (data subjects) with access to their data is a main building block for implementing these principles. Unlike other privacy principles and their respective regulatory provisions, however, this right to data access has so far received only marginal technical attention. Processes for performing data subject access requests (DSARs) must thus still be executed manually, which keeps the concept of data access from unfolding its full potential.

To tackle this problem, we present an automated approach to the execution of DSARs, employing modern techniques of web automation. In particular, we propose a generic DSAR workflow model, a corresponding formal language for representing the particular workflows of different service providers (controllers), a publicly accessible and extendable workflow repository, and a browser-based execution engine, altogether providing “one-click” DSARs. To validate our approach and technical concepts, we examine, formalize and make publicly available the DSAR workflows of 15 widely used service providers and implement the execution engine in a publicly available browser extension. Altogether, we thereby pave the way for automated data subject access requests and lay the groundwork for a broad variety of subsequent technical means helping web users to better understand their privacy-related exposure to different service providers.

Notes

  1. The controller has to fulfill the request without undue delay and no later than one month after receiving it; this period may be extended by two further months if the data subject is informed of the extension within the first month (Art. 12 (3) GDPR).

  2. Examples of such DAP-specific text generators are datenanfragen.de or mydatadoneright.eu.

  3. Instructions for such processes can, for example, be found at justgetmydata.com.

  4. An API-based DAP is followed, e.g., by the aeon prototype (aeon.technology), which integrates a few big service providers.

  5. GAFA represents the four tech companies Google, Apple, Facebook, and Amazon.

  6. Respective provider-side processes are considered out of scope herein.

  7. github.com/DaSKITA/darpal.

  8. github.com/DaSKITA/darpal-documents.

  9. github.com/DaSKITA/dara-api.

  10. For instance, statistics on successful and failed local workflow executions are to be reported back to the repository in future versions. With this data, workflows that are likely outdated or dysfunctional could be marked accordingly.

  11. github.com/DaSKITA/dara-extension.

  12. www.automa.site.

  13. github.com/DaSKITA/dara-frontend.

References

  1. van der Aalst, W.M.P., Bichler, M., Heinzl, A.: Robotic process automation. Bus. Inf. Syst. Eng. 60(4), 269–272 (2018). https://doi.org/10.1007/s12599-018-0542-4

  2. Agostinelli, S., Lupia, M., Marrella, A., Mecella, M.: Automated generation of executable RPA scripts from user interface logs. In: Asatiani, A., et al. (eds.) BPM 2020. LNBIP, vol. 393, pp. 116–131. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58779-6_8

  3. Alizadeh, F., Jakobi, T., Boden, A., Stevens, G., Boldt, J.: GDPR reality check - claiming and investigating personally identifiable data from companies. In: 2020 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW), pp. 120–129 (2020). https://doi.org/10.1109/EuroSPW51379.2020.00025

  4. Amershi, S., Mahmud, J., Nichols, J., Lau, T., Ruiz, G.A.: LiveAction: automating web task model generation. ACM Trans. Interact. Intell. Syst. 3(3), 1–23 (2013). https://doi.org/10.1145/2533670.2533672

  5. Ausloos, J., Dewitte, P.: Shattering one-way mirrors - data subject access rights in practice. Int. Data Priv. Law 8(1), 4–28 (2018). https://doi.org/10.1093/idpl/ipy001

  6. Barman, S., Chasins, S., Bodik, R., Gulwani, S.: Ringer: web automation by demonstration. In: Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications, pp. 748–764. OOPSLA 2016, Association for Computing Machinery (2016). https://doi.org/10.1145/2983990.2984020

  7. Bigham, J.P., Lau, T., Nichols, J.: Trailblazer: enabling blind users to blaze trails through the web. In: Proceedings of the 14th International Conference on Intelligent User Interfaces, pp. 177–186 (2009)

  8. Bolin, M., Webber, M., Rha, P., Wilson, T., Miller, R.C.: Automation and customization of rendered web pages. In: Proceedings of the 18th Annual ACM Symposium on User Interface Software and Technology, UIST 2005, pp. 163–172. Association for Computing Machinery, New York (2005). https://doi.org/10.1145/1095034.1095062

  9. Bowyer, A., Holt, J., Go Jefferies, J., Wilson, R., Kirk, D., David Smeddinck, J.: Human-GDPR interaction: practical experiences of accessing personal data. In: Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, CHI 2022, pp. 1–19. Association for Computing Machinery (2022). https://doi.org/10.1145/3491102.3501947

  10. Bufalieri, L., Morgia, M.L., Mei, A., Stefa, J.: GDPR: when the right to access personal data becomes a threat. In: 2020 IEEE International Conference on Web Services (ICWS), pp. 75–83 (2020). https://doi.org/10.1109/ICWS49710.2020.00017

  11. Cagnazzo, M., Holz, T., Pohlmann, N.: GDPiRated – stealing personal information on- and offline. In: Sako, K., Schneider, S., Ryan, P.Y.A. (eds.) ESORICS 2019. LNCS, vol. 11736, pp. 367–386. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-29962-0_18

  12. Chasins, S., Barman, S., Bodik, R., Gulwani, S.: Browser record and replay as a building block for end-user web automation tools. In: Proceedings of the 24th International Conference on World Wide Web, WWW 2015 Companion, pp. 179–182. Association for Computing Machinery, New York (2015). https://doi.org/10.1145/2740908.2742849

  13. Cypher, A., Halbert, D.C.: Watch What I Do: Programming by Demonstration. MIT Press, Cambridge (1993)

  14. Di Martino, M., Meers, I., Quax, P., Andries, K., Lamotte, W.: Revisiting identification issues in GDPR ‘right of access’ policies: a technical and longitudinal analysis. Proc. Priv. Enhanc. Technol. 2022(2), 95–113 (2022)

  15. Di Martino, M., Robyns, P., Weyts, W., Quax, P., Lamotte, W., Andries, K.: Personal information leakage by abusing the GDPR ‘right of access’. In: Fifteenth Symposium on Usable Privacy and Security (SOUPS 2019), pp. 371–385. USENIX (2019)

  16. Dong, R., Huang, Z., Lam, I.I., Chen, Y., Wang, X.: WebRobot: web robotic process automation using interactive programming-by-demonstration. In: Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation, PLDI 2022, pp. 152–167. Association for Computing Machinery (2022). https://doi.org/10.1145/3519939.3523711

  17. Fialová, E.: Data portability and informational self-determination. Masaryk Univ. J. Law Technol. 8(1), 45–55 (2014)

  18. Gill, D., Metzger, J.: Data access through data portability. Eur. Data Prot. Law Rev. 8(2), 221–237 (2022)

  19. Grünewald, E., Pallas, F.: Datensouveränität für Verbraucher:innen: Technische Ansätze durch KI-basierte Transparenz und Auskunft im Kontext der DSGVO, pp. 1–17. Alexander Boden, Timo Jakobi, Gunnar Stevens, Christian Bala (Hgg.): Verbraucherdatenschutz - Technik und Regulation zur Unterstützung des Individuums (2021). https://doi.org/10.18418/978-3-96043-095-7_02

  20. Grünewald, E., Pallas, F.: TILT: A GDPR-aligned transparency information language and toolkit for practical privacy engineering. In: Proceedings of the 2021 Conference on Fairness, Accountability, and Transparency. FAccT 2021, Association for Computing Machinery, New York (2021). https://doi.org/10.1145/3442188.3445925

  21. Grünewald, E., Wille, P., Pallas, F., Borges, M.C., Ulbricht, M.R.: TIRA: an OpenAPI extension and toolbox for GDPR transparency in RESTful architectures. In: 2021 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW). IEEE Computer Society (2021)

  22. Hansen, M., Jensen, M.: A generic data model for implementing right of access requests. In: Gryszczyńska, A., Polański, P., Gruschka, N., Rannenberg, K., Adamczyk, M. (eds.) Privacy Technologies and Policy. APF 2022. Lecture Notes in Computer Science, pp. 3–22. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-07315-1_1

  23. Hildebrandt, M.: Behavioural biometric profiling and transparency enhancing tools. Fidis Deliverable 7.12 (2009). https://doi.org/10.13140/RG.2.2.21363.32808

  24. Janssen, H., Cobbe, J., Singh, J.: Personal information management systems: a user-centric privacy utopia? Internet Policy Rev. 9(4), 1–25 (2020)

  25. Joris, G., Mechant, P., De Marez, L.: Exercising the right of access: a benchmark for future GDPR evaluations. In: 70th Annual ICA Conference: Open Communication, Proceedings (2020)

  26. Lau, T., Wolfman, S.A., Domingos, P., Weld, D.S.: Programming by demonstration using version space algebra. Mach. Learn. 53, 111–156 (2003)

  27. Leno, V., Dumas, M., Maggi, F.M., La Rosa, M.: Multi-perspective process model discovery for robotic process automation. In: Proceedings of the Doctoral Consortium Papers Presented at the 30th International Conference on Advanced Information Systems Engineering (CAiSE), vol. 2114, pp. 37–45. CEUR-WS (2018)

  28. Leshed, G., Haber, E.M., Matthews, T., Lau, T.: CoScripter: automating & sharing how-to knowledge in the enterprise. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI 2008, pp. 1719–1728. Association for Computing Machinery, New York (2008). https://doi.org/10.1145/1357054.1357323

  29. Little, G., Lau, T.A., Cypher, A., Lin, J., Haber, E.M., Kandogan, E.: Koala: capture, share, automate, personalize business processes on the web. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI 2007, pp. 943–946. Association for Computing Machinery, New York (2007). https://doi.org/10.1145/1240624.1240767

  30. Mahieu, R., Asghari, H., van Eeten, M.: Collectively exercising the right of access: individual effort, societal effect. Internet Policy Rev. 7(3) (2018)

  31. Mickens, J., Elson, J., Howell, J.: Mugshot: deterministic capture and replay for javascript applications. In: Proceedings of the 7th USENIX Conference on Networked Systems Design and Implementation, p. 11. USENIX Association (2010)

  32. Murmann, P., Fischer-Hübner, S.: Tools for achieving usable ex post transparency: a survey. IEEE Access 5, 22965–22991 (2017)

  33. Pallas, F., Hartmann, D., Heinrich, P., Kipke, J., Grünewald, E.: Configurable per-query data minimization for privacy-compliant web APIs. In: Proceedings of the 2022 ICWE International Conference on Web Engineering, Bari (2022). https://doi.org/10.1007/978-3-031-09917-5_22

  34. Pallas, F., et al.: Towards application-layer purpose-based access control. In: Proceedings of the 35th Annual ACM Symposium on Applied Computing, pp. 1288–1296 (2020)

  35. Petelka, J., Oreglia, E., Finn, M., Srinivasan, J.: Generating practices: investigations into the double embedding of GDPR and data access policies. Proc. ACM Hum. Comput. Interact. 6(CSCW2), 1–26 (2022)

  36. Puzis, Y., Borodin, Y., Puzis, R., Ramakrishnan, I.: Predictive web automation assistant for people with vision impairments. In: Proceedings of the 22nd International Conference on World Wide Web, pp. 1031–1040 (2013)

  37. Schufrin, M., Reynolds, S.L., Kuijper, A., Kohlhammer, J.: A visualization interface to improve the transparency of collected personal data on the internet. IEEE Trans. Visual Comput. Graph. 27(2), 1840–1849 (2021). https://doi.org/10.1109/TVCG.2020.3028946

  38. Sharma, M., Angmo, R.: Web based automation testing and tools. Int. J. Comput. Sci. Inf. Technol. 5(1), 908–912 (2014)

  39. Urban, T., Tatang, D., Degeling, M., Holz, T., Pohlmann, N.: A study on subject data access in online advertising after the GDPR. In: Pérez-Solà, C., Navarro-Arribas, G., Biryukov, A., Garcia-Alfaro, J. (eds.) DPM CBT-2019. LNCS, vol. 11737, pp. 61–79. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-31500-9_5

  40. Veys, S., Serrano, D., Stamos, M., Herman, M., Reitinger, N., Mazurek, M.L., Ur, B.: Pursuing usable and useful data downloads under GDPR/CCPA access rights via co-design. In: SOUPS @ USENIX Security Symposium (2021)

Acknowledgements

We thank our students Majed Idilbi, Christopher Liebig, Ann-Sophie Messerschmid, Moriel Pevzner, Dominic Strempel, and Kjell Lillie-Stolze, who contributed to the initial proof-of-concept within the scope of a study project [19]. Special thanks go to Johanna Washington, who kindly supported us in conducting the user experiment.

The work behind this paper was partially conducted within the project DaSKITA, supported under grant no. 28V2307A19 by funds of the Federal Ministry for the Environment, Nature Conservation, Nuclear Safety and Consumer Protection (BMUV) based on a decision of the Parliament of the Federal Republic of Germany via the Federal Office for Agriculture and Food (BLE) under the innovation support program.

Corresponding author

Correspondence to Nicola Leschke.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Leschke, N., Kirsten, F., Pallas, F., Grünewald, E. (2023). Streamlining Personal Data Access Requests: From Obstructive Procedures to Automated Web Workflows. In: Garrigós, I., Murillo Rodríguez, J.M., Wimmer, M. (eds) Web Engineering. ICWE 2023. Lecture Notes in Computer Science, vol 13893. Springer, Cham. https://doi.org/10.1007/978-3-031-34444-2_9

  • DOI: https://doi.org/10.1007/978-3-031-34444-2_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-34443-5

  • Online ISBN: 978-3-031-34444-2

  • eBook Packages: Computer Science (R0)
