Scalable Discovery and Continuous Inventory of Personal Data at Rest in Cloud Native Systems

Paper title

Scalable Discovery and Continuous Inventory of Personal Data at Rest in Cloud Native Systems

Authors

Elias Grünewald, Leonard Schurbert

Abstract

Cloud native systems process large amounts of personal data through numerous and possibly multi-paradigmatic data stores (e.g., relational and non-relational databases). From a privacy engineering perspective, a core challenge is to keep track of every exact location where personal data is stored, as required by regulatory frameworks such as the European General Data Protection Regulation. In this paper, we present Teiresias, comprising i) a workflow pattern for scalable discovery of personal data at rest, and ii) a cloud native system architecture and open source prototype implementation of said workflow pattern. To this end, we enable a continuous inventory of personal data featuring transparency and accountability following DevOps/DevPrivOps practices. In particular, we scope version-controlled Infrastructure as Code definitions and cloud-based storage, and show how to integrate the process into CI/CD pipelines. Thereafter, we provide iii) a comparative performance evaluation demonstrating both appropriate execution times for real-world settings and a promising personal data detection accuracy that outperforms existing proprietary tools in public clouds.
