Paper Title

Decouple-and-Sample: Protecting Sensitive Information in Task-Agnostic Data Release

Paper Authors

Abhishek Singh, Ethan Garza, Ayush Chopra, Praneeth Vepakomma, Vivek Sharma, Ramesh Raskar

Paper Abstract

We propose Sanitizer, a framework for secure and task-agnostic data release. While releasing datasets continues to have a large impact on various applications of computer vision, that impact is mostly realized when data sharing is not inhibited by privacy concerns. We alleviate these concerns by sanitizing datasets in a two-stage process. First, we introduce a global decoupling stage that decomposes raw data into sensitive and non-sensitive latent representations. Second, we design a local sampling stage that synthetically generates sensitive information with differential privacy and merges it with the non-sensitive latent features, creating a useful representation while preserving privacy. This newly formed latent representation is a task-agnostic version of the original dataset with anonymized sensitive information. While most algorithms sanitize data in a task-dependent manner, the few task-agnostic sanitization techniques operate by censoring sensitive information. In this work, we show that a better privacy-utility trade-off is achieved if sensitive information can instead be synthesized privately. We validate the effectiveness of Sanitizer by outperforming state-of-the-art baselines on existing benchmark tasks and by demonstrating tasks that are not possible using existing techniques.
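The two-stage pipeline the abstract describes can be sketched in miniature. The sketch below is an illustrative toy, not the paper's method: `decouple` stands in for the learned global decoupling stage (the paper uses a trained encoder, not an index split), and `dp_sample` uses a plain Gaussian mechanism as a stand-in for the local differentially private sampling stage. The function names, the index-based split, and the sensitivity bound are all assumptions made for the sketch.

```python
import math
import random


def decouple(x, sensitive_dims):
    """Toy stand-in for the global decoupling stage: split a feature
    vector into sensitive and non-sensitive parts by index. (The paper
    learns this decomposition in latent space; this split is only
    illustrative.)"""
    s = [x[i] for i in sorted(sensitive_dims)]
    ns = [v for i, v in enumerate(x) if i not in sensitive_dims]
    return s, ns


def dp_sample(sensitive, epsilon, delta=1e-5, sensitivity=1.0):
    """Toy stand-in for the local sampling stage: release a noisy
    version of the sensitive part via the Gaussian mechanism, assuming
    the stated L2 sensitivity bound holds for the input."""
    sigma = sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / epsilon
    return [v + random.gauss(0.0, sigma) for v in sensitive]


def sanitize(x, sensitive_dims, epsilon):
    """Merge privately sampled sensitive features with the untouched
    non-sensitive ones into a single released representation."""
    s, ns = decouple(x, sensitive_dims)
    return ns + dp_sample(s, epsilon)


x = [1.0, 2.0, 3.0, 4.0]
released = sanitize(x, sensitive_dims={1, 3}, epsilon=1.0)
# released[0:2] are the non-sensitive features (passed through);
# released[2:4] are noisy surrogates for the sensitive features.
```

The key design point the abstract makes is visible even in this toy: only the sensitive part pays the differential-privacy noise cost, so the non-sensitive features retain full utility for downstream tasks.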
