Paper Title


FLAME: Differentially Private Federated Learning in the Shuffle Model

Paper Authors

Ruixuan Liu, Yang Cao, Hong Chen, Ruoyang Guo, Masatoshi Yoshikawa

Abstract


Federated Learning (FL) is a promising machine learning paradigm that enables the analyzer to train a model without collecting users' raw data. To ensure users' privacy, differentially private federated learning has been intensively studied. Existing works are mainly based on the \textit{curator model} or the \textit{local model} of differential privacy, but each has pros and cons. The curator model allows greater accuracy but requires a trusted analyzer. In the local model, where users randomize local data before sending it to the analyzer, no trusted analyzer is required, but accuracy is limited. In this work, by leveraging the \textit{privacy amplification} effect in the recently proposed shuffle model of differential privacy, we achieve the best of both worlds, i.e., the accuracy of the curator model and strong privacy without relying on any trusted party. We first propose an FL framework in the shuffle model and a simple protocol (SS-Simple) extended from existing work. We find that SS-Simple provides an insufficient privacy amplification effect in FL because the dimension of the model parameters is quite large. To address this challenge, we propose an enhanced protocol (SS-Double) that increases the privacy amplification effect via subsampling. Furthermore, to boost utility when the model size is greater than the user population, we propose an advanced protocol (SS-Topk) with gradient sparsification techniques. We also provide theoretical analysis and numerical evaluations of the privacy amplification of the proposed protocols. Experiments on a real-world dataset validate that SS-Topk improves testing accuracy by 60.7\% over local-model-based FL.
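The abstract's SS-Double and SS-Topk ideas rest on two ingredients: keeping only a few gradient coordinates per user, and the standard privacy-amplification-by-subsampling bound. A minimal sketch of these two ingredients, assuming hypothetical helper names (`topk_sparsify`, `subsampled_epsilon`); this is an illustration of the underlying techniques, not the paper's actual protocols:

```python
import numpy as np

def topk_sparsify(grad, k):
    """Zero all but the k largest-magnitude coordinates of a gradient vector.
    Illustrative helper; the paper's SS-Topk protocol involves further steps
    (randomization and shuffling) not shown here."""
    idx = np.argpartition(np.abs(grad), -k)[-k:]  # indices of k largest |g_i|
    out = np.zeros_like(grad)
    out[idx] = grad[idx]
    return out

def subsampled_epsilon(eps, q):
    """Standard amplification-by-subsampling bound for pure eps-DP:
    if each coordinate is reported with probability q, the effective
    privacy parameter is eps' = ln(1 + q * (e^eps - 1))."""
    return np.log(1.0 + q * np.expm1(eps))

# Example: a 1000-dimensional gradient, keep only 10 coordinates.
rng = np.random.default_rng(0)
g = rng.normal(size=1000)
s = topk_sparsify(g, 10)
print(np.count_nonzero(s))                        # 10 surviving coordinates
print(subsampled_epsilon(1.0, q=10 / 1000))       # far smaller than eps = 1.0
```

With a sampling rate of q = k/d = 0.01, the effective epsilon drops by roughly a factor of q for small eps, which is the amplification effect the abstract says SS-Simple lacks when d is large and k/d cannot be made small.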
