Paper Title

Towards Plausible Differentially Private ADMM Based Distributed Machine Learning

Authors

Jiahao Ding, Jingyi Wang, Guannan Liang, Jinbo Bi, Miao Pan

Abstract

The Alternating Direction Method of Multipliers (ADMM) and its distributed version have been widely used in machine learning. In the iterations of ADMM, model updates using local private data and model exchanges among agents impose critical privacy concerns. Despite some pioneering works to relieve such concerns, differentially private ADMM still confronts many research challenges. For example, the guarantee of differential privacy (DP) relies on the premise that the optimality of each local problem can be perfectly attained in each ADMM iteration, which may never happen in practice; the model trained by DP ADMM may therefore have low prediction accuracy. In this paper, we address these concerns by proposing a novel (Improved) Plausible differentially Private ADMM algorithm, called PP-ADMM and IPP-ADMM. In PP-ADMM, each agent approximately solves a perturbed optimization problem formulated from its local private data in each iteration, and then perturbs the approximate solution with Gaussian noise to provide the DP guarantee. To further improve model accuracy and convergence, the improved version IPP-ADMM adopts the sparse vector technique (SVT) to determine whether an agent should update its neighbors with the current perturbed solution. The agent computes the difference between the current solution and that of the last iteration; if the difference is larger than a threshold, it passes the solution to its neighbors, and otherwise the solution is discarded. Moreover, we propose to track the total privacy loss under zero-concentrated DP (zCDP) and provide a generalization performance analysis. Experiments on real-world datasets demonstrate that, under the same privacy guarantee, the proposed algorithms are superior to the state of the art in terms of model accuracy and convergence rate.
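
Below is a minimal conceptual sketch of the two mechanisms summarized in the abstract: the PP-ADMM local step (perturb the local objective, solve it only approximately, then add Gaussian output noise before sharing) and the IPP-ADMM SVT-style check (broadcast an update only if its noisy change from the last shared solution exceeds a noisy threshold). All function names, the logistic-style loss, and the noise and threshold parameters are illustrative assumptions for exposition, not the authors' implementation or calibrated privacy parameters.

```python
# Conceptual sketch only: illustrative PP-ADMM local update and IPP-ADMM
# SVT-style broadcast check. Noise scales and thresholds are placeholders,
# not privacy-calibrated values from the paper.
import numpy as np

rng = np.random.default_rng(0)

def local_objective_grad(x, A, b, neighbor_avg, rho, noise_vec):
    # Gradient of a perturbed local problem: a logistic-style loss on the
    # agent's private data (A, b), an ADMM proximal term pulling toward the
    # neighbors' average, and an objective-perturbation term <noise_vec, x>.
    z = A @ x
    grad_loss = A.T @ (1.0 / (1.0 + np.exp(-z)) - b) / len(b)
    return grad_loss + rho * (x - neighbor_avg) + noise_vec

def pp_admm_local_update(x, A, b, neighbor_avg, rho=1.0,
                         obj_noise_std=0.1, out_noise_std=0.1,
                         steps=50, lr=0.1):
    # (1) Perturb the local objective, (2) solve it only approximately with
    # a few gradient steps, (3) add Gaussian output noise before sharing.
    noise_vec = rng.normal(0.0, obj_noise_std, size=x.shape)
    for _ in range(steps):
        x = x - lr * local_objective_grad(x, A, b, neighbor_avg, rho, noise_vec)
    return x + rng.normal(0.0, out_noise_std, size=x.shape)

def should_broadcast(x_new, x_prev, threshold=0.05, svt_noise_std=0.02):
    # IPP-ADMM-style sparse vector check: send the update to neighbors only
    # if the (noisy) change since the last shared solution exceeds a (noisy)
    # threshold; otherwise the update is discarded.
    noisy_diff = np.linalg.norm(x_new - x_prev) + rng.normal(0.0, svt_noise_std)
    noisy_threshold = threshold + rng.normal(0.0, svt_noise_std)
    return noisy_diff > noisy_threshold

# Toy usage on synthetic private data for a single agent.
A = rng.normal(size=(100, 5))
b = (rng.random(100) < 0.5).astype(float)
x_prev = np.zeros(5)
x_new = pp_admm_local_update(x_prev, A, b, neighbor_avg=np.zeros(5))
print("broadcast update:", should_broadcast(x_new, x_prev))
```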
