Paper Title
An Equivalence Between Data Poisoning and Byzantine Gradient Attacks
Paper Authors
Paper Abstract
To study the resilience of distributed learning, the "Byzantine" literature considers a strong threat model in which workers can report arbitrary gradients to the parameter server. While this model has helped obtain several fundamental results, it is sometimes considered unrealistic when the workers are mostly trustworthy machines. In this paper, we show a surprising equivalence between this model and data poisoning, a threat widely regarded as much more realistic. More specifically, we prove that every gradient attack can be reduced to data poisoning in any personalized federated learning system with PAC guarantees (which we show are both desirable and realistic). This equivalence makes it possible to obtain new impossibility results on the resilience of any "robust" learning algorithm to data poisoning in highly heterogeneous applications, as corollaries of existing impossibility theorems on Byzantine machine learning. Moreover, using our equivalence, we derive a practical attack that we show (theoretically and empirically) to be very effective against classical personalized federated learning models.
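To make the contrast between the two threat models concrete, here is a minimal, self-contained sketch that is not taken from the paper: a toy federated least-squares problem in which a Byzantine worker reports an arbitrary vector to the parameter server, while a data-poisoning worker only rewrites its local dataset and otherwise follows the protocol honestly. All names, dimensions, learning rates, and the plain-mean aggregation rule are illustrative assumptions, not the paper's construction or experiments.

```python
# Illustrative sketch only: contrasts a Byzantine gradient attack with data
# poisoning on a toy federated linear-regression task. All constants are assumptions.
import numpy as np

rng = np.random.default_rng(0)
d, n_workers, n_local = 5, 10, 100
w_true = rng.normal(size=d)

# Each worker holds a local dataset (X_i, y_i).
datasets = []
for _ in range(n_workers):
    X = rng.normal(size=(n_local, d))
    y = X @ w_true + 0.1 * rng.normal(size=n_local)
    datasets.append((X, y))

def local_gradient(w, X, y):
    """Honest gradient of the local least-squares loss."""
    return X.T @ (X @ w - y) / len(y)

w = np.zeros(d)
byzantine_id, poisoner_id = 0, 1

# Data poisoning: the attacker only rewrites its *training data*;
# afterwards it follows the protocol and reports an honest gradient.
X_p, _ = datasets[poisoner_id]
datasets[poisoner_id] = (X_p, X_p @ (-w_true))  # mislabelled targets

for step in range(200):
    grads = []
    for i, (X, y) in enumerate(datasets):
        if i == byzantine_id:
            # Byzantine gradient attack: report an arbitrary vector,
            # unconstrained by any local dataset.
            grads.append(1e3 * rng.normal(size=d))
        else:
            grads.append(local_gradient(w, X, y))
    # Parameter server aggregates; a plain mean is not robust to either attack.
    w -= 0.05 * np.mean(grads, axis=0)

print("distance to true model:", np.linalg.norm(w - w_true))
```

Under this toy setup, both attackers prevent the averaged model from approaching `w_true`; the paper's result is much stronger, showing that in personalized federated learning with PAC guarantees any gradient attack can be emulated purely through such data poisoning.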