Paper Title

Provably Adversarially Robust Nearest Prototype Classifiers

Paper Authors

Václav Voráček, Matthias Hein

Paper Abstract

Nearest prototype classifiers (NPCs) assign to each input point the label of the nearest prototype with respect to a chosen distance metric. A direct advantage of NPCs is that their decisions are interpretable. Previous work could provide lower bounds on the minimal adversarial perturbation in the $\ell_p$-threat model when using the same $\ell_p$-distance for the NPC. In this paper we provide a complete discussion of the complexity when using $\ell_p$-distances for the decision and $\ell_q$-threat models for certification, for $p, q \in \{1, 2, \infty\}$. In particular, we provide scalable algorithms for the \emph{exact} computation of the minimal adversarial perturbation when using the $\ell_2$-distance, and improved lower bounds in the other cases. Using these efficient improved lower bounds, we train our Provably adversarially robust NPC (PNPC) for MNIST, which has better $\ell_2$-robustness guarantees than neural networks. Additionally, we show, to our knowledge, the first certification results w.r.t. the LPIPS perceptual metric, which has been argued to be a more realistic threat model for image classification than $\ell_p$-balls. On CIFAR10, our PNPC achieves higher certified robust accuracy than the empirical robust accuracy reported in (Laidlaw et al., 2021). The code is available in our repository.
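To make the setting concrete, here is a minimal sketch of the $\ell_2$-NPC decision rule and the standard pairwise certified radius: for the predicted-class prototype $p$ and a wrong-class prototype $q$, the signed $\ell_2$-distance from $x$ to their bisecting hyperplane is $(\|x-q\|^2 - \|x-p\|^2)/(2\|p-q\|)$, and minimizing the per-$q$ margin over wrong-class prototypes yields a lower bound on the minimal adversarial perturbation. This is the kind of lower bound the paper improves upon and, for $\ell_2$/$\ell_2$, replaces with an exact computation; the code below is an illustration, not the paper's algorithm, and the function names are hypothetical.

```python
import numpy as np

def npc_predict(x, prototypes, labels):
    """Assign to x the label of the nearest prototype in l2-distance."""
    dists = np.linalg.norm(prototypes - x, axis=1)
    return labels[np.argmin(dists)]

def l2_certified_radius(x, prototypes, labels):
    """Pairwise lower bound on the minimal l2 adversarial perturbation.

    For a wrong-class prototype q to become nearest, the perturbed point
    must cross the bisecting hyperplane of q and *every* correct-class
    prototype p, so the distance needed for q is at least
        max_p (||x - q||^2 - ||x - p||^2) / (2 ||p - q||).
    Taking the minimum over all wrong-class q gives a certified radius.
    """
    y = npc_predict(x, prototypes, labels)
    correct = prototypes[labels == y]
    wrong = prototypes[labels != y]
    radius = np.inf
    for q in wrong:
        margins = [
            (np.sum((x - q) ** 2) - np.sum((x - p) ** 2))
            / (2.0 * np.linalg.norm(p - q))
            for p in correct
        ]
        radius = min(radius, max(margins))
    return radius
```

For a single prototype per class this pairwise bound is tight; with several prototypes per class it can be loose, which is why exact computation of the minimal perturbation (as the paper provides for the $\ell_2$-distance) can certify larger radii.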
