Paper Title
Provably Adversarially Robust Nearest Prototype Classifiers
Paper Authors
Paper Abstract
Nearest prototype classifiers (NPCs) assign to each input point the label of the nearest prototype with respect to a chosen distance metric. A direct advantage of NPCs is that their decisions are interpretable. Previous work could provide lower bounds on the minimal adversarial perturbation in the $\ell_p$-threat model only when using the same $\ell_p$-distance for the NPC. In this paper we provide a complete discussion of the complexity when using $\ell_p$-distances for the decision and $\ell_q$-threat models for certification, for $p, q \in \{1, 2, \infty\}$. In particular, we provide scalable algorithms for the \emph{exact} computation of the minimal adversarial perturbation when using the $\ell_2$-distance, and improved lower bounds in the other cases. Using the efficient improved lower bounds, we train our Provably adversarially robust NPC (PNPC) for MNIST, which has better $\ell_2$-robustness guarantees than neural networks. Additionally, we show, to the best of our knowledge, the first certification results w.r.t. the LPIPS perceptual metric, which has been argued to be a more realistic threat model for image classification than $\ell_p$-balls. On CIFAR10, our PNPC has higher certified robust accuracy than the empirical robust accuracy reported in (Laidlaw et al., 2021). The code is available in our repository.
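To illustrate the NPC decision rule and the flavor of an $\ell_2$ robustness certificate, here is a minimal sketch. The function names are hypothetical, and the bisector-distance bound shown is only a simple lower bound on the minimal adversarial perturbation (crossing one bisector need not change the prediction), not the paper's exact algorithm:

```python
import numpy as np

def npc_predict(x, prototypes, labels, p=2):
    """NPC decision: assign the label of the nearest prototype under ell_p."""
    dists = np.linalg.norm(prototypes - x, ord=p, axis=1)
    return labels[np.argmin(dists)]

def l2_robustness_lower_bound(x, prototypes, labels):
    """Simple lower bound on the minimal ell_2 adversarial perturbation
    for an ell_2-distance NPC: the distance from x to the nearest
    bisecting hyperplane between its closest prototype p and any
    prototype q of a different class."""
    d2 = np.sum((prototypes - x) ** 2, axis=1)
    i = int(np.argmin(d2))          # nearest prototype -> predicted class
    y = labels[i]
    p_near = prototypes[i]
    qs = prototypes[labels != y]    # prototypes of all other classes
    # Distance from x to the bisector between p_near and q:
    # (||x - q||^2 - ||x - p||^2) / (2 ||p - q||)
    margins = (np.sum((x - qs) ** 2, axis=1) - d2[i]) / (
        2 * np.linalg.norm(qs - p_near, axis=1)
    )
    return float(margins.min())
```

With two prototypes `[0, 0]` (class 0) and `[2, 0]` (class 1), the point `[0.5, 0]` is predicted as class 0, and the bound returns its distance `0.5` to the bisecting hyperplane at $x_1 = 1$.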