LADDER: Latent Boundary-guided Adversarial Training

Paper title

LADDER: Latent Boundary-guided Adversarial Training

Authors

Zhou, Xiaowei; Tsang, Ivor W.; Yin, Jie

Abstract

Deep Neural Networks (DNNs) have recently achieved great success in many classification tasks. Unfortunately, they are vulnerable to adversarial attacks, which generate adversarial examples with small perturbations to fool DNN models, especially in model-sharing scenarios. Adversarial training, which injects adversarial examples into model training, has proven to be the most effective strategy for improving the robustness of DNN models against such attacks. However, adversarial training based on existing adversarial examples fails to generalize well to standard, unperturbed test data. To achieve a better trade-off between standard accuracy and adversarial robustness, we propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining (LADDER), which adversarially trains DNN models on latent boundary-guided adversarial examples. In contrast to most existing methods, which generate adversarial examples in the input space, LADDER generates a myriad of high-quality adversarial examples by adding perturbations to latent features. The perturbations are made along the normal of the decision boundary constructed by an SVM with an attention mechanism. We analyze the merits of the generated boundary-guided adversarial examples from a boundary-field perspective and through visualization. Extensive experiments and detailed analysis on MNIST, SVHN, CelebA, and CIFAR-10 validate the effectiveness of LADDER in achieving a better trade-off between standard accuracy and adversarial robustness compared with vanilla DNNs and competitive baselines.
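
To make the idea of perturbing along the normal of a decision boundary concrete, here is a minimal sketch that is not the authors' implementation. It assumes a linear boundary w·z + b = 0 has already been fitted in latent space (for example by a linear SVM), and it leaves out the attention mechanism and the step that maps perturbed latent features back to inputs. The function name perturb_along_normal and the values of w, b, z and eps are hypothetical, chosen only for illustration.

```python
import numpy as np


def perturb_along_normal(z, w, b, eps):
    """Shift latent vectors toward the hyperplane w.z + b = 0 along its normal.

    z:   (n, d) array of latent features (hypothetical values below)
    w:   (d,) weight vector of an already-fitted linear boundary (assumed given)
    b:   scalar bias of that boundary
    eps: step length, measured as Euclidean distance in latent space
    """
    z = np.asarray(z, dtype=float)
    w = np.asarray(w, dtype=float)
    w_norm = np.linalg.norm(w)
    unit_normal = w / w_norm
    # Signed distance of each latent point to the hyperplane.
    signed_dist = (z @ w + b) / w_norm
    # Step against the sign of the distance, i.e. toward the boundary.
    direction = -np.sign(signed_dist)[:, None] * unit_normal
    return z + eps * direction


# Illustrative boundary and latent points (assumptions, not from the paper).
w = np.array([3.0, 4.0])
b = -5.0
z = np.array([[3.0, 4.0], [0.0, 0.0]])

z_adv = perturb_along_normal(z, w, b, eps=1.0)
# Signed distances before and after the step.
dist_before = (z @ w + b) / np.linalg.norm(w)
dist_after = (z_adv @ w + b) / np.linalg.norm(w)
```

In this sketch every point moves exactly eps closer to the boundary, so a point lying within eps of the hyperplane reaches or crosses it. How LADDER actually chooses the step size and weights the direction is described in the paper itself, not here.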
