Paper Title

Constraining Logits by Bounded Function for Adversarial Robustness

Paper Authors

Sekitoshi Kanai, Masanori Yamada, Shin'ya Yamaguchi, Hiroshi Takahashi, Yasutoshi Ida

Paper Abstract

We propose a method for improving adversarial robustness by addition of a new bounded function just before softmax. Recent studies hypothesize that small logits (inputs of softmax) by logit regularization can improve adversarial robustness of deep learning. Following this hypothesis, we analyze norms of logit vectors at the optimal point under the assumption of universal approximation and explore new methods for constraining logits by addition of a bounded function before softmax. We theoretically and empirically reveal that small logits by addition of a common activation function, e.g., hyperbolic tangent, do not improve adversarial robustness since input vectors of the function (pre-logit vectors) can have large norms. From the theoretical findings, we develop the new bounded function. The addition of our function improves adversarial robustness because it makes logit and pre-logit vectors have small norms. Since our method only adds one activation function before softmax, it is easy to combine our method with adversarial training. Our experiments demonstrate that our method is comparable to logit regularization methods in terms of accuracies on adversarially perturbed datasets without adversarial training. Furthermore, it is superior or comparable to logit regularization methods and a recent defense method (TRADES) when using adversarial training.
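
The abstract describes the method only at a high level, so below is a minimal PyTorch sketch of the general idea: inserting a bounded activation just before softmax so that the logits cannot grow arbitrarily. The paper's actual bounded function is not reproduced here; the unit-norm rescaling of the pre-logit vector and the `scale` parameter are hypothetical stand-ins, chosen to reflect the abstract's argument that both the logit and pre-logit vectors should have small norms (plain tanh alone fails because the pre-logit norm can still be large).

```python
import torch
import torch.nn as nn

class BoundedLogits(nn.Module):
    """Hypothetical sketch of a bounded activation placed just before softmax.

    Not the paper's exact function: we first rescale each pre-logit vector to
    unit L2 norm (keeping the pre-logit norm small, the failure mode the
    abstract attributes to plain tanh) and then apply a scaled tanh so the
    logits themselves are bounded by `scale`.
    """

    def __init__(self, scale: float = 1.0, eps: float = 1e-8):
        super().__init__()
        self.scale = scale  # assumed hyperparameter bounding the logit range
        self.eps = eps      # avoids division by zero for all-zero pre-logits

    def forward(self, pre_logits: torch.Tensor) -> torch.Tensor:
        # Rescale each pre-logit vector to unit L2 norm.
        normed = pre_logits / (pre_logits.norm(dim=-1, keepdim=True) + self.eps)
        # Bound the logits elementwise; softmax is applied later by the loss.
        return self.scale * torch.tanh(normed)

# Example usage (shapes are illustrative):
# model = nn.Sequential(backbone, nn.Linear(feature_dim, num_classes), BoundedLogits())
# loss = nn.CrossEntropyLoss()(model(x), y)  # cross-entropy applies softmax internally
```

Because the method is a single extra activation before softmax, it composes with any standard training loop, including adversarial training, without changing the loss or the rest of the architecture, which is the ease-of-combination point the abstract emphasizes.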
