Revisiting Ensembles in an Adversarial Context: Improving Natural Accuracy

Paper title

Revisiting Ensembles in an Adversarial Context: Improving Natural Accuracy

Authors

Aditya Saligrama, Guillaume Leclerc

Abstract

A necessary characteristic for deploying deep learning models in real-world applications is resistance to small adversarial perturbations while maintaining accuracy on non-malicious inputs. While robust training yields models with better adversarial accuracy than standard models, a significant gap in natural accuracy remains between robust and non-robust models, and we aim to bridge it. We consider a number of ensemble methods designed to mitigate this performance difference. Our key insight is that models trained to withstand small attacks can, when ensembled, often withstand significantly larger attacks, and this property can in turn be leveraged to optimize natural accuracy. We consider two schemes: one combines predictions from several randomly initialized robust models, and the other fuses features from robust and standard models.
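As a rough illustration of the first scheme, the sketch below combines the outputs of several models by averaging their class probabilities. The abstract does not say how predictions are combined, so the averaging rule, the function names `softmax` and `ensemble_predict`, and the random toy logits are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last (class) axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def ensemble_predict(member_logits):
    """Average class probabilities across ensemble members (assumed rule).

    member_logits: array-like of shape (n_models, n_examples, n_classes).
    Returns the mean probabilities and the predicted label per example.
    """
    probs = softmax(np.asarray(member_logits, dtype=float))
    mean_probs = probs.mean(axis=0)
    return mean_probs, mean_probs.argmax(axis=-1)

# Toy logits standing in for three independently initialized models
# evaluated on four inputs with ten classes (hypothetical data).
rng = np.random.default_rng(0)
toy_logits = rng.normal(size=(3, 4, 10))
mean_probs, labels = ensemble_predict(toy_logits)
```

In practice each slice of `toy_logits` would come from a separately trained robust model evaluated on the same batch. Averaging probabilities rather than taking a majority vote keeps the combined output differentiable, which matters if the ensemble itself is to be attacked during evaluation.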
