Paper Title

Be Your Own Neighborhood: Detecting Adversarial Example by the Neighborhood Relations Built on Self-Supervised Learning

Authors

He, Zhiyuan; Yang, Yijun; Chen, Pin-Yu; Xu, Qiang; Ho, Tsung-Yi

Abstract

Deep Neural Networks (DNNs) have achieved excellent performance in various fields. However, DNNs' vulnerability to Adversarial Examples (AEs) hinders their deployment in safety-critical applications. This paper presents a novel AE detection framework, named BEYOND, for trustworthy predictions. BEYOND performs detection by distinguishing an AE's abnormal relation with its augmented versions, i.e., its neighbors, from two perspectives: representation similarity and label consistency. An off-the-shelf Self-Supervised Learning (SSL) model is used to extract the representation and predict the label, because its representations are more informative than those of supervised learning models. For clean samples, the representations and predictions are closely consistent with those of their neighbors, whereas those of AEs differ greatly. We explain this observation and show that, by leveraging this discrepancy, BEYOND can effectively detect AEs. We also develop a rigorous justification for the effectiveness of BEYOND. In addition, as a plug-and-play model, BEYOND can easily cooperate with an Adversarially Trained Classifier (ATC), achieving state-of-the-art (SOTA) robust accuracy. Experimental results show that BEYOND outperforms baselines by a large margin, especially under adaptive attacks. Empowered by the robust relation net built on SSL, BEYOND outperforms baselines in terms of both detection ability and speed. Our code will be publicly available.
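The two signals named in the abstract, representation similarity and label consistency between a sample and its augmented neighbors, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of cosine similarity, the mean aggregation, and both threshold values are assumptions made for illustration, and the representations and labels are assumed to have already been produced by some SSL model.

```python
import numpy as np


def cosine_similarity(vec, rows):
    """Cosine similarity between one vector and each row of a 2-D array."""
    vec = vec / np.linalg.norm(vec)
    rows = rows / np.linalg.norm(rows, axis=1, keepdims=True)
    return rows @ vec


def neighborhood_scores(rep, label, neighbor_reps, neighbor_labels):
    """Return (mean representation similarity, fraction of agreeing labels).

    rep / label: representation and predicted label of the input sample.
    neighbor_reps / neighbor_labels: the same quantities for its
    augmented versions (its "neighbors").
    """
    rep = np.asarray(rep, dtype=float)
    neighbor_reps = np.asarray(neighbor_reps, dtype=float)
    neighbor_labels = np.asarray(neighbor_labels)
    mean_similarity = float(np.mean(cosine_similarity(rep, neighbor_reps)))
    label_consistency = float(np.mean(neighbor_labels == label))
    return mean_similarity, label_consistency


def flag_as_adversarial(rep, label, neighbor_reps, neighbor_labels,
                        sim_threshold=0.8, label_threshold=0.8):
    """Flag the sample when either score falls below its threshold.

    Both thresholds are hypothetical placeholders, not values from the paper.
    """
    mean_similarity, label_consistency = neighborhood_scores(
        rep, label, neighbor_reps, neighbor_labels)
    return mean_similarity < sim_threshold or label_consistency < label_threshold


# Toy data: a sample whose neighbors agree with it, and one whose neighbors do not.
clean_flag = flag_as_adversarial(
    [1.0, 0.0], 3, [[1.0, 0.1], [0.9, 0.0], [1.0, -0.1]], [3, 3, 3])
suspect_flag = flag_as_adversarial(
    [1.0, 0.0], 3, [[0.0, 1.0], [0.1, 1.0], [-0.1, 1.0]], [1, 1, 3])
```

In the toy data, the first sample has neighbors with nearly parallel representations and identical labels, so it is not flagged; the second has near-orthogonal neighbor representations and mostly disagreeing labels, so it is flagged.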
