Paper Title
GreedyFool: Distortion-Aware Sparse Adversarial Attack
Paper Authors
Paper Abstract
Modern deep neural networks (DNNs) are vulnerable to adversarial samples. Sparse adversarial samples are a special branch of adversarial samples that can fool the target model by perturbing only a few pixels. The existence of sparse adversarial attacks shows that DNNs are much more vulnerable than previously believed, and offers a new angle for analyzing DNNs. However, current sparse adversarial attack methods still have shortcomings in both sparsity and invisibility. In this paper, we propose a novel two-stage distortion-aware greedy-based method dubbed "GreedyFool". Specifically, it first selects the most effective candidate positions to modify by considering both the gradient (for adversary) and the distortion map (for invisibility), then drops some less important points in the reduce stage. Experiments demonstrate that, compared with the state-of-the-art method, we only need to modify $3\times$ fewer pixels under the same sparse perturbation setting. For targeted attacks, the success rate of our method is 9.96\% higher than the state-of-the-art method under the same pixel budget. Code can be found at https://github.com/LightDXY/GreedyFool.
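The two-stage idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the per-pixel "gradient" and "distortion map" are random toy arrays, and `toy_attack_succeeds` is a hypothetical stand-in for querying the real target model; the paper's actual scoring and update rules are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (assumptions, not the paper's actual model or maps):
# per-pixel gradient magnitude (adversarial effectiveness) and a
# distortion map where larger values mean changes are more visible.
H, W = 8, 8
grad = rng.random((H, W))
distortion = rng.random((H, W)) + 0.1

g = grad.ravel()

def toy_attack_succeeds(selected):
    """Hypothetical success test: require enough accumulated gradient
    mass. A real attack would instead query the target network."""
    return g[list(selected)].sum() > 3.0

# Stage 1 (greedy add): rank pixels by gradient / distortion, so that
# positions which are both effective and hard to notice come first,
# then add them one by one until the (toy) attack succeeds.
score = (grad / distortion).ravel()
order = np.argsort(-score)

selected = []
for idx in order:
    selected.append(int(idx))
    if toy_attack_succeeds(selected):
        break

# Stage 2 (reduce): try dropping each chosen point, least useful first;
# keep the drop whenever the attack still succeeds without it.
for idx in sorted(selected, key=lambda i: score[i]):
    trial = [i for i in selected if i != idx]
    if trial and toy_attack_succeeds(trial):
        selected = trial

print(f"perturbed pixels: {len(selected)}")
```

The reduce stage is what gives the sparsity gain the abstract reports: greedy addition alone can keep points that later additions made redundant, and the second pass removes them.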