Paper Title
Diverse Generative Perturbations on Attention Space for Transferable Adversarial Attacks
Paper Authors
Paper Abstract
Adversarial attacks with improved transferability - the ability of an adversarial example crafted on a known model to also fool unknown models - have recently received much attention due to their practicality. Nevertheless, existing transferable attacks craft perturbations in a deterministic manner and often fail to fully explore the loss surface, thus falling into a poor local optimum and suffering from low transferability. To solve this problem, we propose Attentive-Diversity Attack (ADA), which disrupts diverse salient features in a stochastic manner to improve transferability. First, we perturb the image attention to disrupt universal features shared by different models. Then, to effectively avoid poor local optima, we disrupt these features in a stochastic manner and explore the search space of transferable perturbations more exhaustively. More specifically, we use a generator to produce adversarial perturbations, each of which disturbs features in a different way depending on an input latent code. Extensive experimental evaluations demonstrate the effectiveness of our method, which outperforms state-of-the-art methods in transferability. Code is available at https://github.com/wkim97/ADA.
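To make the core idea concrete, below is a minimal PyTorch sketch of the kind of latent-conditioned perturbation generator the abstract describes: a generator maps an image and a sampled latent code to a bounded perturbation, and is trained to push a surrogate model's intermediate ("attention-like") features away from those of the clean image, so that fresh latent codes yield diverse perturbations. The architecture, the `eps` budget, and the feature-distance loss here are illustrative assumptions, not the authors' implementation; the actual code is in the linked repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerturbationGenerator(nn.Module):
    """Latent-conditioned generator G(x, z): different latent codes z
    produce different perturbations for the same input image (assumed
    architecture, for illustration only)."""

    def __init__(self, latent_dim=16, eps=16 / 255):
        super().__init__()
        self.eps = eps  # L_inf perturbation budget (assumed value)
        self.net = nn.Sequential(
            nn.Conv2d(3 + latent_dim, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1),
        )

    def forward(self, x, z):
        # Tile the latent code over spatial dimensions, then predict a
        # bounded perturbation and add it to the clean image.
        z_map = z[:, :, None, None].expand(-1, -1, x.shape[2], x.shape[3])
        delta = self.eps * torch.tanh(self.net(torch.cat([x, z_map], 1)))
        return (x + delta).clamp(0, 1)

def feature_disruption_loss(surrogate_features, x, x_adv):
    """Train G to push intermediate features of the adversarial image
    away from those of the clean image (stand-in for the paper's
    attention-space objective)."""
    with torch.no_grad():
        f_clean = surrogate_features(x)
    f_adv = surrogate_features(x_adv)
    return -F.mse_loss(f_adv, f_clean)  # negate to maximize distance

# Usage sketch: one training step of the generator, assuming `x` is a
# batch of images and `surrogate_features` extracts mid-level features.
# G = PerturbationGenerator()
# opt = torch.optim.Adam(G.parameters(), lr=1e-4)
# z = torch.randn(x.size(0), 16)   # fresh code => different perturbation
# loss = feature_disruption_loss(surrogate_features, x, G(x, z))
# opt.zero_grad(); loss.backward(); opt.step()
```

Sampling a new `z` at attack time gives a different perturbation for the same image, which is what lets the method explore the search space of transferable perturbations stochastically rather than deterministically.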