Paper Title


Improving the Transferability of Adversarial Attacks on Face Recognition with Beneficial Perturbation Feature Augmentation

Paper Authors

Zhou, Fengfan; Ling, Hefei; Shi, Yuxuan; Chen, Jiazhong; Li, Zongyi; Li, Ping

Abstract

Face recognition (FR) models can be easily fooled by adversarial examples, which are crafted by adding imperceptible perturbations to benign face images. The existence of adversarial face examples poses a great threat to the security of society. In order to build a more sustainable digital nation, in this paper we improve the transferability of adversarial face examples to expose more blind spots of existing FR models. Though generating hard samples has shown its effectiveness in improving the generalization of models in training tasks, the effectiveness of utilizing this idea to improve the transferability of adversarial face examples remains unexplored. To this end, based on the properties of hard samples and the symmetry between training tasks and adversarial attack tasks, we propose the concept of hard models, which have effects on adversarial attack tasks similar to those of hard samples on training tasks. Utilizing the concept of hard models, we propose a novel attack method called Beneficial Perturbation Feature Augmentation Attack (BPFA), which reduces the overfitting of adversarial examples to the surrogate FR models by constantly generating new hard models to craft the adversarial examples. Specifically, in the backpropagation, BPFA records the gradients on pre-selected feature maps and uses the gradient on the input image to craft the adversarial example. In the next forward propagation, BPFA leverages the recorded gradients to add beneficial perturbations to their corresponding feature maps, thereby increasing the loss. Extensive experiments demonstrate that BPFA can significantly boost the transferability of adversarial attacks on FR.
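The abstract's alternating backward/forward procedure can be sketched on a toy model. The sketch below is only an illustration of the loop structure the abstract describes, not the paper's implementation: the two-layer linear "FR model", its dimensions, the dodging-style distance loss, and the step sizes (`eps`, `alpha`, `rho`) are all assumptions. Each iteration backpropagates to record the gradient on a pre-selected feature map and the gradient on the input, updates the adversarial example with the input gradient, and in the next forward pass adds a loss-increasing ("beneficial") perturbation to the feature map:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-layer linear "FR model": feature map h = W1 @ x, embedding e = W2 @ h.
# Dimensions and weights are illustrative, not from the paper.
d_in, d_feat, d_emb = 16, 8, 4
W1 = rng.normal(size=(d_feat, d_in))
W2 = rng.normal(size=(d_emb, d_feat))

def forward(x, feat_pert=None):
    """Forward pass; optionally add a beneficial perturbation to the feature map."""
    h = W1 @ x
    if feat_pert is not None:
        h = h + feat_pert
    return h, W2 @ h

x_benign = rng.normal(size=d_in)
_, e_benign = forward(x_benign)

def loss(e):
    # Dodging-style loss (an assumption): squared distance from the benign embedding.
    return float(np.sum((e - e_benign) ** 2))

eps, alpha, rho, steps = 0.1, 0.02, 0.05, 10
# Random start inside the eps-ball so the first gradient is non-zero (PGD-style).
x_adv = x_benign + rng.uniform(-eps, eps, size=d_in)
feat_pert = None  # beneficial perturbation recorded from the previous backward pass

for _ in range(steps):
    # Forward propagation with the beneficial perturbation on the feature map.
    h, e = forward(x_adv, feat_pert)
    # Backpropagation (analytic for this linear toy model).
    g_e = 2.0 * (e - e_benign)
    g_h = W2.T @ g_e   # recorded gradient on the pre-selected feature map
    g_x = W1.T @ g_h   # gradient on the input image
    # Craft the adversarial example with the input gradient (FGSM-style step + clip).
    x_adv = x_adv + alpha * np.sign(g_x)
    x_adv = x_benign + np.clip(x_adv - x_benign, -eps, eps)
    # Record a loss-increasing perturbation for the next forward propagation.
    feat_pert = rho * np.sign(g_h)

# Evaluate the final adversarial example on the clean (unperturbed) model.
_, e_adv = forward(x_adv)
print(loss(e_adv) > loss(e_benign))
```

Perturbing the feature maps in the direction of their recorded gradients makes the surrogate harder to attack at each step, which is what the abstract means by "constantly generating new hard models".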
