Paper Title
Clean-Label Backdoor Attacks on Video Recognition Models
Paper Authors
Paper Abstract
Deep neural networks (DNNs) are vulnerable to backdoor attacks, which can hide backdoor triggers in DNNs by poisoning training data. A backdoored model behaves normally on clean test images, yet consistently predicts a particular target class for any test example that contains the trigger pattern. As such, backdoor attacks are hard to detect and have raised severe security concerns in real-world applications. Thus far, backdoor research has mostly been conducted in the image domain with image classification models. In this paper, we show that existing image backdoor attacks are far less effective on videos, and outline 4 strict conditions under which existing attacks are likely to fail: 1) scenarios with more input dimensions (e.g., videos), 2) scenarios with high resolution, 3) scenarios with a large number of classes and few examples per class (a "sparse dataset"), and 4) attacks restricted to correct labels (e.g., clean-label attacks). We propose the use of a universal adversarial trigger as the backdoor trigger to attack video recognition models, a setting in which backdoor attacks are likely to be challenged by all 4 strict conditions above. We show on benchmark video datasets that our proposed backdoor attack can manipulate state-of-the-art video models with high success rates by poisoning only a small proportion of training data (without changing the labels). We also show that our proposed backdoor attack is resistant to state-of-the-art backdoor defense/detection methods, and can even be applied to improve image backdoor attacks. Our proposed video backdoor attack not only serves as a strong baseline for improving the robustness of video models, but also provides a new perspective for understanding more powerful backdoor attacks.
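To make the pipeline described in the abstract concrete, below is a minimal PyTorch sketch of the two stages it mentions: optimizing a universal adversarial trigger patch against a (pretrained) video recognition model, and stamping that patch onto a small fraction of target-class training clips while leaving their labels untouched. This is not the authors' released code; the tensor layout (N, C, T, H, W), patch placement, patch size, poisoning ratio, and optimizer settings are illustrative assumptions.

# Minimal sketch (assumed setup, not the paper's exact procedure): clean-label
# video poisoning with a universal adversarial trigger.
import torch
import torch.nn.functional as F

def stamp_trigger(clips, patch):
    """Overwrite the bottom-right corner of every frame with the trigger patch.

    clips: (N, C, T, H, W) in [0, 1]; patch: (C, ph, pw) in [0, 1].
    """
    poisoned = clips.clone()
    ph, pw = patch.shape[-2:]
    # (C, ph, pw) -> (C, 1, ph, pw) so the patch broadcasts over batch and time
    poisoned[..., -ph:, -pw:] = patch.unsqueeze(1)
    return poisoned

def optimize_universal_trigger(model, clips, target_class, patch_size=20, steps=200, lr=0.01):
    """Optimize one patch that pushes arbitrary (non-target-class) clips toward target_class."""
    n, c = clips.shape[0], clips.shape[1]
    patch = torch.rand(c, patch_size, patch_size, device=clips.device, requires_grad=True)
    optimizer = torch.optim.Adam([patch], lr=lr)
    target = torch.full((n,), target_class, dtype=torch.long, device=clips.device)
    model.eval()
    for _ in range(steps):
        # Universal: a single patch is optimized against every clip in the batch.
        loss = F.cross_entropy(model(stamp_trigger(clips, patch)), target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        with torch.no_grad():
            patch.clamp_(0.0, 1.0)  # keep the patch a valid image region
    return patch.detach()

def poison_target_class(train_clips, train_labels, patch, target_class, ratio=0.3):
    """Clean-label poisoning: stamp the trigger on a fraction of target-class
    clips only; every label is left unchanged."""
    idx = (train_labels == target_class).nonzero(as_tuple=True)[0]
    chosen = idx[torch.randperm(len(idx))[: int(ratio * len(idx))]]
    train_clips[chosen] = stamp_trigger(train_clips[chosen], patch)
    return train_clips, train_labels

At test time, the same stamp_trigger call would be applied to an arbitrary clip to elicit the target prediction from the backdoored model; the sketch is only meant to illustrate the universal-trigger, clean-label poisoning idea and omits any additional steps the full method may use.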