Paper Title

VPN: Verification of Poisoning in Neural Networks

Paper Authors

Sun, Youcheng, Usman, Muhammad, Gopinath, Divya, Păsăreanu, Corina S.

Paper Abstract

Neural networks are successfully used in a variety of applications, many of them having safety and security concerns. As a result, researchers have proposed formal verification techniques for verifying neural network properties. While previous efforts have mainly focused on checking local robustness in neural networks, we instead study another neural network security issue, namely data poisoning. In this case, an attacker inserts a trigger into a subset of the training data, in such a way that at test time, this trigger in an input causes the trained model to misclassify to some target class. We show how to formulate the check for data poisoning as a property that can be checked with off-the-shelf verification tools, such as Marabou and nnenum, where counterexamples of failed checks constitute the triggers. We further show that the discovered triggers are "transferable" from a small model to a larger, better-trained model, allowing us to analyze state-of-the-art performant models trained for image classification tasks.
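The poisoning check described in the abstract can be illustrated with a toy sketch: does there exist a small patch that, stamped onto every clean input, forces the model to a target class? The sketch below is hypothetical, not the paper's encoding; the paper formulates this as a property for verifiers such as Marabou and nnenum, which search the patch space symbolically, whereas this stand-in brute-forces a tiny discrete search space over a toy linear classifier.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)

# Toy stand-in for a trained classifier: a linear model over flattened
# 4x4 images with 2 classes. (Hypothetical; the paper analyzes real
# image classifiers with SMT/reachability-based verification tools.)
W = rng.normal(size=(2, 16))

def classify(img):
    return int(np.argmax(W @ img.reshape(-1)))

def find_trigger(inputs, target, pos=(0, 0), size=2, levels=(0.0, 1.0)):
    """Search for a universal trigger patch at a fixed position.

    The property being checked: NO patch of the given size/position
    forces every input to the target class. A counterexample to that
    property is exactly a poisoning trigger.
    """
    r, c = pos
    # Brute-force enumeration over patch pixel values; a verifier
    # would explore this space symbolically instead.
    for vals in product(levels, repeat=size * size):
        patch = np.array(vals).reshape(size, size)
        labels = []
        for x in inputs:
            y = x.copy()
            y[r:r+size, c:c+size] = patch  # stamp the candidate trigger
            labels.append(classify(y))
        if all(label == target for label in labels):
            return patch  # counterexample found: a universal trigger
    return None  # property holds for this patch position/size

inputs = [rng.uniform(size=(4, 4)) for _ in range(5)]
trigger = find_trigger(inputs, target=1)
print("trigger found:" if trigger is not None else "no trigger", trigger)
```

The "transferability" result in the abstract corresponds to taking a `trigger` found on a small model and stamping it onto inputs of a larger model trained on the same (poisoned) data.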
