Paper title
Execute Order 66: Targeted Data Poisoning for Reinforcement Learning
Paper authors
Paper abstract
Data poisoning for reinforcement learning has historically focused on general performance degradation, and targeted attacks have been successful via perturbations that involve control of the victim's policy and rewards. We introduce an insidious poisoning attack for reinforcement learning which causes agent misbehavior only at specific target states - all while minimally modifying a small fraction of training observations without assuming any control over policy or reward. We accomplish this by adapting a recent technique, gradient alignment, to reinforcement learning. We test our method and demonstrate success in two Atari games of varying difficulty.
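The abstract names gradient alignment but does not state the objective. As a rough illustration only, the sketch below computes the cosine-based alignment loss commonly used in gradient-alignment poisoning for supervised learning: one minus the cosine similarity between the gradient of the attacker's target objective and the gradient of the training loss on the poisoned data. The function name `alignment_loss`, its arguments, and the `eps` stabilizer are hypothetical, and the paper's reinforcement learning adaptation may differ in its details.

```python
import numpy as np


def alignment_loss(adv_grad, poison_grad, eps=1e-12):
    """Return 1 - cosine similarity between two gradient vectors.

    adv_grad:    gradient of the attacker's objective at the target state
                 (hypothetical input, any array-like shape).
    poison_grad: gradient of the training loss on the poisoned observations
                 (hypothetical input, same number of elements).
    eps:         small constant to avoid division by zero (an assumption).
    """
    a = np.ravel(adv_grad).astype(float)
    p = np.ravel(poison_grad).astype(float)
    cosine = np.dot(a, p) / (np.linalg.norm(a) * np.linalg.norm(p) + eps)
    # 0 when the gradients point the same way, 2 when they are opposite.
    return 1.0 - cosine
```

Under this formulation, an attacker would search for small observation perturbations that drive the loss toward zero, so that ordinary training steps on the poisoned data also move the policy toward the attacker's goal at the target state.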