论文标题
通过概率差异指导梁搜索的上下文感知的文本对抗攻击方法
A Context-Aware Approach for Textual Adversarial Attack through Probability Difference Guided Beam Search
论文作者
论文摘要
文本对手攻击暴露了文本分类器的漏洞,可用于改善其稳健性。现有的上下文感知方法仅考虑金标签概率,并在搜索攻击路径时使用贪婪的搜索,通常会限制攻击效率。为了解决这些问题,我们提出了PDB,这是一种使用概率差的引导光束搜索的上下文感知的文本对抗攻击模型。概率差异是所有类标签概率的总体考虑,PDB使用它来指导攻击路径的选择。此外,PDB使用Beam搜索找到成功的攻击路径,从而避免搜索空间有限。广泛的实验和人类评估表明,PDB在一系列评估指标中的表现优于以前的最佳模型,尤其是提高 +19.5%的攻击成功率。消融研究和定性分析进一步证实了PDB的效率。
Textual adversarial attacks expose the vulnerabilities of text classifiers and can be used to improve their robustness. Existing context-aware methods solely consider the gold label probability and use the greedy search when searching an attack path, often limiting the attack efficiency. To tackle these issues, we propose PDBS, a context-aware textual adversarial attack model using Probability Difference guided Beam Search. The probability difference is an overall consideration of all class label probabilities, and PDBS uses it to guide the selection of attack paths. In addition, PDBS uses the beam search to find a successful attack path, thus avoiding suffering from limited search space. Extensive experiments and human evaluation demonstrate that PDBS outperforms previous best models in a series of evaluation metrics, especially bringing up to a +19.5% attack success rate. Ablation studies and qualitative analyses further confirm the efficiency of PDBS.