Paper Title

Undersensitivity in Neural Reading Comprehension

Paper Authors

Johannes Welbl, Pasquale Minervini, Max Bartolo, Pontus Stenetorp, Sebastian Riedel

Paper Abstract

Current reading comprehension models generalise well to in-distribution test sets, yet perform poorly on adversarially selected inputs. Most prior work on adversarial inputs studies oversensitivity: semantically invariant text perturbations that cause a model's prediction to change when it should not. In this work we focus on the complementary problem: excessive prediction undersensitivity, where input text is meaningfully changed but the model's prediction does not, even though it should. We formulate a noisy adversarial attack which searches among semantic variations of the question for which a model erroneously predicts the same answer, and with even higher probability. Despite comprising unanswerable questions, both SQuAD2.0 and NewsQA models are vulnerable to this attack. This indicates that although accurate, models tend to rely on spurious patterns and do not fully consider the information specified in a question. We experiment with data augmentation and adversarial training as defences, and find that both substantially decrease vulnerability to attacks on held out data, as well as held out attack spaces. Addressing undersensitivity also improves results on AddSent and AddOneSent, and models furthermore generalise better when facing train/evaluation distribution mismatch: they are less prone to overly rely on predictive cues present only in the training set, and outperform a conventional model by as much as 10.9% F1.
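To make the attack criterion in the abstract concrete, the sketch below checks, for a set of already-perturbed questions, whether a reading comprehension model keeps its original answer with even higher confidence. This is a minimal illustration under stated assumptions, not the paper's method: the Hugging Face `transformers` question-answering pipeline, the `deepset/roberta-base-squad2` checkpoint, and the caller-supplied perturbations are placeholders, whereas the paper searches over automatically generated entity and noun substitutions against its own SQuAD2.0 and NewsQA models.

```python
# Minimal sketch of the undersensitivity criterion (not the paper's full
# noisy search over perturbations): a perturbed question counts as a
# successful attack if the model returns the same answer as for the
# original question, with an even higher probability.
from transformers import pipeline

# Assumed checkpoint for illustration only.
qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

def undersensitivity_attacks(context, question, perturbed_questions):
    """Return perturbed questions (with scores) that leave the model's
    prediction unchanged while raising its confidence."""
    original = qa(question=question, context=context)
    successes = []
    for q_prime in perturbed_questions:
        pred = qa(question=q_prime, context=context)
        # Attack succeeds: meaningfully changed question, same answer,
        # higher probability than for the original question.
        if pred["answer"] == original["answer"] and pred["score"] > original["score"]:
            successes.append((q_prime, pred["score"]))
    return successes

# Example usage with a hand-written perturbation that changes the question's
# meaning, so it should be unanswerable from the context.
context = "The Amazon rainforest covers much of the Amazon basin of South America."
print(undersensitivity_attacks(
    context,
    "Which rainforest covers much of the Amazon basin?",
    ["Which desert covers much of the Amazon basin?"],
))
```

Any perturbed question returned by this check is an undersensitivity example in the sense of the abstract: the input changed meaningfully, yet the model's prediction did not, and its confidence even increased.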
