Paper Title

A Wrong Answer or a Wrong Question? An Intricate Relationship between Question Reformulation and Answer Selection in Conversational Question Answering

Paper Authors

Svitlana Vakulenko, Shayne Longpre, Zhucheng Tu, Raviteja Anantha

Paper Abstract

The dependency between an adequate question formulation and correct answer selection is a very intriguing but still underexplored area. In this paper, we show that question rewriting (QR) of the conversational context allows us to shed more light on this phenomenon and also use it to evaluate the robustness of different answer selection approaches. We introduce a simple framework that enables an automated analysis of conversational question answering (QA) performance using question rewrites, and present the results of this analysis on the TREC CAsT and QuAC (CANARD) datasets. Our experiments uncover sensitivity to question formulation of the popular state-of-the-art models for reading comprehension and passage ranking. Our results demonstrate that the reading comprehension model is insensitive to question formulation, while passage ranking changes dramatically with little variation in the input question. The benefit of QR is that it allows us to pinpoint and group such cases automatically. We show how to use this methodology to verify whether QA models are really learning the task or just finding shortcuts in the dataset, and to better understand the frequent types of errors they make.
