Paper Title

Retrieval as Attention: End-to-end Learning of Retrieval and Reading within a Single Transformer

Paper Authors

Zhengbao Jiang, Luyu Gao, Jun Araki, Haibo Ding, Zhiruo Wang, Jamie Callan, Graham Neubig

Paper Abstract

Systems for knowledge-intensive tasks such as open-domain question answering (QA) usually consist of two stages: efficient retrieval of relevant documents from a large corpus and detailed reading of the selected documents to generate answers. Retrievers and readers are usually modeled separately, which necessitates a cumbersome implementation and is hard to train and adapt in an end-to-end fashion. In this paper, we revisit this design and eschew the separate architecture and training in favor of a single Transformer that performs Retrieval as Attention (ReAtt), and end-to-end training solely based on supervision from the end QA task. We demonstrate for the first time that a single model trained end-to-end can achieve both competitive retrieval and QA performance, matching or slightly outperforming state-of-the-art separately trained retrievers and readers. Moreover, end-to-end adaptation significantly boosts its performance on out-of-domain datasets in both supervised and unsupervised settings, making our model a simple and adaptable solution for knowledge-intensive tasks. Code and models are available at https://github.com/jzbjyb/ReAtt.
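The central idea of the abstract, deriving document relevance directly from attention inside a single Transformer rather than from a separate retriever, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact formulation: the choice of layer and heads, the scaled dot-product scoring, and the max-then-mean aggregation here are all assumptions for clarity; the actual ReAtt implementation is in the released code at https://github.com/jzbjyb/ReAtt.

```python
import torch
import torch.nn.functional as F

def retrieval_as_attention_score(q_hidden: torch.Tensor,
                                 doc_hidden: torch.Tensor) -> torch.Tensor:
    """Aggregate question->document cross-attention into one relevance score.

    q_hidden:   (num_q_tokens, d) question token representations
    doc_hidden: (num_d_tokens, d) document token representations

    Illustrative only: ReAtt's actual layer choice, head handling, and
    aggregation may differ from this sketch.
    """
    d = q_hidden.size(-1)
    # Scaled dot-product attention logits between every question token
    # and every document token: shape (num_q_tokens, num_d_tokens)
    logits = q_hidden @ doc_hidden.T / d ** 0.5
    # Attention distribution over document tokens for each question token
    attn = F.softmax(logits, dim=-1)
    # One possible aggregation (an assumption): max over document tokens,
    # then mean over question tokens -> a scalar score for ranking.
    return attn.max(dim=-1).values.mean()

# Toy usage: rank two candidate documents for one question.
torch.manual_seed(0)
q = torch.randn(8, 64)                       # 8 question tokens, hidden size 64
docs = [torch.randn(50, 64) for _ in range(2)]
scores = [retrieval_as_attention_score(q, doc).item() for doc in docs]
print(scores)  # higher score = document retrieved first
```

Because the relevance score is computed from attention weights inside the model, gradients from the end QA loss flow into the retrieval computation, which is what permits the end-to-end training described in the abstract.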
