Paper Title
Passage-Mask: A Learnable Regularization Strategy for Retriever-Reader Models
Paper Authors
Paper Abstract
Retriever-reader models achieve competitive performance across many NLP tasks, such as open question answering and dialogue conversation. In this work, we observe that these models easily overfit to the top-ranked retrieved passages, and that standard training fails to reason over the entire set of retrieved passages. We introduce a learnable passage-mask mechanism that desensitizes the model to the top-ranked retrieved passages and prevents it from overfitting. By controlling the gradient variance with fewer mask candidates and selecting the mask candidates with one-shot bi-level optimization, our learnable regularization strategy forces answer generation to attend to the entire set of retrieved passages. Experiments on open question answering, dialogue conversation, and fact verification show that our method consistently outperforms its baselines. Extensive experiments and ablation studies demonstrate that our method can be general, effective, and beneficial for many NLP tasks.
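To make the idea of masking a top-ranked passage concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the mask candidates are the first few passages in the ranked list, that a vector of logits (here called `mask_logits`, a hypothetical name) defines a distribution over those candidates, and that masking means replacing the chosen passage with a placeholder string. How the logits are learned (the one-shot bi-level optimization mentioned in the abstract) is not shown, since the abstract does not specify it.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def apply_passage_mask(passages, mask_logits, rng, mask_token="[MASKED PASSAGE]"):
    """Mask one of the top-ranked passages.

    Assumption (not from the paper): the candidates are the first
    len(mask_logits) passages, and exactly one of them is masked per call,
    sampled from softmax(mask_logits).
    """
    if len(mask_logits) > len(passages):
        raise ValueError("more mask candidates than retrieved passages")
    probs = softmax(mask_logits)
    # Sample which candidate position to mask.
    idx = rng.choices(range(len(mask_logits)), weights=probs, k=1)[0]
    masked = list(passages)  # copy so the caller's list is untouched
    masked[idx] = mask_token
    return masked, idx


# Hypothetical example: 5 retrieved passages, 3 mask candidates.
passages = [f"passage ranked {i + 1}" for i in range(5)]
mask_logits = [0.5, 0.2, -0.1]  # would be learned in practice; fixed here
masked, idx = apply_passage_mask(passages, mask_logits, random.Random(0))
print(idx, masked)
```

Keeping the candidate set small (three positions here) mirrors the abstract's point about using fewer mask candidates, although the exact candidate set used by the authors is not stated in this record.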