Title
Multi-Vector Retrieval as Sparse Alignment
Authors
Abstract
Multi-vector retrieval models improve over single-vector dual encoders on many information retrieval tasks. In this paper, we cast the multi-vector retrieval problem as sparse alignment between query and document tokens. We propose AligneR, a novel multi-vector retrieval model that learns sparsified pairwise alignments between query and document tokens (e.g. `dog' vs. `puppy') and per-token unary saliences reflecting their relative importance for retrieval. We show that controlling the sparsity of pairwise token alignments often brings significant performance gains. While most factoid questions focusing on a specific part of a document require a smaller number of alignments, others requiring a broader understanding of a document favor a larger number of alignments. Unary saliences, on the other hand, decide whether a token ever needs to be aligned with others for retrieval (e.g. `kind' from `kind of currency is used in new zealand'). With sparsified unary saliences, we are able to prune a large number of query and document token vectors and improve the efficiency of multi-vector retrieval. We learn the sparse unary saliences with entropy-regularized linear programming, which outperforms other methods of achieving sparsity. In a zero-shot setting, AligneR scores 51.1 points nDCG@10, achieving a new retriever-only state of the art on 13 tasks in the BEIR benchmark. In addition, adapting pairwise alignments with a few examples (≤ 8) further improves performance by up to 15.7 points nDCG@10 on argument retrieval tasks. The unary saliences of AligneR help us keep only 20% of the document token representations with minimal performance loss. We further show that our model often produces interpretable alignments and significantly improves its performance when initialized from larger language models.
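The sparsified pairwise alignment described above can be illustrated with a minimal sketch: score a (query, document) pair from token embeddings by letting each query token align with at most k document tokens. This is not the paper's implementation; the function name, the dot-product similarity, and the simple top-k alignment rule are illustrative assumptions (k=1 corresponds to a ColBERT-style max-sim, while larger k gives the broader alignments the abstract discusses).

```python
import numpy as np

def sparse_alignment_score(Q, D, k=1):
    """Illustrative sparsified-alignment score (not the paper's exact model).

    Q: (num_query_tokens, dim) query token embeddings
    D: (num_doc_tokens, dim) document token embeddings
    k: max number of document tokens each query token may align to;
       k=1 recovers max-sim-style scoring, larger k broadens alignments.
    """
    S = Q @ D.T                        # pairwise token similarity matrix
    # Sparse alignment: keep only the k largest similarities per query token.
    topk = np.sort(S, axis=1)[:, -k:]
    # Aggregate the retained (aligned) similarities into one relevance score.
    return float(topk.sum())

# Toy usage: two query tokens, three document tokens in a 2-d space.
Q = np.array([[1.0, 0.0], [0.0, 1.0]])
D = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
print(sparse_alignment_score(Q, D, k=1))  # each query token aligns once
print(sparse_alignment_score(Q, D, k=2))  # broader alignment, higher score
```

Varying k per task is the knob the abstract refers to: factoid-style queries tend to do well with small k, while queries needing document-wide evidence benefit from larger k.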