Paper Title
Fine-Grained Distillation for Long Document Retrieval
Paper Authors
Paper Abstract
Long document retrieval aims to fetch query-relevant documents from a large-scale collection, where knowledge distillation has become the de facto approach to improving a retriever by having it mimic a heterogeneous yet powerful cross-encoder. However, in contrast to passages or sentences, retrieval over long documents suffers from the scope hypothesis: a long document may cover multiple topics. This magnifies their structural heterogeneity and poses a granularity-mismatch issue, leading to inferior distillation efficacy. In this work, we propose a new learning framework, fine-grained distillation (FGD), for long-document retrievers. While preserving the conventional dense retrieval paradigm, it first produces globally consistent representations across different levels of granularity and then applies multi-granular aligned distillation only during training. In experiments, we evaluate our framework on two long-document retrieval benchmarks, on which it achieves state-of-the-art performance.
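The abstract's core training signal — distilling a cross-encoder teacher's relevance scores into a dense retriever at several granularities — can be sketched as a sum of per-granularity KL-divergence terms between the teacher's and student's score distributions over the same candidate list. This is a minimal illustration of that kind of distillation objective, not the paper's actual implementation; the function names, the dict-of-granularities interface, and the granularity labels are all assumptions.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of relevance scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def kl_div(p, q):
    # KL(p || q) between two discrete distributions of equal length.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def multi_granular_distill_loss(teacher_scores, student_scores):
    """Sum KL(teacher || student) over granularity levels.

    Both arguments map a granularity name (e.g. 'document',
    'passage', 'sentence') to a list of relevance scores for the
    same candidates. Hypothetical interface for illustration only.
    """
    loss = 0.0
    for g in teacher_scores:
        p = softmax(teacher_scores[g])   # cross-encoder (teacher)
        q = softmax(student_scores[g])   # dense retriever (student)
        loss += kl_div(p, q)
    return loss
```

When the student's score distribution matches the teacher's at every granularity, the loss is zero; any mismatch at any granularity contributes a positive term, which is what pushes the retriever toward the cross-encoder's ranking behavior across levels.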