Paper Title

Task-aware Retrieval with Instructions

Authors

Akari Asai, Timo Schick, Patrick Lewis, Xilun Chen, Gautier Izacard, Sebastian Riedel, Hannaneh Hajishirzi, Wen-tau Yih

Abstract

We study the problem of retrieval with instructions, where users of a retrieval system explicitly describe their intent along with their queries. We aim to develop a general-purpose task-aware retrieval system using multi-task instruction tuning, which can follow human-written instructions to find the best documents for a given query. We introduce the first large-scale collection of approximately 40 retrieval datasets with instructions, BERRI, and present TART, a multi-task retrieval system trained on BERRI with instructions. TART shows strong capabilities to adapt to a new retrieval task via instructions and advances the state of the art on two zero-shot retrieval benchmarks, BEIR and LOTTE, outperforming models up to three times larger. We further introduce a new evaluation setup, X^2-Retrieval, to better reflect real-world scenarios, where diverse domains and tasks are pooled and a system needs to find documents that align with users' intents. In this setup, TART significantly outperforms competitive baselines, further demonstrating the effectiveness of guiding retrieval with instructions.
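To make the idea of retrieval with instructions concrete, the following is a minimal sketch, not the TART model itself. It assumes the instruction is simply concatenated with the query before encoding, and it swaps the neural encoder for a toy bag-of-words scorer. The names `embed`, `cosine`, and `retrieve`, and the two-document corpus, are hypothetical and exist only for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a neural encoder: lowercase whitespace-token counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(instruction, query, corpus):
    """Return the document that best matches the instruction plus the query."""
    # Assumption: the instruction is prepended to the query as plain text.
    query_vec = embed(instruction + " " + query)
    return max(corpus, key=lambda doc: cosine(query_vec, embed(doc)))


# Hypothetical pooled corpus mixing two kinds of documents.
corpus = [
    "python code def sort_list(items): return sorted(items)",
    "wikipedia paragraph sorting is the process of arranging items in order",
]

query = "sort items"
code_hit = retrieve("retrieve python code", query, corpus)
wiki_hit = retrieve("retrieve a wikipedia paragraph", query, corpus)
print(code_hit)
print(wiki_hit)
```

The same query returns a different document depending on the instruction, which is the behavior the pooled X^2-Retrieval setup is meant to test. A real system would replace the lexical scorer with a trained encoder.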
