Task-aware Retrieval with Instructions

Paper title

Task-aware Retrieval with Instructions

Authors

Akari Asai, Timo Schick, Patrick Lewis, Xilun Chen, Gautier Izacard, Sebastian Riedel, Hannaneh Hajishirzi, Wen-tau Yih

Abstract

We study the problem of retrieval with instructions, where users of a retrieval system explicitly describe their intent along with their queries. We aim to develop a general-purpose task-aware retrieval system using multi-task instruction tuning, which can follow human-written instructions to find the best documents for a given query. We introduce the first large-scale collection of approximately 40 retrieval datasets with instructions, BERRI, and present TART, a multi-task retrieval system trained on BERRI with instructions. TART shows strong capabilities to adapt to a new retrieval task via instructions and advances the state of the art on two zero-shot retrieval benchmarks, BEIR and LOTTE, outperforming models up to three times larger. We further introduce a new evaluation setup, X^2-Retrieval, to better reflect real-world scenarios, where diverse domains and tasks are pooled and a system needs to find documents aligning with users' intents. In this setup, TART significantly outperforms competitive baselines, further demonstrating the effectiveness of guiding retrieval with instructions.
