Paper Title

Relevance-guided Supervision for OpenQA with ColBERT

Authors

Omar Khattab, Christopher Potts, Matei Zaharia

Abstract

Systems for Open-Domain Question Answering (OpenQA) generally depend on a retriever for finding candidate passages in a large corpus and a reader for extracting answers from those passages. In much recent work, the retriever is a learned component that uses coarse-grained vector representations of questions and passages. We argue that this modeling choice is insufficiently expressive for dealing with the complexity of natural language questions. To address this, we define ColBERT-QA, which adapts the scalable neural retrieval model ColBERT to OpenQA. ColBERT creates fine-grained interactions between questions and passages. We propose an efficient weak supervision strategy that iteratively uses ColBERT to create its own training data. This greatly improves OpenQA retrieval on Natural Questions, SQuAD, and TriviaQA, and the resulting system attains state-of-the-art extractive OpenQA performance on all three datasets.
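To make the contrast between coarse-grained and fine-grained matching concrete, here is a minimal sketch of the two scoring styles. It assumes toy 2-dimensional token vectors in place of real BERT outputs, and it uses mean pooling as a stand-in for a single-vector retriever; the function names (`single_vector_score`, `late_interaction_score`) are hypothetical and are not taken from the ColBERT codebase. The late-interaction score follows ColBERT's published MaxSim formulation: for each question token, take the maximum similarity over passage tokens, then sum.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def single_vector_score(q_tokens, p_tokens):
    """Coarse-grained stand-in (assumption): mean-pool each side into one
    unit vector, then compare the two vectors with a dot product."""
    def pool(tokens):
        dim = len(tokens[0])
        mean = [sum(t[i] for t in tokens) / len(tokens) for i in range(dim)]
        return normalize(mean)
    return dot(pool(q_tokens), pool(p_tokens))


def late_interaction_score(q_tokens, p_tokens):
    """Fine-grained MaxSim: each question token is matched to its most
    similar passage token, and the per-token maxima are summed."""
    q = [normalize(t) for t in q_tokens]
    p = [normalize(t) for t in p_tokens]
    return sum(max(dot(qt, pt) for pt in p) for qt in q)


# Toy data (illustrative only): the question has two distinct token vectors.
q = [[1.0, 0.0], [0.0, 1.0]]
passage_a = [[1.0, 0.0], [0.0, 1.0]]  # matches each question token exactly
passage_b = [[1.0, 1.0]]              # one blended token between the two

for name, passage in [("passage_a", passage_a), ("passage_b", passage_b)]:
    print(name,
          round(single_vector_score(q, passage), 4),
          round(late_interaction_score(q, passage), 4))
```

In this toy setup the pooled single-vector score cannot separate the two passages, because both pool to the same direction, while the token-level score ranks the passage with exact per-token matches higher. This only illustrates the modeling difference the abstract argues for; it is not the paper's training or indexing pipeline.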
