Paper Title

AmbigQA: Answering Ambiguous Open-domain Questions

Paper Authors

Sewon Min, Julian Michael, Hannaneh Hajishirzi, Luke Zettlemoyer

Paper Abstract

Ambiguity is inherent to open-domain question answering; especially when exploring new topics, it can be difficult to ask questions that have a single, unambiguous answer. In this paper, we introduce AmbigQA, a new open-domain question answering task which involves finding every plausible answer, and then rewriting the question for each one to resolve the ambiguity. To study this task, we construct AmbigNQ, a dataset covering 14,042 questions from NQ-open, an existing open-domain QA benchmark. We find that over half of the questions in NQ-open are ambiguous, with diverse sources of ambiguity such as event and entity references. We also present strong baseline models for AmbigQA which we show benefit from weakly supervised learning that incorporates NQ-open, strongly suggesting our new task and data will support significant future research effort. Our data and baselines are available at https://nlp.cs.washington.edu/ambigqa.
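
To make the task format concrete, the sketch below shows how a single ambiguous prompt question might map to multiple disambiguated question-answer pairs. The field names and the specific example are illustrative assumptions for exposition, not the official AmbigNQ schema; consult the released data at the URL above for the actual format.

```python
# A minimal sketch of what an AmbigQA-style instance could look like.
# Field names and the example content are illustrative, not the official AmbigNQ schema.
ambiguous_example = {
    # The original prompt question is ambiguous: it does not specify which adaptation.
    "prompt_question": "When did the movie It come out?",
    # Each plausible answer is paired with a rewritten question that resolves the ambiguity.
    "disambiguated_qa_pairs": [
        {"question": "When did the 1990 TV miniseries It first air?",
         "answer": "November 18, 1990"},
        {"question": "When was the 2017 film It released in theaters?",
         "answer": "September 8, 2017"},
    ],
}

def is_ambiguous(example: dict) -> bool:
    """Treat an example as ambiguous when it has more than one distinct
    plausible answer, each attached to a rewritten, unambiguous question."""
    return len(example["disambiguated_qa_pairs"]) > 1

assert is_ambiguous(ambiguous_example)
```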
