Paper Title
Few-Shot Generative Conversational Query Rewriting
Paper Authors
Paper Abstract
Conversational query rewriting aims to reformulate a concise conversational query to a fully specified, context-independent query that can be effectively handled by existing information retrieval systems. This paper presents a few-shot generative approach to conversational query rewriting. We develop two methods, based on rules and self-supervised learning, to generate weak supervision data using large amounts of ad hoc search sessions, and to fine-tune GPT-2 to rewrite conversational queries. On the TREC Conversational Assistance Track, our weakly supervised GPT-2 rewriter improves the state-of-the-art ranking accuracy by 12%, only using very limited amounts of manual query rewrites. In the zero-shot learning setting, the rewriter still gives a comparable result to previous state-of-the-art systems. Our analyses reveal that GPT-2 effectively picks up the task syntax and learns to capture context dependencies, even for hard cases that involve group references and long-turn dependencies.
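The rule-based weak-supervision idea described above can be sketched as follows. This is a hypothetical simplification, not the paper's exact rules: it mimics the "omission" pattern of conversational queries by dropping terms already mentioned earlier in an ad hoc search session, yielding (conversational query, fully specified rewrite) training pairs for fine-tuning a rewriter such as GPT-2.

```python
def make_weak_supervision_pairs(session):
    """Turn a session of fully specified ad hoc queries into
    (conversational-style query, target rewrite) pairs.

    Hypothetical rule: a term seen in an earlier turn is omitted from the
    current turn, simulating how users shorten follow-up questions.
    """
    seen = set()
    pairs = []
    for query in session:
        terms = query.lower().split()
        # Drop previously mentioned terms; keep the full query if
        # everything would be dropped (first turn, or full repetition).
        shortened = [t for t in terms if t not in seen] or terms
        pairs.append((" ".join(shortened), query))
        seen.update(terms)
    return pairs

# Example session of self-contained queries (illustrative data):
session = [
    "throat cancer symptoms",
    "throat cancer symptoms treatment",
]
pairs = make_weak_supervision_pairs(session)
# The second turn becomes the short conversational form "treatment",
# paired with its fully specified rewrite.
```

The inverse direction, feeding the conversational form plus context into GPT-2 and training it to emit the fully specified query, is what the fine-tuning step would then learn.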