Paper Title
Few-Shot Generative Conversational Query Rewriting
Paper Authors
Paper Abstract
Conversational query rewriting aims to reformulate a concise conversational query to a fully specified, context-independent query that can be effectively handled by existing information retrieval systems. This paper presents a few-shot generative approach to conversational query rewriting. We develop two methods, based on rules and self-supervised learning, to generate weak supervision data using large amounts of ad hoc search sessions, and to fine-tune GPT-2 to rewrite conversational queries. On the TREC Conversational Assistance Track, our weakly supervised GPT-2 rewriter improves the state-of-the-art ranking accuracy by 12%, only using very limited amounts of manual query rewrites. In the zero-shot learning setting, the rewriter still gives a comparable result to previous state-of-the-art systems. Our analyses reveal that GPT-2 effectively picks up the task syntax and learns to capture context dependencies, even for hard cases that involve group references and long-turn dependencies.
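The rule-based weak-supervision idea described above can be sketched as follows. This is a hypothetical simplification, not the paper's exact rules: it mimics the "omission" pattern of conversational queries by dropping terms already mentioned earlier in an ad hoc search session, yielding (conversational query, fully specified rewrite) training pairs for fine-tuning a rewriter such as GPT-2.

```python
def make_weak_supervision_pairs(session):
    """Turn a session of fully specified ad hoc queries into
    (conversational-style query, target rewrite) pairs.

    Hypothetical rule: a term seen in an earlier turn is omitted from the
    current turn, simulating how users shorten follow-up questions.
    """
    seen = set()
    pairs = []
    for query in session:
        terms = query.lower().split()
        # Drop previously mentioned terms; keep the full query if
        # everything would be dropped (first turn, or full repetition).
        shortened = [t for t in terms if t not in seen] or terms
        pairs.append((" ".join(shortened), query))
        seen.update(terms)
    return pairs

# Example session of self-contained queries (illustrative data):
session = [
    "throat cancer symptoms",
    "throat cancer symptoms treatment",
]
pairs = make_weak_supervision_pairs(session)
# The second turn becomes the short conversational form "treatment",
# paired with its fully specified rewrite.
```

The inverse direction, feeding the conversational form plus context into GPT-2 and training it to emit the fully specified query, is what the fine-tuning step would then learn.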