Paper title
ConfNet2Seq: Full Length Answer Generation from Spoken Questions
Paper authors
Paper abstract
Conversational and task-oriented dialogue systems aim to interact with users through natural responses over multi-modal interfaces such as text or speech. The desired responses take the form of full-length natural answers generated over facts retrieved from a knowledge source. While generating natural answers to questions from an answer span has been widely studied, there has been little research on natural sentence generation over spoken content. We propose a novel system that generates full-length natural language answers from spoken questions and factoid answers. The spoken sequence is compactly represented as a confusion network extracted from a pre-trained automatic speech recognizer (ASR). To the best of our knowledge, this is the first attempt at generating full-length natural answers from a graph input (a confusion network). We release a large-scale dataset of 259,788 samples of spoken questions, their factoid answers, and the corresponding full-length textual answers. With the proposed approach, we achieve performance comparable to that of the best ASR hypothesis.
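For readers unfamiliar with the term, a confusion network can be pictured as a sequence of slots, each holding competing word candidates with posterior probabilities. The sketch below is illustrative only and is not the paper's implementation: the class name `ConfusionNetwork`, the `<eps>` skip marker, and the example words and probabilities are all assumptions made for this example. It shows how the single best ASR hypothesis relates to the richer graph input.

```python
from dataclasses import dataclass
from typing import Dict, List

EPS = "<eps>"  # assumed marker meaning "no word spoken in this slot"


@dataclass
class ConfusionNetwork:
    # One dict per time slot: candidate word -> posterior probability.
    slots: List[Dict[str, float]]

    def best_hypothesis(self) -> List[str]:
        """Pick the most probable word in each slot, dropping skip markers."""
        words = []
        for slot in self.slots:
            word = max(slot, key=slot.get)
            if word != EPS:
                words.append(word)
        return words

    def num_paths(self) -> int:
        """Count the alternative word sequences the network encodes."""
        count = 1
        for slot in self.slots:
            count *= len(slot)
        return count


# Made-up example: three slots with competing recognitions.
cn = ConfusionNetwork(slots=[
    {"who": 0.7, "how": 0.3},
    {"wrote": 0.6, "rode": 0.3, EPS: 0.1},
    {"hamlet": 0.8, "omelet": 0.2},
])

print(cn.best_hypothesis())
print(cn.num_paths())
```

The best hypothesis keeps one path and discards the alternatives, whereas a model that consumes the whole network can still see lower-ranked candidates in each slot.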