Paper Title
Few-Shot Table-to-Text Generation with Prefix-Controlled Generator
Paper Authors
Paper Abstract
Neural table-to-text generation approaches are data-hungry, limiting their adaptability to low-resource real-world applications. Previous works mostly resort to Pre-trained Language Models (PLMs) to generate fluent summaries of a table. However, these summaries often contain hallucinated content due to the uncontrolled nature of PLMs. Moreover, the topological differences between tables and sequences are rarely studied. Last but not least, fine-tuning PLMs on a handful of instances may lead to over-fitting and catastrophic forgetting. To alleviate these problems, we propose a prompt-based approach, the Prefix-Controlled Generator (PCG), for few-shot table-to-text generation. We prepend a task-specific prefix for a PLM to make the table structure better fit the pre-trained input. In addition, we generate an input-specific prefix to control the factual contents and word order of the generated text. Both automatic and human evaluations on different domains (humans, books and songs) of the Wikibio dataset show substantial improvements over baseline approaches.
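The abstract describes prepending a task-specific prefix to a linearized table so the structured input better matches what the PLM saw during pre-training. A minimal sketch of that input-construction step is shown below; the function names, the separator tokens, and the prefix string are all illustrative assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch of building a prefixed PLM input from a table.
# The linearization scheme and the task prefix text are assumptions
# for illustration, not taken from the paper.

def linearize_table(table: dict) -> str:
    """Flatten attribute-value pairs into a token sequence a PLM can read."""
    return " ; ".join(f"{attr} : {value}" for attr, value in table.items())

def build_input(table: dict, task_prefix: str = "summarize table:") -> str:
    """Prepend a task-specific prefix to the linearized table."""
    return f"{task_prefix} {linearize_table(table)}"

# Example Wikibio-style infobox record (made up for illustration).
example = {"name": "John Smith", "occupation": "writer", "birth_year": "1950"}
print(build_input(example))
# → summarize table: name : John Smith ; occupation : writer ; birth_year : 1950
```

The resulting string can be fed to an encoder-decoder PLM as-is; the paper's input-specific prefix for controlling factual content and word order would additionally be learned as continuous vectors rather than text, which this text-only sketch does not cover.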