Paper Title

PSP: Pre-trained Soft Prompts for Few-Shot Abstractive Summarization

Paper Authors

Xiaochen Liu, Yang Gao, Yu Bai, Jiawei Li, Yinan Hu, Heyan Huang, Boxing Chen

Paper Abstract

Few-shot abstractive summarization has become a challenging task in natural language generation. To support it, we designed a novel soft prompts architecture coupled with a prompt pre-training plus fine-tuning paradigm that is effective and tunes only extremely light parameters. The soft prompts include continuous input embeddings across an encoder and a decoder to fit the structure of the generation models. Importantly, a novel inner-prompt placed in the text is introduced to capture document-level information. The aim is to devote attention to understanding the document that better prompts the model to generate document-related content. The first step in the summarization procedure is to conduct prompt pre-training with self-supervised pseudo-data. This teaches the model basic summarizing capabilities. The model is then fine-tuned with few-shot examples. Experimental results on the CNN/DailyMail and XSum datasets show that our method, with only 0.1% of the parameters, outperforms full-model tuning where all model parameters are tuned. It also surpasses Prompt Tuning by a large margin and delivers competitive results against Prefix-Tuning with 3% of the parameters.
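The abstract describes continuous prompt embeddings on both the encoder and the decoder, plus inner prompts interleaved within the document. As a rough illustration of the core idea only, the sketch below shows learnable encoder-side prompt vectors prepended to the input of a frozen seq2seq backbone, with gradients flowing solely into the prompt parameters. The backbone (`facebook/bart-base`), the number of prompt tokens, and all class and variable names are illustrative assumptions, not the authors' implementation, and the decoder-side and inner prompts of the full PSP method are omitted.

```python
# Minimal sketch of encoder-side soft prompt tuning for a frozen seq2seq model.
# Assumptions (not from the paper): a Hugging Face BART backbone and 20 prompt tokens.
import torch
import torch.nn as nn
from transformers import BartTokenizer, BartForConditionalGeneration

class SoftPromptSummarizer(nn.Module):
    def __init__(self, model_name="facebook/bart-base", n_prompt_tokens=20):
        super().__init__()
        self.backbone = BartForConditionalGeneration.from_pretrained(model_name)
        for p in self.backbone.parameters():  # freeze all backbone parameters
            p.requires_grad = False
        d_model = self.backbone.config.d_model
        # Learnable continuous prompt embeddings, prepended to the encoder input.
        self.prompt = nn.Parameter(torch.randn(n_prompt_tokens, d_model) * 0.02)

    def forward(self, input_ids, attention_mask, labels):
        # Embed the document tokens with the frozen embedding table.
        tok_embeds = self.backbone.get_input_embeddings()(input_ids)
        batch = input_ids.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        inputs_embeds = torch.cat([prompt, tok_embeds], dim=1)
        # Extend the attention mask to cover the prepended prompt positions.
        prompt_mask = torch.ones(batch, self.prompt.size(0),
                                 device=attention_mask.device,
                                 dtype=attention_mask.dtype)
        attention_mask = torch.cat([prompt_mask, attention_mask], dim=1)
        return self.backbone(inputs_embeds=inputs_embeds,
                             attention_mask=attention_mask,
                             labels=labels)

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = SoftPromptSummarizer()
source = tokenizer(["A long news article ..."], return_tensors="pt", truncation=True)
target = tokenizer(["A short summary ..."], return_tensors="pt", truncation=True)
out = model(source.input_ids, source.attention_mask, labels=target.input_ids)
out.loss.backward()  # gradients reach only the soft prompt parameters
```

In the described method, this prompt-only training is first run on self-supervised pseudo-data to pre-train the prompts and then continued on the few-shot summarization examples; the sketch covers only the parameter-efficient tuning step itself.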
