Paper Title

PSP: Pre-trained Soft Prompts for Few-Shot Abstractive Summarization

Paper Authors

Xiaochen Liu, Yang Gao, Yu Bai, Jiawei Li, Yinan Hu, Heyan Huang, Boxing Chen

Paper Abstract

Few-shot abstractive summarization has become a challenging task in natural language generation. To support it, we designed a novel soft prompts architecture coupled with a prompt pre-training plus fine-tuning paradigm that is effective and tunes only extremely light parameters. The soft prompts include continuous input embeddings across an encoder and a decoder to fit the structure of the generation models. Importantly, a novel inner-prompt placed in the text is introduced to capture document-level information. The aim is to devote attention to understanding the document that better prompts the model to generate document-related content. The first step in the summarization procedure is to conduct prompt pre-training with self-supervised pseudo-data. This teaches the model basic summarizing capabilities. The model is then fine-tuned with few-shot examples. Experimental results on the CNN/DailyMail and XSum datasets show that our method, with only 0.1% of the parameters, outperforms full-model tuning where all model parameters are tuned. It also surpasses Prompt Tuning by a large margin and delivers competitive results against Prefix-Tuning with 3% of the parameters.
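The abstract describes continuous prompt embeddings on both the encoder and the decoder, plus inner prompts interleaved within the document. As a rough illustration of the core idea only, the sketch below shows learnable encoder-side prompt vectors prepended to the input of a frozen seq2seq backbone, with gradients flowing solely into the prompt parameters. The backbone (`facebook/bart-base`), the number of prompt tokens, and all class and variable names are illustrative assumptions, not the authors' implementation, and the decoder-side and inner prompts of the full PSP method are omitted.

```python
# Minimal sketch of encoder-side soft prompt tuning for a frozen seq2seq model.
# Assumptions (not from the paper): a Hugging Face BART backbone and 20 prompt tokens.
import torch
import torch.nn as nn
from transformers import BartTokenizer, BartForConditionalGeneration

class SoftPromptSummarizer(nn.Module):
    def __init__(self, model_name="facebook/bart-base", n_prompt_tokens=20):
        super().__init__()
        self.backbone = BartForConditionalGeneration.from_pretrained(model_name)
        for p in self.backbone.parameters():  # freeze all backbone parameters
            p.requires_grad = False
        d_model = self.backbone.config.d_model
        # Learnable continuous prompt embeddings, prepended to the encoder input.
        self.prompt = nn.Parameter(torch.randn(n_prompt_tokens, d_model) * 0.02)

    def forward(self, input_ids, attention_mask, labels):
        # Embed the document tokens with the frozen embedding table.
        tok_embeds = self.backbone.get_input_embeddings()(input_ids)
        batch = input_ids.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        inputs_embeds = torch.cat([prompt, tok_embeds], dim=1)
        # Extend the attention mask to cover the prepended prompt positions.
        prompt_mask = torch.ones(batch, self.prompt.size(0),
                                 device=attention_mask.device,
                                 dtype=attention_mask.dtype)
        attention_mask = torch.cat([prompt_mask, attention_mask], dim=1)
        return self.backbone(inputs_embeds=inputs_embeds,
                             attention_mask=attention_mask,
                             labels=labels)

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = SoftPromptSummarizer()
source = tokenizer(["A long news article ..."], return_tensors="pt", truncation=True)
target = tokenizer(["A short summary ..."], return_tensors="pt", truncation=True)
out = model(source.input_ids, source.attention_mask, labels=target.input_ids)
out.loss.backward()  # gradients reach only the soft prompt parameters
```

In the described method, this prompt-only training is first run on self-supervised pseudo-data to pre-train the prompts and then continued on the few-shot summarization examples; the sketch covers only the parameter-efficient tuning step itself.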
