Paper Title

Length-controllable Abstractive Summarization by Guiding with Summary Prototype

Authors

Itsumi Saito, Kyosuke Nishida, Kosuke Nishida, Atsushi Otsuka, Hisako Asano, Junji Tomita, Hiroyuki Shindo, Yuji Matsumoto

Abstract

We propose a new length-controllable abstractive summarization model. Recent state-of-the-art abstractive summarization models based on encoder-decoder models generate only one summary per source text. However, controllable summarization, especially of the length, is an important aspect for practical applications. Previous studies on length-controllable abstractive summarization incorporate length embeddings in the decoder module for controlling the summary length. Although the length embeddings can control where to stop decoding, they do not decide which information should be included in the summary within the length constraint. Unlike the previous models, our length-controllable abstractive summarization model incorporates a word-level extractive module in the encoder-decoder model instead of length embeddings. Our model generates a summary in two steps. First, our word-level extractor extracts a sequence of important words (we call it the "prototype text") from the source text according to the word-level importance scores and the length constraint. Second, the prototype text is used as additional input to the encoder-decoder model, which generates a summary by jointly encoding and copying words from both the prototype text and source text. Since the prototype text is a guide to both the content and length of the summary, our model can generate an informative and length-controlled summary. Experiments with the CNN/Daily Mail dataset and the NEWSROOM dataset show that our model outperformed previous models in length-controlled settings.
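The first step the abstract describes, extracting a "prototype text" by selecting the highest-scoring words from the source under a length constraint while preserving their source order, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `extract_prototype` and the toy importance scores are assumptions, and the paper's extractor learns its word-level scores rather than receiving them as input.

```python
def extract_prototype(words, scores, length_limit):
    """Select the top-scoring words up to length_limit, keeping source order.

    words        -- tokenized source text
    scores       -- one importance score per word (assumed given here;
                    the paper learns these with a word-level extractor)
    length_limit -- desired prototype length, the length constraint
    """
    # Rank word positions by importance score, highest first
    ranked = sorted(range(len(words)), key=lambda i: scores[i], reverse=True)
    # Keep the top positions, then restore original source order
    chosen = sorted(ranked[:length_limit])
    return [words[i] for i in chosen]


# Toy example: keep the 3 most important words of a 6-word source
source = ["the", "cat", "sat", "on", "the", "mat"]
importance = [0.1, 0.9, 0.8, 0.2, 0.1, 0.7]
prototype = extract_prototype(source, importance, length_limit=3)
print(prototype)  # → ['cat', 'sat', 'mat']
```

In the second step, this prototype would be fed to the encoder-decoder alongside the full source text, guiding both what content the summary copies and how long it is.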
