Paper Title
Windowing Models for Abstractive Summarization of Long Texts
Paper Authors
Paper Abstract
Neural summarization models suffer from the fixed-size input limitation: if text length surpasses the model's maximal number of input tokens, some document content (possibly summary-relevant) gets truncated. Independently summarizing windows of maximal input size disallows information flow between windows and leads to incoherent summaries. We propose windowing models for neural abstractive summarization of (arbitrarily) long texts. We extend the sequence-to-sequence model augmented with a pointer-generator network by (1) allowing the encoder to slide over different windows of the input document and (2) sharing the decoder and retaining its state across different input windows. We explore two windowing variants: Static Windowing precomputes the number of tokens the decoder should generate from each window (based on training-corpus statistics); in Dynamic Windowing the decoder learns to emit a token that signals the encoder's shift to the next input window. Empirical results render our models effective in their intended use case: summarizing long texts whose relevant content is not bound to the very beginning of the document.
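The following is a minimal sketch of the Dynamic Windowing decode loop described in the abstract: the encoder slides over fixed-size windows, the decoder is shared and keeps its state across windows, and a special shift token moves the encoder forward. The encoder/decoder stubs, the window size, the stride, and the token names are assumptions for illustration, not the authors' implementation.

```python
# Sketch of the dynamic-windowing control flow (assumed interfaces, not the paper's code).
from typing import List, Tuple

WINDOW_SIZE = 400        # max tokens the encoder accepts per window (assumed value)
STRIDE = 400             # non-overlapping windows (assumption)
SHIFT_TOKEN = "<shift>"  # token the decoder learns to emit to request the next window
EOS_TOKEN = "<eos>"
MAX_SUMMARY_LEN = 120    # hard cap on summary length (assumption)

def make_windows(doc_tokens: List[str]) -> List[List[str]]:
    """Split an arbitrarily long document into encoder-sized windows."""
    return [doc_tokens[i:i + WINDOW_SIZE] for i in range(0, len(doc_tokens), STRIDE)]

def encode(window: List[str]) -> dict:
    """Stub: a real model would run the encoder over one window and return its states."""
    return {"window": window}

def decode_step(enc_states: dict, dec_state: dict, prev_token: str) -> Tuple[str, dict]:
    """Stub: one decoder step with pointer-generator attention over the current window.
    A real model would return the predicted token and the updated decoder state."""
    return EOS_TOKEN, dec_state

def summarize_dynamic(doc_tokens: List[str]) -> List[str]:
    """Decode with a shared decoder whose state is retained across windows;
    emitting SHIFT_TOKEN advances the encoder to the next input window."""
    windows = make_windows(doc_tokens)
    if not windows:
        return []
    summary: List[str] = []
    dec_state: dict = {}          # carried over when the encoder window changes
    prev_token = "<bos>"
    w = 0
    enc_states = encode(windows[w])
    while len(summary) < MAX_SUMMARY_LEN:
        token, dec_state = decode_step(enc_states, dec_state, prev_token)
        if token == SHIFT_TOKEN:
            w += 1
            if w >= len(windows):
                break             # no further input to attend over
            enc_states = encode(windows[w])   # slide the encoder, keep the decoder state
        elif token == EOS_TOKEN:
            break
        else:
            summary.append(token)
        prev_token = token
    return summary
```

The Static Windowing variant would replace the shift-token branch with a precomputed per-window token budget (derived from training-corpus statistics) and advance the encoder once that budget is exhausted.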