Paper Title
A Token-wise CNN-based Method for Sentence Compression
Paper Authors
Paper Abstract
Sentence compression is a Natural Language Processing (NLP) task aimed at shortening original sentences while preserving their key information. Its applications can benefit many fields; for example, one can build tools for language education. However, current methods are largely based on Recurrent Neural Network (RNN) models, which suffer from poor processing speed. To address this issue, in this paper, we propose a token-wise Convolutional Neural Network (CNN) based model that uses pre-trained Bidirectional Encoder Representations from Transformers (BERT) features for deletion-based sentence compression. We also compare our model with RNN-based models and a fine-tuned BERT. Although one of the RNN-based models marginally outperforms the other models given the same input, our CNN-based model is ten times faster than the RNN-based approach.
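The abstract describes a per-token keep/delete classifier: a CNN applied over pre-trained BERT token features. The sketch below is a minimal, hypothetical PyTorch rendering of that idea, assuming frozen BERT embeddings as input; the class name `TokenwiseCNN`, the kernel width, and the filter count are illustrative assumptions, not the paper's reported configuration.

```python
import torch
import torch.nn as nn

class TokenwiseCNN(nn.Module):
    """Per-token keep/delete classifier over pre-computed BERT features.

    Hypothetical configuration: the paper describes a token-wise CNN on
    BERT features, but the layer sizes and dropout-free design here are
    illustrative choices, not the authors' exact settings.
    """

    def __init__(self, bert_dim=768, num_filters=128, kernel_size=3):
        super().__init__()
        # Conv1d expects (batch, channels, seq_len); same-padding keeps the
        # output aligned one-to-one with the input tokens.
        self.conv = nn.Conv1d(bert_dim, num_filters, kernel_size,
                              padding=kernel_size // 2)
        self.classifier = nn.Linear(num_filters, 2)  # keep vs. delete

    def forward(self, bert_features):
        # bert_features: (batch, seq_len, bert_dim), e.g. frozen BERT output
        x = bert_features.transpose(1, 2)   # -> (batch, bert_dim, seq_len)
        x = torch.relu(self.conv(x))        # local context around each token
        x = x.transpose(1, 2)               # -> (batch, seq_len, num_filters)
        return self.classifier(x)           # per-token keep/delete logits
```

Tokens whose "delete" logit wins are removed from the sentence. Because the convolution scores all tokens in parallel rather than stepping through the sequence as an RNN must, a design like this is consistent with the speed advantage the abstract reports.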