Paper title
Conditional Augmentation for Aspect Term Extraction via Masked Sequence-to-Sequence Generation
Paper authors
Paper abstract
Aspect term extraction aims to extract aspect terms from review texts as opinion targets for sentiment analysis. One of the major challenges in this task is the lack of sufficient annotated data. Data augmentation is a potentially effective way to address this issue, but existing augmentation is hard to control, because it may unexpectedly change aspect words and aspect labels. In this paper, we formulate data augmentation as a conditional generation task: generating a new sentence while preserving the original opinion targets and labels. We propose a masked sequence-to-sequence method for conditional augmentation in aspect term extraction. Unlike existing augmentation approaches, ours is controllable and allows us to generate more diverse sentences. Experimental results confirm that our method significantly alleviates the data scarcity problem. It also effectively improves the performance of several current aspect term extraction models.
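To make the conditional-generation idea concrete, here is a minimal sketch of how a masked source sequence might be prepared so that aspect terms and their labels survive augmentation. It does not reproduce the paper's implementation: the B/I/O tagging scheme, the `[MASK]` placeholder, the choice of a contiguous masking window, and the helper names `mask_context` and `aspect_terms` are all assumptions made for illustration. A sequence-to-sequence model (not shown) would then be trained to regenerate the masked positions.

```python
from typing import List, Tuple

MASK = "[MASK]"  # hypothetical placeholder token, not taken from the paper


def mask_context(
    tokens: List[str],
    labels: List[str],
    start: int,
    length: int,
    mask_token: str = MASK,
) -> Tuple[List[str], List[str]]:
    """Mask non-aspect tokens inside the window [start, start + length).

    Tokens labelled "B" or "I" (assumed aspect tags) are never masked, so
    the opinion targets and the label sequence are preserved unchanged.
    """
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    end = min(start + length, len(tokens))
    masked = [
        mask_token if start <= i < end and labels[i] == "O" else tok
        for i, tok in enumerate(tokens)
    ]
    return masked, list(labels)


def aspect_terms(tokens: List[str], labels: List[str]) -> List[str]:
    """Collect aspect terms from a B/I/O-labelled token sequence."""
    terms: List[str] = []
    current: List[str] = []
    for tok, lab in zip(tokens, labels):
        if lab == "B":
            if current:
                terms.append(" ".join(current))
            current = [tok]
        elif lab == "I" and current:
            current.append(tok)
        else:
            if current:
                terms.append(" ".join(current))
            current = []
    if current:
        terms.append(" ".join(current))
    return terms


# Illustrative review sentence (made up for this sketch).
tokens = "The battery life is great but the screen is dim".split()
labels = ["O", "B", "I", "O", "O", "O", "O", "B", "O", "O"]

masked, kept_labels = mask_context(tokens, labels, start=2, length=4)
print(" ".join(masked))
print(aspect_terms(masked, kept_labels))
```

In this sketch the window covers "life is great but", yet only the three context words are replaced, because "life" carries an aspect label. A generator that fills the masked slots can therefore vary the context while the aspect terms and their positions stay fixed, which is the controllability property the abstract describes.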