Paper Title
Efficient Few-Shot Fine-Tuning for Opinion Summarization
Paper Authors
Paper Abstract
Abstractive summarization models are typically pre-trained on large amounts of generic texts, then fine-tuned on tens or hundreds of thousands of annotated samples. However, in opinion summarization, large annotated datasets of reviews paired with reference summaries are not available and would be expensive to create. This calls for fine-tuning methods robust to overfitting on small datasets. In addition, generically pre-trained models are often not accustomed to the specifics of customer reviews and, after fine-tuning, yield summaries with disfluencies and semantic mistakes. To address these problems, we utilize an efficient few-shot method based on adapters which, as we show, can easily store in-domain knowledge. Instead of fine-tuning the entire model, we add adapters and pre-train them in a task-specific way on a large corpus of unannotated customer reviews, using held-out reviews as pseudo summaries. Then, we fine-tune the adapters on the small available human-annotated dataset. We show that this self-supervised adapter pre-training improves summary quality over standard fine-tuning by 2.0 and 1.3 ROUGE-L points on the Amazon and Yelp datasets, respectively. Finally, for summary personalization, we condition on aspect keyword queries, automatically created from generic datasets. In the same vein, we pre-train the adapters in a query-based manner on customer reviews and then fine-tune them on annotated datasets. This results in better-organized summary content reflected in improved coherence and fewer redundancies.
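To make the adapter-based recipe concrete, below is a minimal PyTorch sketch of a bottleneck adapter and of restricting training to the adapter weights while the pre-trained backbone stays frozen. The layer layout (down-projection, non-linearity, up-projection with a residual connection), the bottleneck size, and the helper names are illustrative assumptions rather than the paper's exact configuration; in the described setup, these same parameters would first be pre-trained on unannotated reviews with a held-out review serving as the pseudo summary, and then fine-tuned on the small annotated dataset.

```python
# Minimal sketch of a bottleneck adapter and adapter-only training
# (illustrative assumptions; not the paper's exact architecture or hyperparameters).
import torch
import torch.nn as nn


class Adapter(nn.Module):
    """Bottleneck adapter inserted inside a transformer layer, with a residual connection."""

    def __init__(self, hidden_size: int, bottleneck_size: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)  # project down to the bottleneck
        self.up = nn.Linear(bottleneck_size, hidden_size)    # project back to the hidden size
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # The residual connection preserves the frozen backbone's representation.
        return hidden_states + self.up(self.act(self.down(hidden_states)))


def freeze_all_but_adapters(model: nn.Module) -> None:
    """Both the self-supervised pre-training stage and the few-shot fine-tuning stage
    update only adapter parameters; all backbone parameters remain frozen."""
    for name, param in model.named_parameters():
        param.requires_grad = "adapter" in name.lower()
```

For the query-based variant, one plausible input format (an assumption for illustration) is to prepend the aspect keywords to the concatenated input reviews, e.g. "battery, price </s> review_1 </s> review_2 ...", so that the same adapter weights learn to organize the summary content around the query.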