Paper Title
Embedding Hallucination for Few-Shot Language Fine-tuning
Paper Authors
Paper Abstract
Few-shot language learners adapt knowledge from a pre-trained model to recognize novel classes from a few labeled sentences. In such settings, fine-tuning a pre-trained language model can cause severe over-fitting. In this paper, we propose an Embedding Hallucination (EmbedHalluc) method, which generates auxiliary embedding-label pairs to expand the fine-tuning dataset. The hallucinator is trained by playing an adversarial game with a discriminator, such that the hallucinated embeddings are indistinguishable from the real ones in the fine-tuning dataset. By training with the extended dataset, the language learner effectively learns from the diverse hallucinated embeddings to overcome the over-fitting issue. Experiments demonstrate that our proposed method is effective in a wide range of language tasks, outperforming current fine-tuning methods. Further, we show that EmbedHalluc outperforms other methods that address this over-fitting problem, such as common data augmentation, semi-supervised pseudo-labeling, and regularization. The code will be made available at: https://github.com/yiren-jian/EmbedHalluc.
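The adversarial game between the hallucinator and the discriminator described in the abstract can plausibly be read as a standard GAN-style minimax objective over embeddings (this formulation is our illustrative assumption; the exact losses are defined in the full paper). Writing H for the hallucinator, D for the discriminator, e for a real embedding from the fine-tuning set, and z for the hallucinator's input noise:

```latex
\min_{H} \max_{D} \;
\mathbb{E}_{e \sim p_{\text{real}}}\!\left[\log D(e)\right]
+ \mathbb{E}_{z \sim p(z)}\!\left[\log\bigl(1 - D(H(z))\bigr)\right]
```

At the equilibrium of this game, D cannot tell hallucinated embeddings H(z) from real ones, which is the "indistinguishable from the real ones" property the abstract relies on; the hallucinated embeddings, paired with labels, then augment the fine-tuning set.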