Paper Title
Embedding Hallucination for Few-Shot Language Fine-tuning
Paper Authors
Paper Abstract
Few-shot language learners adapt knowledge from a pre-trained model to recognize novel classes from a few labeled sentences. In such settings, fine-tuning a pre-trained language model can cause severe over-fitting. In this paper, we propose an Embedding Hallucination (EmbedHalluc) method, which generates auxiliary embedding-label pairs to expand the fine-tuning dataset. The hallucinator is trained by playing an adversarial game with a discriminator, such that the hallucinated embeddings are indistinguishable from the real ones in the fine-tuning dataset. By training with the extended dataset, the language learner effectively learns from the diverse hallucinated embeddings to overcome the over-fitting issue. Experiments demonstrate that our proposed method is effective in a wide range of language tasks, outperforming current fine-tuning methods. Further, we show that EmbedHalluc outperforms other methods that address this over-fitting problem, such as common data augmentation, semi-supervised pseudo-labeling, and regularization. The code will be made available at: https://github.com/yiren-jian/EmbedHalluc.
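The adversarial game between the hallucinator and the discriminator described in the abstract can plausibly be read as a standard GAN-style minimax objective over embeddings (this formulation is our illustrative assumption; the exact losses are defined in the full paper). Writing H for the hallucinator, D for the discriminator, e for a real embedding from the fine-tuning set, and z for the hallucinator's input noise:

```latex
\min_{H} \max_{D} \;
\mathbb{E}_{e \sim p_{\text{real}}}\!\left[\log D(e)\right]
+ \mathbb{E}_{z \sim p(z)}\!\left[\log\bigl(1 - D(H(z))\bigr)\right]
```

At the equilibrium of this game, D cannot tell hallucinated embeddings H(z) from real ones, which is the "indistinguishable from the real ones" property the abstract relies on; the hallucinated embeddings, paired with labels, then augment the fine-tuning set.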