Paper Title
PromptDA: Label-guided Data Augmentation for Prompt-based Few-shot Learners
Paper Authors
Paper Abstract
Recent advances in large pre-trained language models (PLMs) have led to impressive gains on natural language understanding (NLU) tasks with task-specific fine-tuning. However, directly fine-tuning PLMs relies heavily on sufficient labeled training instances, which are usually hard to obtain. Prompt-based tuning of PLMs has been shown to be powerful for various downstream few-shot tasks. Existing works studying prompt-based tuning for few-shot NLU tasks mainly focus on deriving proper label words with a verbalizer or generating prompt templates to elicit semantics from PLMs. In addition, conventional data augmentation strategies such as synonym substitution, though widely adopted in low-resource scenarios, only bring marginal improvements to prompt-based few-shot learning. Thus, an important research question arises: how can we design effective data augmentation methods for prompt-based few-shot tuning? To this end, considering that label semantics are essential in prompt-based tuning, we propose PromptDA, a novel label-guided data augmentation framework that exploits enriched label semantic information for data augmentation. Extensive experimental results on few-shot text classification tasks demonstrate the superior performance of the proposed framework through effectively leveraging label semantics and data augmentation for natural language understanding. Our code is available at https://github.com/canyuchen/PromptDA.
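The abstract contrasts standard prompt-based tuning, where a verbalizer maps each class to label words scored at a mask position, with PromptDA's richer use of label semantics. Below is a minimal sketch of prompt-based classification with a multi-word verbalizer, assuming a masked language model (roberta-base here); the template and label words are illustrative placeholders, not the paper's actual configuration.

```python
# Minimal sketch: cloze-style prompt classification with a verbalizer.
# Template and label words are hypothetical examples for sentiment tasks.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")
model.eval()

# Verbalizer: map each class to one (or, as PromptDA argues, several) label words.
verbalizer = {
    "positive": ["great", "good", "wonderful"],
    "negative": ["terrible", "bad", "awful"],
}

def classify(sentence: str) -> str:
    # Wrap the input in a cloze template so the PLM fills in the mask.
    prompt = f"{sentence} It was {tokenizer.mask_token}."
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Locate the mask position and score each class by its label-word logits.
    mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
    mask_logits = logits[0, mask_pos]
    scores = {}
    for label, words in verbalizer.items():
        ids = [tokenizer.encode(" " + w, add_special_tokens=False)[0] for w in words]
        scores[label] = mask_logits[ids].mean().item()
    return max(scores, key=scores.get)

print(classify("The movie was a delight from start to finish."))
```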
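The abstract describes PromptDA as exploiting enriched label semantic information for augmentation. The following hedged sketch shows one way to read that idea: each few-shot instance is expanded into several training instances, one per label word, so the augmented set encodes label semantics. The function and data below are illustrative; the exact mechanics in the released code may differ.

```python
# Hedged sketch of label-guided augmentation: expand a k-shot training set
# by one instance per label word, reusing the multi-word verbalizer above.
def augment(few_shot_set, verbalizer):
    augmented = []
    for text, label in few_shot_set:
        for word in verbalizer[label]:
            # Supervise the mask position with each semantically related
            # label word rather than a single canonical one.
            augmented.append((f"{text} It was {word}.", label))
    return augmented

train = [("A waste of two hours.", "negative"),
         ("Beautifully shot and acted.", "positive")]
for example in augment(train, verbalizer):
    print(example)
```

With a verbalizer of m label words per class, this turns a k-shot set into k×m instances, which is the intuition behind why label-guided augmentation can outperform synonym substitution in the prompt-based setting.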