Paper Title
Prompt-based Text Entailment for Low-Resource Named Entity Recognition
Paper Authors
Paper Abstract
Pre-trained Language Models (PLMs) have been applied to NLP tasks and achieve promising results. Nevertheless, the fine-tuning procedure needs labeled data from the target domain, making it difficult to learn in low-resource, non-trivially labeled scenarios. To address these challenges, we propose Prompt-based Text Entailment (PTE) for low-resource named entity recognition, which better leverages the knowledge in PLMs. We first reformulate named entity recognition as a text entailment task: the original sentence, paired with entity type-specific prompts, is fed into the PLM to obtain an entailment score for each candidate, and the entity type with the top score is selected as the final label. We then inject tagging labels into the prompts and treat words as the basic units instead of n-gram spans, reducing the time complexity of candidate generation by avoiding n-gram enumeration. Experimental results demonstrate that the proposed method, PTE, achieves competitive performance on the CoNLL03 dataset and outperforms fine-tuned counterparts on the MIT Movie and Few-NERD datasets in low-resource settings.
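The procedure described in the abstract (prompt construction per entity type, entailment scoring, and argmax selection over types) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: `entailment_score` is a toy stand-in for the PLM-based entailment scorer, and the prompt template, entity type set, and gazetteer are all assumptions made for the example.

```python
# Hypothetical sketch of Prompt-based Text Entailment (PTE) for NER.
# The real method feeds "<sentence> + <entity type-specific prompt>" into a
# PLM and reads off an entailment score; here a tiny gazetteer stands in
# for the PLM so the sketch is self-contained and runnable.

ENTITY_TYPES = ["person", "location", "organization", "none"]

def build_prompt(word: str, entity_type: str) -> str:
    # Tagging label injected into an entity type-specific prompt
    # (template is an assumption, not the paper's exact wording).
    return f"{word} is a {entity_type} entity."

def entailment_score(sentence: str, hypothesis: str) -> float:
    # Toy stand-in for the PLM's entailment probability.
    gazetteer = {"Paris": "location", "Alice": "person", "Google": "organization"}
    word = hypothesis.split(" is a ")[0]
    expected = gazetteer.get(word, "none")
    return 1.0 if hypothesis == build_prompt(word, expected) else 0.0

def tag_sentence(sentence: str) -> list[tuple[str, str]]:
    # Words (not n-gram spans) are the basic units, which avoids the cost
    # of enumerating all n-grams as candidates.
    tagged = []
    for word in sentence.split():
        scores = {t: entailment_score(sentence, build_prompt(word, t))
                  for t in ENTITY_TYPES}
        best = max(scores, key=scores.get)  # top-scoring type is the label
        tagged.append((word, best))
    return tagged

print(tag_sentence("Alice visited Google in Paris"))
```

In a real implementation, the scorer would be a PLM fine-tuned for natural language inference, and the word-level formulation keeps the number of scored candidates linear in sentence length times the number of entity types.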