Paper title
Kompetencer: Fine-grained Skill Classification in Danish Job Postings via Distant Supervision and Transfer Learning
Paper authors
Paper abstract
Skill Classification (SC) is the task of classifying job competences from job postings. This work is the first in SC applied to Danish job vacancy data. We release the first Danish job posting dataset: Kompetencer (en: competences), annotated for nested spans of competences. To improve upon coarse-grained annotations, we make use of the European Skills, Competences, Qualifications and Occupations (ESCO; le Vrang et al., 2014) taxonomy API to obtain fine-grained labels via distant supervision. We study two setups: the zero-shot and the few-shot classification setting. We fine-tune English-based models and RemBERT (Chung et al., 2020) and compare them to in-language Danish models. Our results show that RemBERT significantly outperforms all other models in both the zero-shot and the few-shot setting.