Paper Title


Exploring Fine-tuning Techniques for Pre-trained Cross-lingual Models via Continual Learning

Authors

Zihan Liu, Genta Indra Winata, Andrea Madotto, Pascale Fung

Abstract

Recently, fine-tuning pre-trained language models (e.g., multilingual BERT) to downstream cross-lingual tasks has shown promising results. However, the fine-tuning process inevitably changes the parameters of the pre-trained model and weakens its cross-lingual ability, which leads to sub-optimal performance. To alleviate this problem, we leverage continual learning to preserve the original cross-lingual ability of the pre-trained model when we fine-tune it to downstream tasks. The experimental result shows that our fine-tuning methods can better preserve the cross-lingual ability of the pre-trained model in a sentence retrieval task. Our methods also achieve better performance than other fine-tuning baselines on the zero-shot cross-lingual part-of-speech tagging and named entity recognition tasks.
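The abstract does not state which continual-learning technique the authors use. The sketch below is only a minimal illustration of the underlying idea, assuming one simple regularization scheme: an L2 penalty that keeps the fine-tuned encoder close to the pre-trained multilingual BERT weights so that less of the original cross-lingual ability is overwritten. The model name, label count, and hyperparameters are assumptions, not the paper's implementation.

```python
# A minimal sketch (assumed, not the paper's exact method): fine-tuning
# multilingual BERT on a downstream token-level task while penalizing drift
# from the pre-trained weights, one simple continual-learning-style way to
# preserve the original cross-lingual ability.
import torch
from transformers import AutoModelForTokenClassification

# Hypothetical setup: mBERT with a token-classification head (e.g., POS or NER).
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=17  # 17 is an assumed label count
)

# Snapshot the pre-trained encoder parameters as the reference point;
# the newly initialized task head is left unconstrained.
anchor = {
    name: param.detach().clone()
    for name, param in model.named_parameters()
    if not name.startswith("classifier")
}

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
reg_strength = 0.01  # assumed hyperparameter


def training_step(batch):
    """One fine-tuning step: task loss + L2 penalty toward pre-trained weights."""
    outputs = model(**batch)  # batch contains input_ids, attention_mask, labels
    params = dict(model.named_parameters())
    # Penalize deviation from the original parameters so fine-tuning changes
    # the multilingual representation space as little as the task allows.
    drift = sum(((params[name] - ref) ** 2).sum() for name, ref in anchor.items())
    loss = outputs.loss + reg_strength * drift
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The penalty term is just one way to constrain parameter change; the paper itself evaluates how well cross-lingual ability is preserved via sentence retrieval, zero-shot POS tagging, and NER, as described in the abstract.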
