Paper Title
Dynamic Curriculum Learning for Low-Resource Neural Machine Translation
Paper Authors
Paper Abstract
Large amounts of data have made neural machine translation (NMT) a great success in recent years. However, training these models on small-scale corpora remains a challenge, and in this setting the way the data is used becomes even more important. Here, we investigate the effective use of training data for low-resource NMT. In particular, we propose a dynamic curriculum learning (DCL) method to reorder training samples during training. Unlike previous work, we do not use a static scoring function for reordering. Instead, the order of training samples is determined dynamically by two factors: loss decline and model competence. This eases training by highlighting easy samples that the current model has enough competence to learn. We test our DCL method in a Transformer-based system. Experimental results show that DCL outperforms several strong baselines on three low-resource machine translation benchmarks and on different-sized subsets of WMT'16 En-De.
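To make the idea concrete, below is a minimal, illustrative Python sketch of how such dynamic reordering could be wired up. The abstract does not give the paper's exact formulas, so everything here is an assumption for illustration: the square-root competence schedule (borrowed from common competence-based curriculum learning practice), the two-checkpoint loss-decline score, and the function names `competence` and `reorder_sample_pool` are all hypothetical, not the authors' actual implementation.

```python
import math
import random

def competence(step, total_steps, c0=0.1):
    # Assumed competence schedule: grows from c0 toward 1.0 as
    # training progresses (a common choice in competence-based CL).
    return min(1.0, math.sqrt(c0 ** 2 + (1 - c0 ** 2) * step / total_steps))

def reorder_sample_pool(samples, prev_losses, curr_losses, step, total_steps):
    """Hypothetical DCL-style selection: rank samples by loss decline,
    then let model competence gate how much of the pool is trained on."""
    # Loss decline between two checkpoints; a larger positive value
    # means the model is currently making fast progress on the sample,
    # i.e. the sample is "easy enough" for the current model.
    decline = [p - c for p, c in zip(prev_losses, curr_losses)]
    # Rank sample indices from largest to smallest loss decline.
    ranked = sorted(range(len(samples)), key=lambda i: -decline[i])
    # Competence decides what fraction of the ranked pool is visible.
    k = max(1, int(competence(step, total_steps) * len(samples)))
    selected = [samples[i] for i in ranked[:k]]
    random.shuffle(selected)  # avoid a fixed order inside the window
    return selected
```

In this sketch the curriculum is dynamic in both of the senses the abstract names: the difficulty score (loss decline) is recomputed from the model's own recent behavior rather than fixed in advance, and the competence term widens the pool of admissible samples as the model improves.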