Paper Title
Multi-task Learning for Multilingual Neural Machine Translation
Authors
Abstract
While monolingual data has been shown to be useful for improving bilingual neural machine translation (NMT), effectively and efficiently leveraging monolingual data for Multilingual NMT (MNMT) systems is a less explored area. In this work, we propose a multi-task learning (MTL) framework that jointly trains the model with the translation task on bitext data and two denoising tasks on monolingual data. We conduct extensive empirical studies on MNMT systems with 10 language pairs from WMT datasets. We show that the proposed approach can effectively improve the translation quality for both high-resource and low-resource languages by a large margin, achieving significantly better results than individual bilingual models. We also demonstrate the efficacy of the proposed approach in the zero-shot setup for language pairs without bitext training data. Furthermore, we show the effectiveness of MTL over pre-training approaches for both NMT and cross-lingual transfer learning NLU tasks; the proposed approach outperforms massive-scale models trained on a single task.
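To make the joint objective concrete, below is a minimal sketch of what multi-task training with one translation loss (on bitext) and two denoising losses (on monolingual data) could look like. It is not the paper's exact recipe: the choice of masked language modeling and denoising autoencoding as the two denoising tasks, the toy PyTorch model, the equal loss weights, and all names and hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn

VOCAB, PAD, MASK = 1000, 0, 1  # toy vocabulary; 0 = padding, 1 = [MASK]

class TinySeq2Seq(nn.Module):
    """A toy encoder-decoder standing in for the real MNMT model."""
    def __init__(self, d_model=64):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, d_model, padding_idx=PAD)
        self.transformer = nn.Transformer(
            d_model, nhead=4, num_encoder_layers=2,
            num_decoder_layers=2, batch_first=True)
        self.proj = nn.Linear(d_model, VOCAB)

    def forward(self, src, tgt):
        causal = self.transformer.generate_square_subsequent_mask(tgt.size(1))
        h = self.transformer(self.embed(src), self.embed(tgt), tgt_mask=causal)
        return self.proj(h)

    def encode_only(self, src):
        # Encoder-only pass, used here for the masked-LM denoising task.
        return self.proj(self.transformer.encoder(self.embed(src)))

def corrupt(x, p=0.15):
    # Replace a random subset of tokens with MASK (BERT-style noise).
    noise = torch.rand(x.shape) < p
    return x.masked_fill(noise, MASK), noise

model = TinySeq2Seq()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
ce = nn.CrossEntropyLoss(ignore_index=PAD)

def train_step(src, tgt, mono):
    # 1) Translation task on bitext (teacher forcing, shifted target).
    l_mt = ce(model(src, tgt[:, :-1]).reshape(-1, VOCAB),
              tgt[:, 1:].reshape(-1))

    # 2) Masked-LM denoising on monolingual data (encoder side):
    #    predict only the corrupted positions, ignore the rest.
    noisy, noise = corrupt(mono)
    mlm_tgt = mono.masked_fill(~noise, PAD)
    l_mlm = ce(model.encode_only(noisy).reshape(-1, VOCAB),
               mlm_tgt.reshape(-1))

    # 3) Denoising autoencoding: reconstruct the clean sentence from its
    #    corrupted copy through the full encoder-decoder.
    l_dae = ce(model(noisy, mono[:, :-1]).reshape(-1, VOCAB),
               mono[:, 1:].reshape(-1))

    loss = l_mt + l_mlm + l_dae  # equal task weights: an assumption
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Dummy batches stand in for real bitext / monolingual corpora.
src, tgt = torch.randint(2, VOCAB, (8, 16)), torch.randint(2, VOCAB, (8, 16))
mono = torch.randint(2, VOCAB, (8, 16))
print(train_step(src, tgt, mono))
```

In a real MNMT system, the three tasks would draw batches from corpora in many languages, typically under a sampling schedule that balances high-resource and low-resource pairs; the summed loss above only illustrates the shape of the joint objective.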