Paper Title

Reference Language based Unsupervised Neural Machine Translation

Paper Authors

Zuchao Li, Hai Zhao, Rui Wang, Masao Utiyama, Eiichiro Sumita

Paper Abstract

Exploiting a common language as an auxiliary for better translation has a long tradition in machine translation, and it lets supervised learning-based machine translation enjoy the enhancement delivered by a well-used pivot language in the absence of a source-to-target parallel corpus. The rise of unsupervised neural machine translation (UNMT) almost completely relieves the parallel corpus curse, though UNMT is still subject to unsatisfactory performance due to the vagueness of the clues available for its core back-translation training. Further enriching the idea of pivot translation by extending the use of parallel corpora beyond the source-target paradigm, we propose a new reference language-based framework for UNMT, RUNMT, in which the reference language shares a parallel corpus only with the source, but this corpus still provides a signal clear enough to help the reconstruction training of UNMT through a proposed reference agreement mechanism. Experimental results show that our methods improve the quality of UNMT over that of a strong baseline that uses only one auxiliary language, demonstrating the usefulness of the proposed reference language-based UNMT and establishing a good start for the community.
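The key signal here is reference agreement: because the reference language shares a parallel corpus with the source, a source sentence and its reference-language counterpart can both be translated into the target language, and the two resulting translations should agree. The sketch below is not the authors' implementation, only a minimal illustration of one plausible agreement term. It assumes a hypothetical helper `reference_agreement_loss`, per-step target-vocabulary distributions of a fixed decoding length for both paths, and a symmetric KL penalty as the agreement measure.

```python
import torch
import torch.nn.functional as F

def reference_agreement_loss(p_from_src: torch.Tensor,
                             p_from_ref: torch.Tensor) -> torch.Tensor:
    """Symmetric KL divergence between two translation distributions.

    Both tensors have shape (batch, steps, vocab): per-step probabilities
    over the target vocabulary, obtained by decoding into the target
    language once from the source sentence and once from its
    reference-language parallel counterpart.
    """
    log_src = p_from_src.clamp_min(1e-9).log()
    log_ref = p_from_ref.clamp_min(1e-9).log()
    # F.kl_div(input, target) computes KL(target || input), where `input`
    # is log-probabilities and `target` is probabilities.
    kl_sr = F.kl_div(log_ref, p_from_src, reduction="batchmean")  # KL(src || ref)
    kl_rs = F.kl_div(log_src, p_from_ref, reduction="batchmean")  # KL(ref || src)
    return 0.5 * (kl_sr + kl_rs)

# Stand-in distributions; a real system would take these from its decoder.
batch, steps, vocab = 4, 7, 100
p_src = torch.softmax(torch.randn(batch, steps, vocab), dim=-1)
p_ref = torch.softmax(torch.randn(batch, steps, vocab), dim=-1)
print(reference_agreement_loss(p_src, p_ref).item())
```

In a framework like the one the abstract describes, such an agreement term would be added to the usual UNMT objectives (denoising autoencoding and back-translation); the symmetric KL used here is just one assumed instantiation of "agreement," not the paper's exact formulation.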
