Paper Title
Helping the Weak Makes You Strong: Simple Multi-Task Learning Improves Non-Autoregressive Translators
Paper Authors
Paper Abstract
Recently, non-autoregressive (NAR) neural machine translation models have received increasing attention due to their efficient parallel decoding. However, the probabilistic framework of NAR models necessitates a conditional independence assumption on target sequences, falling short of characterizing human language data. This drawback results in less informative learning signals for NAR models under conventional MLE training, thereby yielding unsatisfactory accuracy compared to their autoregressive (AR) counterparts. In this paper, we propose a simple and model-agnostic multi-task learning framework that provides more informative learning signals. During the training stage, we introduce a set of sufficiently weak AR decoders that rely solely on the information provided by the NAR decoder to make predictions, forcing the NAR decoder to become stronger, since otherwise it would be unable to support its weak AR partners. Experiments on the WMT and IWSLT datasets show that our approach consistently improves the accuracy of multiple NAR baselines without adding any decoding overhead.
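To make the training recipe concrete, below is a minimal PyTorch-style sketch of the idea as described in the abstract: weak AR decoders are trained on top of the NAR decoder's hidden states, and their losses augment the usual NAR loss. All module and parameter names here (WeakARDecoder, multitask_loss, nar_hidden, lambda_ar) are illustrative assumptions, not the authors' actual implementation.

```python
import torch
import torch.nn as nn

# Sketch of the multi-task training idea: a NAR decoder produces hidden
# states; several deliberately weak AR decoders predict targets conditioned
# ONLY on those NAR hidden states, and their losses are added to the usual
# NAR loss. Names and hyperparameters below are assumptions for illustration.

class WeakARDecoder(nn.Module):
    """A single-layer AR decoder that attends only to NAR hidden states."""
    def __init__(self, d_model, vocab_size):
        super().__init__()
        self.layer = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
        self.embed = nn.Embedding(vocab_size, d_model)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, prev_tokens, nar_hidden):
        # Causal mask keeps the weak decoder autoregressive on target tokens.
        T = prev_tokens.size(1)
        mask = nn.Transformer.generate_square_subsequent_mask(T).to(prev_tokens.device)
        h = self.layer(self.embed(prev_tokens), nar_hidden, tgt_mask=mask)
        return self.out(h)

def multitask_loss(nar_logits, nar_hidden, targets, weak_decoders, lambda_ar=0.5):
    """NAR cross-entropy plus auxiliary losses from the weak AR partners."""
    ce = nn.CrossEntropyLoss()
    loss = ce(nar_logits.flatten(0, 1), targets.flatten())
    # Teacher-forced inputs for the AR partners: targets shifted right.
    prev = torch.roll(targets, shifts=1, dims=1)
    prev[:, 0] = 0  # assume index 0 is the BOS token
    for dec in weak_decoders:
        ar_logits = dec(prev, nar_hidden)
        loss = loss + lambda_ar * ce(ar_logits.flatten(0, 1), targets.flatten())
    return loss
```

At inference time only the NAR decoder is kept and the weak AR decoders are discarded, which is consistent with the claim that the approach adds no decoding overhead.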