Paper Title

Non-Autoregressive Machine Translation: It's Not as Fast as it Seems

Authors

Jindřich Helcl, Barry Haddow, Alexandra Birch

Abstract

Efficient machine translation models are commercially important as they can increase inference speeds, and reduce costs and carbon emissions. Recently, there has been much interest in non-autoregressive (NAR) models, which promise faster translation. In parallel to the research on NAR models, there have been successful attempts to create optimized autoregressive models as part of the WMT shared task on efficient translation. In this paper, we point out flaws in the evaluation methodology present in the literature on NAR models and we provide a fair comparison between a state-of-the-art NAR model and the autoregressive submissions to the shared task. We make the case for consistent evaluation of NAR models, and also for the importance of comparing NAR models with other widely used methods for improving efficiency. We run experiments with a connectionist-temporal-classification-based (CTC) NAR model implemented in C++ and compare it with AR models using wall clock times. Our results show that, although NAR models are faster on GPUs, with small batch sizes, they are almost always slower under more realistic usage conditions. We call for more realistic and extensive evaluation of NAR models in future work.
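The paper's central methodological point is to compare systems by wall-clock time under realistic batching rather than by reported speedups alone. Below is a minimal Python sketch of such a timing harness, assuming hypothetical `ar_translate` and `nar_translate` placeholder functions standing in for real AR and NAR systems; it illustrates the measurement setup only and is not the authors' actual C++ benchmark.

```python
import time
from typing import Callable, List


def benchmark_wall_clock(
    translate: Callable[[List[str]], List[str]],  # hypothetical translation function
    sentences: List[str],
    batch_size: int,
) -> float:
    """Return total wall-clock seconds to translate all sentences in batches."""
    start = time.perf_counter()
    for i in range(0, len(sentences), batch_size):
        translate(sentences[i : i + batch_size])
    return time.perf_counter() - start


if __name__ == "__main__":
    # Placeholders for real systems; substitute actual model inference calls.
    def ar_translate(batch: List[str]) -> List[str]:
        return ["..."] * len(batch)

    def nar_translate(batch: List[str]) -> List[str]:
        return ["..."] * len(batch)

    test_set = ["ein Beispielsatz ."] * 1024
    # Timing at several batch sizes, since relative speed depends strongly on batching.
    for bs in (1, 16, 128):
        t_ar = benchmark_wall_clock(ar_translate, test_set, bs)
        t_nar = benchmark_wall_clock(nar_translate, test_set, bs)
        print(f"batch={bs:4d}  AR: {t_ar:.2f}s  NAR: {t_nar:.2f}s")
```

Reporting end-to-end wall-clock time at several batch sizes, on both CPU and GPU, is what allows a fair comparison between NAR models and optimized autoregressive systems.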
