Paper Title

Faster Re-translation Using Non-Autoregressive Model For Simultaneous Neural Machine Translation

Authors

Hyojung Han, Sathish Indurthi, Mohd Abbas Zaidi, Nikhil Kumar Lakumarapu, Beomseok Lee, Sangha Kim, Chanwoo Kim, Inchul Hwang

Abstract

Recently, simultaneous translation has gathered a lot of attention since it enables compelling applications such as subtitle translation for a live event or real-time video-call translation. Some of these translation applications allow editing of the partial translation, giving rise to re-translation approaches. The current re-translation approaches are based on autoregressive sequence generation models (ReTA), which generate target tokens in the (partial) translation sequentially. The multiple re-translations with sequential generation in ReTA models lead to an increased inference time gap between the incoming source input and the corresponding target output as the source input grows. Besides, due to the large number of inference operations involved, the ReTA models are not favorable for resource-constrained devices. In this work, we propose a faster re-translation system based on a non-autoregressive sequence generation model (FReTNA) to overcome the aforementioned limitations. We evaluate the proposed model on multiple translation tasks; it reduces inference time by several orders of magnitude and achieves a competitive BLEU score compared to the ReTA and streaming (Wait-k) models. The proposed model reduces the average computation time by a factor of 20 when compared to the ReTA model, while incurring a small drop in translation quality. It also outperforms the streaming-based Wait-k model both in terms of computation time (1.5 times lower) and translation quality.
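To make the two scheduling policies mentioned in the abstract concrete, the following is a minimal sketch (not the paper's implementation). `translate_prefix` is a hypothetical stand-in for a real NMT decoder that simply echoes tokens; `retranslation` models the ReTA-style behavior of re-decoding the full output whenever a new source token arrives, and `wait_k` models the append-only streaming policy that reads k source tokens before emitting. The paper's FReTNA model additionally makes each re-decoding pass non-autoregressive (all target tokens generated in parallel), which this sketch does not show.

```python
def translate_prefix(source_tokens, max_len=None):
    """Hypothetical stand-in for a (partial) translation model: maps each
    source token to a pseudo target token. A real system would run an NMT
    decoder here."""
    out = [f"T({tok})" for tok in source_tokens]
    return out if max_len is None else out[:max_len]

def retranslation(source_stream):
    """Re-translation scheduling (ReTA-style): each time a new source token
    arrives, the *entire* partial translation is regenerated, so earlier
    outputs may be revised. Decoding cost grows with every re-translation,
    which is the inference-time gap the abstract describes."""
    seen, revisions = [], []
    for tok in source_stream:
        seen.append(tok)
        revisions.append(translate_prefix(seen))  # full re-decode each step
    return revisions

def wait_k(source_stream, k=2):
    """Wait-k scheduling: read k source tokens first, then emit one target
    token per additional read. Output is append-only (never revised)."""
    seen, output = [], []
    for tok in source_stream:
        seen.append(tok)
        if len(seen) >= k:
            output.append(translate_prefix(seen, max_len=len(output) + 1)[-1])
    # flush the remaining target tokens once the source has ended
    while len(output) < len(seen):
        output.append(translate_prefix(seen, max_len=len(output) + 1)[-1])
    return output

src = ["ich", "sehe", "den", "hund"]
print(retranslation(src)[-1])  # final (possibly revised) hypothesis
print(wait_k(src, k=2))        # append-only hypothesis
```

Note that `retranslation` performs one full decode per source token (quadratic total work with an autoregressive decoder), whereas `wait_k` emits each token once; FReTNA keeps the revision ability of re-translation but cuts its cost by generating each revision in parallel.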
