Paper Title
Accurate Word Alignment Induction from Neural Machine Translation
Paper Authors
Paper Abstract
Despite its original goal to jointly learn to align and translate, prior research suggests that Transformer captures poor word alignments through its attention mechanism. In this paper, we show that attention weights do capture accurate word alignments and propose two novel word alignment induction methods, Shift-Att and Shift-AET. The main idea is to induce alignments at the step when the to-be-aligned target token is the decoder input, rather than when it is the decoder output as in previous work. Shift-Att is an interpretation method that induces alignments from the attention weights of Transformer and requires no parameter updates or architecture changes. Shift-AET extracts alignments from an additional alignment module that is tightly integrated into Transformer and trained in isolation with supervision from symmetrized Shift-Att alignments. Experiments on three publicly available datasets demonstrate that both methods perform better than their corresponding neural baselines, and that Shift-AET significantly outperforms GIZA++ by 1.4-4.8 AER points.
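To make the shifted induction step concrete, here is a minimal NumPy sketch (not the authors' released code). It assumes the cross-attention weights of one chosen decoder layer, already averaged over heads, have been extracted during forced decoding; the layer choice and the hard argmax link extraction are simplifications of the paper's full Shift-Att procedure.

```python
import numpy as np

def shift_att_alignments(attn):
    """Induce word alignments from Transformer cross-attention.

    Shift-Att's key observation: target token y_i should be aligned
    at decoding step i+1, i.e. when y_i enters the decoder as *input*
    (teacher forcing shifts targets right by one), not at step i when
    it is the output being predicted.

    attn: (tgt_len, src_len) array; attn[t, s] is the cross-attention
          weight at decoding step t over source position s (assumed
          averaged over the heads of one chosen decoder layer).
    Returns a set of (src_pos, tgt_pos) links, 0-indexed.
    """
    tgt_len = attn.shape[0]
    links = set()
    # The final step has no successor step, so the last token (EOS) is skipped.
    for i in range(tgt_len - 1):
        j = int(np.argmax(attn[i + 1]))  # source position attended at step i+1
        links.add((j, i))
    return links

# Toy usage: 3 decoding steps attending over 2 source tokens.
attn = np.array([[0.9, 0.1],
                 [0.2, 0.8],
                 [0.7, 0.3]])
print(shift_att_alignments(attn))  # {(1, 0), (0, 1)}
```

Running this on both translation directions and symmetrizing the resulting link sets would yield the supervision signal the abstract describes for training the Shift-AET alignment module.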