Paper Title

Is it Great or Terrible? Preserving Sentiment in Neural Machine Translation of Arabic Reviews

Authors

Hadeel Saadany, Constantin Orasan

Abstract

Since the advent of Neural Machine Translation (NMT) approaches, there has been a tremendous improvement in the quality of automatic translation. However, NMT output still lacks accuracy in some low-resource languages and sometimes makes major errors that need extensive post-editing. This is particularly noticeable with texts that do not follow common lexico-grammatical standards, such as user-generated content (UGC). In this paper, we investigate the challenges involved in translating book reviews from Arabic into English, with particular focus on the errors that lead to incorrect translation of sentiment polarity. Our study points to the special characteristics of Arabic UGC, examines the sentiment transfer errors made by Google Translate when translating Arabic UGC into English, analyzes why the problem occurs, and proposes an error typology specific to the translation of Arabic UGC. Our analysis shows that the output of online translation tools for Arabic UGC can either fail to transfer the sentiment at all by producing a neutral target text, or completely flip the sentiment polarity of the target word or phrase and hence deliver a wrong affect message. We address this problem by fine-tuning an NMT model with respect to sentiment polarity, showing that this approach can significantly help with correcting sentiment errors detected in the online translation of Arabic UGC.
