Paper title
Neural Machine Translation Doesn't Translate Gender Coreference Right Unless You Make It
Paper authors
Paper abstract
Neural Machine Translation (NMT) has been shown to struggle with grammatical gender that is dependent on the gender of human referents, which can cause gender bias effects. Many existing approaches to this problem seek to control gender inflection in the target language by explicitly or implicitly adding a gender feature to the source sentence, usually at the sentence level. In this paper we propose schemes for incorporating explicit word-level gender inflection tags into NMT. We explore the potential of this gender-inflection controlled translation when the gender feature can be determined from a human reference, or when a test sentence can be automatically gender-tagged, assessing on English-to-Spanish and English-to-German translation. We find that simple existing approaches can over-generalize a gender-feature to multiple entities in a sentence, and suggest effective alternatives in the form of tagged coreference adaptation data. We also propose an extension to assess translations of gender-neutral entities from English given a corresponding linguistic convention, such as a non-binary inflection, in the target language.
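The contrast the abstract draws between sentence-level and word-level gender tags can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `<F>` tag format, the function names `tag_sentence_level` and `tag_word_level`, and the example sentence are hypothetical and are not taken from the paper's actual tagging scheme.

```python
from typing import Dict, List


def tag_sentence_level(tokens: List[str], gender: str) -> List[str]:
    """Prepend one gender tag that applies to the whole source sentence.

    Hypothetical format: the model sees a single feature and may apply it
    to every human entity in the sentence.
    """
    return [f"<{gender}>"] + tokens


def tag_word_level(tokens: List[str], genders: Dict[int, str]) -> List[str]:
    """Insert a gender tag directly after each annotated token.

    `genders` maps a token index to a gender label, so only the entities
    that were explicitly annotated carry an inflection tag.
    """
    tagged: List[str] = []
    for i, tok in enumerate(tokens):
        tagged.append(tok)
        if i in genders:
            tagged.append(f"<{genders[i]}>")
    return tagged


tokens = "the doctor asked the nurse to help her".split()

# One tag for the whole sentence: nothing says which entity it refers to.
sentence_tagged = tag_sentence_level(tokens, "F")

# One tag attached to "doctor" (index 1) only; "nurse" is left untagged.
word_tagged = tag_word_level(tokens, {1: "F"})
```

In the sentence-level form, the single tag gives the model no indication of which entity it refers to, which is the over-generalization risk the abstract describes. In the word-level form, the tag is attached only to the annotated entity.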