Paper title
Who Are We Talking About? Handling Person Names in Speech Translation
Paper authors
Paper abstract
Recent work has shown that speech translation (ST) systems, like automatic speech recognition (ASR) systems, handle person names poorly. This shortcoming not only leads to errors that can seriously distort the meaning of the input, but also hinders the adoption of such systems in application scenarios (like computer-assisted interpreting) where the translation of named entities, such as person names, is crucial. In this paper, we first analyse the outputs of ASR/ST systems to identify the reasons for failures in person name transcription/translation. Besides the frequency in the training data, we pinpoint the nationality of the referred person as a key factor. We then mitigate the problem by creating multilingual models, and further improve our ST systems by forcing them to jointly generate transcripts and translations, prioritising the former over the latter. Overall, our solutions result in a relative improvement in token-level person name accuracy of 47.8% on average across three language pairs (en->es, fr, it).
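To make the reported metric concrete, the following is a minimal sketch of how a token-level person name accuracy and a relative improvement could be computed. It is not the authors' evaluation code. It assumes that accuracy is the share of reference person-name tokens that also appear in the system output, matched case-sensitively on whitespace tokens. The function names and the toy sentences are hypothetical.

```python
from collections import Counter


def name_token_accuracy(hypotheses, name_tokens_per_sentence):
    """Share of reference person-name tokens that also occur in the output.

    Assumption (not from the paper): case-sensitive matching on
    whitespace-separated tokens, where each output token can match
    at most one reference name token.
    """
    matched = 0
    total = 0
    for hyp, name_tokens in zip(hypotheses, name_tokens_per_sentence):
        available = Counter(hyp.split())  # output tokens still unmatched
        for token in name_tokens:
            total += 1
            if available[token] > 0:
                available[token] -= 1
                matched += 1
    return matched / total if total else 0.0


def relative_improvement(baseline, improved):
    """Relative gain of `improved` over `baseline`, in percent."""
    return 100.0 * (improved - baseline) / baseline


# Toy data, invented purely for illustration.
reference_names = [["Juncker"], ["Angela", "Merkel"]]
baseline_outputs = ["thank you Mister Junker", "I agree with Angela Merkel"]
improved_outputs = ["thank you Mister Juncker", "I agree with Angela Merkel"]

baseline_acc = name_token_accuracy(baseline_outputs, reference_names)
improved_acc = name_token_accuracy(improved_outputs, reference_names)
print(f"baseline accuracy: {baseline_acc:.3f}")
print(f"improved accuracy: {improved_acc:.3f}")
print(f"relative improvement: {relative_improvement(baseline_acc, improved_acc):.1f}%")
```

A relative improvement is measured against the baseline score, so it differs from the absolute gain in accuracy points. The 47.8% figure in the abstract is of this relative kind, averaged over the three language pairs.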