Paper Title

Cascaded Models With Cyclic Feedback For Direct Speech Translation

Authors

Tsz Kin Lam, Shigehiko Schamoni, Stefan Riezler

Abstract

Direct speech translation describes a scenario where only speech inputs and corresponding translations are available. Such data are notoriously limited. We present a technique that allows cascades of automatic speech recognition (ASR) and machine translation (MT) to exploit in-domain direct speech translation data in addition to out-of-domain MT and ASR data. After pre-training MT and ASR, we use a feedback cycle where the downstream performance of the MT system is used as a signal to improve the ASR system by self-training, and the MT component is fine-tuned on multiple ASR outputs, making it more tolerant towards spelling variations. A comparison to end-to-end speech translation using components of identical architecture and the same data shows gains of up to 3.8 BLEU points on LibriVoxDeEn and up to 5.1 BLEU points on CoVoST for German-to-English speech translation.
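
The feedback cycle described in the abstract can be illustrated with a small sketch. This is one possible reading, not the authors' implementation, and it rests on several assumptions: that the ASR system emits an n-best list per utterance, that the "downstream signal" is a sentence-level score of each hypothesis's translation against the reference translation, and that the best-scoring hypothesis becomes the pseudo-transcript for ASR self-training. The names `overlap_f1` and `feedback_round` are hypothetical, and the toy unigram F1 stands in for a real metric such as BLEU.

```python
from collections import Counter
from typing import Callable, List, Tuple


def overlap_f1(hypothesis: str, reference: str) -> float:
    """Toy unigram F1 (an assumption; stands in for a sentence-level MT metric)."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp or not ref:
        return 0.0
    common = sum((Counter(hyp) & Counter(ref)).values())
    if common == 0:
        return 0.0
    precision, recall = common / len(hyp), common / len(ref)
    return 2 * precision * recall / (precision + recall)


def feedback_round(
    nbest: List[str],
    translate: Callable[[str], str],
    reference: str,
) -> Tuple[str, List[Tuple[str, str]]]:
    """Process one (speech, translation) pair, given the ASR n-best list.

    Returns:
      - the hypothesis whose translation scores best against the reference,
        to be used as a pseudo-transcript for ASR self-training (assumed);
      - (ASR hypothesis, reference translation) pairs for MT fine-tuning,
        exposing the MT model to several transcript variants.
    """
    scored = [(overlap_f1(translate(h), reference), h) for h in nbest]
    _, best_hyp = max(scored, key=lambda item: item[0])
    mt_pairs = [(h, reference) for h in nbest]
    return best_hyp, mt_pairs


# Toy usage: a lookup table plays the role of the MT system.
toy_mt = {
    "das haus ist alt": "the house is old",
    "das haus isst alt": "the house eats old",
}
nbest = ["das haus isst alt", "das haus ist alt"]
best, pairs = feedback_round(nbest, lambda s: toy_mt[s], "the house is old")
```

In this toy case the second hypothesis is selected because its translation matches the reference exactly, while both hypotheses are kept as MT fine-tuning inputs. A real system would repeat this over the in-domain speech translation data and retrain both components between rounds.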
