Paper Title

Vega-MT: The JD Explore Academy Translation System for WMT22

Paper Authors

Changtong Zan, Keqin Peng, Liang Ding, Baopu Qiu, Boan Liu, Shwai He, Qingyu Lu, Zheng Zhang, Chuang Liu, Weifeng Liu, Yibing Zhan, Dacheng Tao

Paper Abstract

We describe the JD Explore Academy's submission to the WMT 2022 general translation shared task. We participated in all high-resource tracks and one medium-resource track, covering Chinese-English, German-English, Czech-English, Russian-English, and Japanese-English. We push the limits of our previous work, bidirectional training for translation, by scaling up two main factors, namely language pairs and model size, resulting in the Vega-MT system. As for language pairs, we extend the "bidirectional" setting to a "multidirectional" one covering all participating languages, in order to exploit common knowledge across languages and transfer it to the downstream bilingual tasks. As for model size, we scale Transformer-Big up to an extremely large model with nearly 4.7 billion parameters, to fully enhance the capacity of Vega-MT. We also adopt data augmentation strategies, e.g., cycle translation for monolingual data and bidirectional self-training for bilingual and monolingual data, to comprehensively exploit both kinds of data. To adapt Vega-MT to the general-domain test set, we design a generalization-tuning stage. According to the official automatic scores of constrained systems, in terms of sacreBLEU (shown in Figure 1) we ranked first on {Zh-En (33.5), En-Zh (49.7), De-En (33.7), En-De (37.8), Cs-En (54.9), En-Cs (41.4), and En-Ru (32.7)}, second on {Ru-En (45.1) and Ja-En (25.6)}, and third on {En-Ja (41.5)}; in terms of COMET, we ranked first on {Zh-En (45.1), En-Zh (61.7), De-En (58.0), En-De (63.2), Cs-En (74.7), Ru-En (64.9), En-Ru (69.6), and En-Ja (65.1)}, and second on {En-Cs (95.3) and Ja-En (40.6)}.
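The abstract gives no implementation details, but the multidirectional training it describes can be illustrated with a minimal Python sketch. Everything below is an assumption in the style of standard multilingual NMT, including the `<2xx>` target-language tag format and all function names; it is not the authors' implementation.

```python
# Minimal sketch of multidirectional data pooling (hypothetical names;
# not the authors' released code). Every bilingual corpus contributes
# both translation directions, and a prepended target-language tag
# lets a single shared model serve all of them.

from typing import Dict, Iterator, List, Tuple

def tag(src: str, tgt: str, tgt_lang: str) -> Tuple[str, str]:
    """Prepend a target-language token to steer the shared model."""
    return f"<2{tgt_lang}> {src}", tgt

def multidirectional(
    corpora: Dict[Tuple[str, str], List[Tuple[str, str]]],
) -> Iterator[Tuple[str, str]]:
    """Yield tagged training examples in both directions per language pair."""
    for (l1, l2), examples in corpora.items():
        for src, tgt in examples:
            yield tag(src, tgt, l2)   # l1 -> l2
            yield tag(tgt, src, l1)   # l2 -> l1 (the "bidirectional" half)

# Usage: pool all five WMT22 pairs into one many-to-many corpus.
corpora = {("zh", "en"): [("早上好", "Good morning")]}
print(list(multidirectional(corpora)))
# [('<2en> 早上好', 'Good morning'), ('<2zh> Good morning', '早上好')]
```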

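Likewise, the two augmentation steps the abstract names, cycle translation and bidirectional self-training, can be sketched under their common definitions: round-trip translation of monolingual text, and pseudo-parallel pairs generated in both directions. The `Model.translate` interface below is an assumption for illustration, not an API from the paper.

```python
# Speculative sketch of the two augmentation ideas named in the abstract.
# `Model.translate` is an assumed interface, not the authors' API.

from typing import List, Protocol, Tuple

class Model(Protocol):
    def translate(self, sentences: List[str]) -> List[str]: ...

def cycle_translate(mono: List[str], fwd: Model, bwd: Model) -> List[str]:
    """Round-trip monolingual text (e.g. en -> de -> en) so that noisy
    sentences are rewritten into more fluent, model-style text."""
    return bwd.translate(fwd.translate(mono))

def bidirectional_self_training(
    src_mono: List[str], tgt_mono: List[str],
    src2tgt: Model, tgt2src: Model,
) -> List[Tuple[str, str]]:
    """Build pseudo-parallel pairs in both directions: forward translation
    of source-side monolingual data plus back-translation of target-side
    monolingual data."""
    pairs = list(zip(src_mono, src2tgt.translate(src_mono)))   # forward
    pairs += list(zip(tgt2src.translate(tgt_mono), tgt_mono))  # backward
    return pairs
```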