Paper Title

Hyper-X: A Unified Hypernetwork for Multi-Task Multilingual Transfer

Authors

Ahmet Üstün, Arianna Bisazza, Gosse Bouma, Gertjan van Noord, Sebastian Ruder

Abstract

Massively multilingual models are promising for transfer learning across tasks and languages. However, existing methods are unable to fully leverage training data when it is available in different task-language combinations. To exploit such heterogeneous supervision, we propose Hyper-X, a single hypernetwork that unifies multi-task and multilingual learning with efficient adaptation. This model generates weights for adapter modules conditioned on both tasks and language embeddings. By learning to combine task and language-specific knowledge, our model enables zero-shot transfer for unseen languages and task-language combinations. Our experiments on a diverse set of languages demonstrate that Hyper-X achieves the best or competitive gain when a mixture of multiple resources is available, while being on par with strong baselines in the standard scenario. Hyper-X is also considerably more efficient in terms of parameters and resources compared to methods that train separate adapters. Finally, Hyper-X consistently produces strong results in few-shot scenarios for new languages, showing the versatility of our approach beyond zero-shot transfer.
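The core idea in the abstract — a single hypernetwork that generates bottleneck-adapter weights conditioned on concatenated task and language embeddings, so unseen task-language pairs can be composed at inference time — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the dimensions, the random embeddings, and the single linear hypernetwork are assumptions for demonstration (in the actual model these are learned jointly with a pretrained transformer).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (not taken from the paper).
d_model, d_bottleneck, d_emb = 16, 4, 8

# Task and language embeddings; learned in the real model, random here.
task_emb = {"pos": rng.normal(size=d_emb), "ner": rng.normal(size=d_emb)}
lang_emb = {"en": rng.normal(size=d_emb), "sw": rng.normal(size=d_emb)}

# Hypernetwork: a linear map from the concatenated [task; language]
# embedding to the flattened weights of a bottleneck adapter
# (down-projection followed by up-projection).
n_params = d_model * d_bottleneck + d_bottleneck * d_model
W_hyper = rng.normal(scale=0.02, size=(n_params, 2 * d_emb))

def generate_adapter(task, lang):
    """Generate adapter weights conditioned on a (task, language) pair."""
    source = np.concatenate([task_emb[task], lang_emb[lang]])
    flat = W_hyper @ source
    W_down = flat[: d_model * d_bottleneck].reshape(d_model, d_bottleneck)
    W_up = flat[d_model * d_bottleneck :].reshape(d_bottleneck, d_model)
    return W_down, W_up

def adapter_forward(h, W_down, W_up):
    """Bottleneck adapter with a ReLU non-linearity and residual connection."""
    return h + np.maximum(h @ W_down, 0.0) @ W_up

# Zero-shot combination: ("ner", "sw") need not be seen together in
# training, because its adapter is generated from the two embeddings.
W_down, W_up = generate_adapter("ner", "sw")
h = rng.normal(size=(1, d_model))
out = adapter_forward(h, W_down, W_up)
print(out.shape)  # (1, 16)
```

Because every (task, language) adapter is produced by one shared hypernetwork, the parameter count stays fixed as task-language combinations grow, which is the efficiency advantage the abstract claims over training separate adapters per pair.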
