Paper Title

Hyper-X: A Unified Hypernetwork for Multi-Task Multilingual Transfer

Authors

Ahmet Üstün, Arianna Bisazza, Gosse Bouma, Gertjan van Noord, Sebastian Ruder

Abstract

Massively multilingual models are promising for transfer learning across tasks and languages. However, existing methods are unable to fully leverage training data when it is available in different task-language combinations. To exploit such heterogeneous supervision, we propose Hyper-X, a single hypernetwork that unifies multi-task and multilingual learning with efficient adaptation. This model generates weights for adapter modules conditioned on both tasks and language embeddings. By learning to combine task and language-specific knowledge, our model enables zero-shot transfer for unseen languages and task-language combinations. Our experiments on a diverse set of languages demonstrate that Hyper-X achieves the best or competitive gain when a mixture of multiple resources is available, while being on par with strong baselines in the standard scenario. Hyper-X is also considerably more efficient in terms of parameters and resources compared to methods that train separate adapters. Finally, Hyper-X consistently produces strong results in few-shot scenarios for new languages, showing the versatility of our approach beyond zero-shot transfer.
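The core idea in the abstract — a single hypernetwork that generates bottleneck-adapter weights conditioned on concatenated task and language embeddings, so unseen task-language pairs can be composed at inference time — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the dimensions, the random embeddings, and the single linear hypernetwork are assumptions for demonstration (in the actual model these are learned jointly with a pretrained transformer).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (not taken from the paper).
d_model, d_bottleneck, d_emb = 16, 4, 8

# Task and language embeddings; learned in the real model, random here.
task_emb = {"pos": rng.normal(size=d_emb), "ner": rng.normal(size=d_emb)}
lang_emb = {"en": rng.normal(size=d_emb), "sw": rng.normal(size=d_emb)}

# Hypernetwork: a linear map from the concatenated [task; language]
# embedding to the flattened weights of a bottleneck adapter
# (down-projection followed by up-projection).
n_params = d_model * d_bottleneck + d_bottleneck * d_model
W_hyper = rng.normal(scale=0.02, size=(n_params, 2 * d_emb))

def generate_adapter(task, lang):
    """Generate adapter weights conditioned on a (task, language) pair."""
    source = np.concatenate([task_emb[task], lang_emb[lang]])
    flat = W_hyper @ source
    W_down = flat[: d_model * d_bottleneck].reshape(d_model, d_bottleneck)
    W_up = flat[d_model * d_bottleneck :].reshape(d_bottleneck, d_model)
    return W_down, W_up

def adapter_forward(h, W_down, W_up):
    """Bottleneck adapter with a ReLU non-linearity and residual connection."""
    return h + np.maximum(h @ W_down, 0.0) @ W_up

# Zero-shot combination: ("ner", "sw") need not be seen together in
# training, because its adapter is generated from the two embeddings.
W_down, W_up = generate_adapter("ner", "sw")
h = rng.normal(size=(1, d_model))
out = adapter_forward(h, W_down, W_up)
print(out.shape)  # (1, 16)
```

Because every (task, language) adapter is produced by one shared hypernetwork, the parameter count stays fixed as task-language combinations grow, which is the efficiency advantage the abstract claims over training separate adapters per pair.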
