Paper Title
Exploring Versatile Generative Language Model Via Parameter-Efficient Transfer Learning
Paper Authors
Paper Abstract
Fine-tuning pre-trained generative language models to down-stream language generation tasks has shown promising results. However, this comes with the cost of having a single, large model for each task, which is not ideal in low-memory/power scenarios (e.g., mobile). In this paper, we propose an effective way to fine-tune multiple down-stream generation tasks simultaneously using a single, large pre-trained model. The experiments on five diverse language generation tasks show that by just using an additional 2-3% parameters for each task, our model can maintain or even improve the performance of fine-tuning the whole model.
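The abstract does not spell out how the extra 2-3% of parameters per task are attached, so the following is only a minimal sketch, assuming a residual bottleneck adapter placed on top of a frozen, shared pre-trained backbone (PyTorch; the class name, layer sizes, and the GPT-2-small figures in the comments are illustrative assumptions, not the authors' exact configuration).

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Hypothetical per-task adapter: a small residual bottleneck trained
    while the large pre-trained backbone stays frozen and shared."""

    def __init__(self, hidden_size: int, bottleneck_size: int):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)  # project down
        self.up = nn.Linear(bottleneck_size, hidden_size)    # project back up
        self.act = nn.ReLU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Residual connection: the frozen backbone's representation passes
        # through unchanged, plus a small learned task-specific correction.
        return hidden_states + self.up(self.act(self.down(hidden_states)))

# Illustrative sizes (assumption): with hidden_size=768 and bottleneck_size=32,
# one adapter holds 2*768*32 + 768 + 32 = 49,952 trainable parameters,
# a tiny fraction of a GPT-2-small-sized backbone (~124M parameters).
adapter = BottleneckAdapter(hidden_size=768, bottleneck_size=32)
x = torch.randn(2, 10, 768)  # (batch, sequence_length, hidden_size)
print(adapter(x).shape)      # torch.Size([2, 10, 768])
```

In a setup like this, one such module would typically be inserted in each transformer layer, and only the adapters are updated per task while the backbone is reused across all tasks, which is what keeps the per-task parameter overhead to a few percent.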