Paper Title
S-Prompts Learning with Pre-trained Transformers: An Occam's Razor for Domain Incremental Learning
Paper Authors
Paper Abstract
State-of-the-art deep neural networks still struggle to address the catastrophic forgetting problem in continual learning. In this paper, we propose one simple paradigm (named S-Prompting) and two concrete approaches to greatly reduce forgetting in one of the most typical continual learning scenarios, i.e., domain incremental learning (DIL). The key idea of the paradigm is to learn prompts independently across domains with pre-trained transformers, avoiding the use of exemplars that commonly appear in conventional methods. This results in a win-win game where the prompting can achieve the best for each domain. The independent prompting across domains requires only a single cross-entropy loss for training and a simple K-NN operation as a domain identifier for inference. The learning paradigm derives an image prompt learning approach and a novel language-image prompt learning approach. With excellent scalability (a 0.03% parameter increase per domain), the best of our approaches achieves a remarkable relative improvement (an average of about 30%) over the best of the state-of-the-art exemplar-free methods on three standard DIL tasks, and even surpasses the best of them by about 6% relative on average when they use exemplars. Source code is available at \url{https://github.com/iamwangyabin/S-Prompts}.
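The inference scheme described in the abstract — keep one independently learned prompt per domain, identify a test sample's domain with a simple K-NN over stored feature centroids, then apply that domain's prompt — can be sketched as follows. This is a minimal illustration, not the authors' implementation: all names are hypothetical, and for brevity each domain is summarized by a single mean feature, whereas the paper derives multiple centroids per domain via K-Means over frozen pre-trained transformer features.

```python
import numpy as np

class SPromptsInference:
    """Illustrative sketch of S-Prompts-style inference: each domain keeps
    its own prompt, and a test sample is routed to the prompt of the domain
    whose stored centroid is nearest (a 1-NN domain identifier)."""

    def __init__(self):
        self.centroids = []  # per-domain feature centroids (paper: K-Means centers)
        self.prompts = []    # one independently learned prompt per domain

    def add_domain(self, domain_features, prompt):
        # Simplification: one centroid = mean of the domain's pre-trained
        # features. The paper stores K centroids per domain instead.
        self.centroids.append(domain_features.mean(axis=0, keepdims=True))
        self.prompts.append(prompt)

    def select_prompt(self, test_feature):
        # Nearest-centroid search acts as the domain identifier; the chosen
        # domain's prompt is then fed to the frozen transformer.
        dists = [np.linalg.norm(c - test_feature, axis=1).min()
                 for c in self.centroids]
        domain_id = int(np.argmin(dists))
        return domain_id, self.prompts[domain_id]
```

A usage sketch: after training on two domains, a test feature drawn near the second domain's cluster is routed to that domain's prompt, without any exemplar replay.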