Paper Title

Task-Customized Self-Supervised Pre-training with Scalable Dynamic Routing

Authors

Zhili Liu, Jianhua Han, Lanqing Hong, Hang Xu, Kai Chen, Chunjing Xu, Zhenguo Li

Abstract

Self-supervised learning (SSL), especially contrastive methods, has attracted attention recently because it learns effective, transferable representations without semantic annotations. A common practice in self-supervised pre-training is to use as much data as possible. For a specific downstream task, however, including irrelevant data in pre-training may degrade downstream performance, as our extensive experiments show. On the other hand, with existing SSL methods it is burdensome and infeasible to pre-train on a different downstream-task-customized dataset for each task. To address this issue, we propose a novel SSL paradigm called Scalable Dynamic Routing (SDR), which is trained once and can then be deployed efficiently to different downstream tasks with task-customized pre-trained models. Specifically, we construct the SDRnet with various sub-nets and train each sub-net on only one subset of the data through data-aware progressive training. When a downstream task arrives, we route among all the pre-trained sub-nets to select the best one along with its corresponding weights. Experimental results show that SDR can train 256 sub-nets on ImageNet simultaneously and provides better transfer performance than a unified model trained on the full ImageNet, achieving state-of-the-art (SOTA) average accuracy over 11 downstream classification tasks and AP on the PASCAL VOC detection task.
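
The routing step described in the abstract (choosing the best pre-trained sub-net for a downstream task) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how candidate sub-nets are scored, so the function name `route_to_best_subnet`, the `score_fn` callback, and the toy scores below are all hypothetical, and a simple exhaustive search over the candidates is assumed.

```python
from typing import Callable, Dict, Tuple, TypeVar

SubNet = TypeVar("SubNet")


def route_to_best_subnet(
    subnets: Dict[str, SubNet],
    score_fn: Callable[[SubNet], float],
) -> Tuple[str, SubNet, float]:
    """Return (name, sub-net, score) of the highest-scoring candidate.

    Assumption: `score_fn` is a user-supplied proxy for downstream
    performance (higher is better). The abstract does not specify it.
    """
    if not subnets:
        raise ValueError("at least one sub-net is required")
    # Score every candidate once, then keep the argmax.
    scores = {name: score_fn(net) for name, net in subnets.items()}
    best_name = max(scores, key=scores.get)
    return best_name, subnets[best_name], scores[best_name]


if __name__ == "__main__":
    # Toy stand-ins: each "sub-net" is just its made-up downstream score.
    toy_subnets = {"subnet_a": 0.61, "subnet_b": 0.74, "subnet_c": 0.68}
    name, _, score = route_to_best_subnet(toy_subnets, score_fn=lambda s: s)
    print(name, score)
```

In practice `score_fn` would evaluate a real sub-net on data from the downstream task; the sketch only shows the selection logic, under the assumption that every candidate can be scored independently.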
