Revisiting Weakly Supervised Pre-Training of Visual Perception Models

Paper Title

Revisiting Weakly Supervised Pre-Training of Visual Perception Models

Authors

Mannat Singh, Laura Gustafson, Aaron Adcock, Vinicius de Freitas Reis, Bugra Gedik, Raj Prateek Kosaraju, Dhruv Mahajan, Ross Girshick, Piotr Dollár, Laurens van der Maaten

Abstract

Model pre-training is a cornerstone of modern visual recognition systems. Although fully supervised pre-training on datasets like ImageNet is still the de-facto standard, recent studies suggest that large-scale weakly supervised pre-training can outperform fully supervised approaches. This paper revisits weakly-supervised pre-training of models using hashtag supervision with modern versions of residual networks and the largest-ever dataset of images and corresponding hashtags. We study the performance of the resulting models in various transfer-learning settings including zero-shot transfer. We also compare our models with those obtained via large-scale self-supervised learning. We find our weakly-supervised models to be very competitive across all settings, and find they substantially outperform their self-supervised counterparts. We also include an investigation into whether our models learned potentially troubling associations or stereotypes. Overall, our results provide a compelling argument for the use of weakly supervised learning in the development of visual recognition systems. Our models, Supervised Weakly through hashtAGs (SWAG), are available publicly.
