Paper Title

Weakly Supervised Learning with Side Information for Noisy Labeled Images

Authors

Cheng, Lele; Zhou, Xiangzeng; Zhao, Liming; Li, Dangwei; Shang, Hong; Zheng, Yun; Pan, Pan; Xu, Yinghui

Abstract

In many real-world datasets, such as WebVision, the performance of DNN-based classifiers is often limited by noisy labeled data. To tackle this problem, we exploit image-related side information, such as captions and tags, which often reveals underlying relationships across images. In this paper, we present an efficient weakly supervised learning method based on a Side Information Network (SINet), which aims to carry out large-scale classification effectively under severely noisy labels. The proposed SINet consists of a visual prototype module and a noise weighting module. The visual prototype module is designed to generate a compact representation for each category by introducing the side information. The noise weighting module aims to estimate the correctness of each noisy image and to produce a confidence score for image ranking during training. The proposed SINet can largely alleviate the negative impact of noisy image labels and is beneficial for training a high-performance CNN-based classifier. In addition, we release a fine-grained product dataset called AliProducts, which contains more than 2.5 million noisy web images crawled from the internet using queries generated from 50,000 fine-grained semantic classes. Extensive experiments on several popular benchmarks (i.e., WebVision, ImageNet and Clothing-1M) and on our proposed AliProducts show that SINet achieves state-of-the-art performance. SINet won first place in the classification task of the WebVision Challenge 2019, outperforming other competitors by a large margin.
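The abstract does not spell out how the two modules are computed, so the following is only a rough illustration of the general idea, not the SINet implementation. It assumes (hypothetically) that a class prototype is the mean of the normalized features of images whose caption mentions the class name, and that a confidence score is the cosine similarity of an image feature to that prototype, clipped to [0, 1]. The function names, the caption-matching rule, and the toy data are all invented for this sketch.

```python
import math


def normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def class_prototype(features, captions, class_name):
    """Hypothetical prototype: mean of normalized features of 'seed' images,
    i.e. images whose caption (the side information) mentions the class name.
    Falls back to all images of the class if no caption matches."""
    seeds = [normalize(f) for f, c in zip(features, captions)
             if class_name.lower() in c.lower()]
    if not seeds:
        seeds = [normalize(f) for f in features]
    dim = len(seeds[0])
    mean = [sum(s[i] for s in seeds) / len(seeds) for i in range(dim)]
    return normalize(mean)


def confidence_scores(features, prototype):
    """Hypothetical noise weight: cosine similarity to the prototype,
    clipped to the range [0, 1]."""
    scores = []
    for f in features:
        cos = sum(a * b for a, b in zip(normalize(f), prototype))
        scores.append(min(1.0, max(0.0, cos)))
    return scores


# Toy example: three images all labeled "shoe" by a noisy web query.
features = [[1.0, 0.1], [0.9, 0.2], [-0.1, 1.0]]
captions = ["red running shoe", "shoe on sale", "kitchen table"]

proto = class_prototype(features, captions, "shoe")
weights = confidence_scores(features, proto)

# Rank images from most to least trustworthy for this label.
ranking = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
```

In a training loop, scores like `weights` could be used to down-weight the per-sample loss of images that look unlike the rest of their class, or to rank and filter them; how SINet actually uses its confidence scores is described in the paper itself rather than in this abstract.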
