论文标题
W2N:从弱监督到嘈杂的对象检测的嘈杂监督
W2N:Switching From Weak Supervision to Noisy Supervision for Object Detection
论文作者
论文摘要
弱监督对象检测(WSOD)旨在仅训练需要图像级注释的对象检测器。最近,一些作品设法选择了从训练有素的WSOD网络生成的准确框,以监督半监督的检测框架以提高性能。但是,这些方法只需根据图像级标准将设置的训练分为标记和未标记的集合,以便选择足够标记或错误的局部盒子预测作为伪基真实性,从而导致检测性能的次优溶液解决方案。为了克服这个问题,我们提出了一个新颖的WSOD框架,其新范式从弱监督到嘈杂的监督(W2N)。通常,通过训练有素的WSOD网络产生的给定的伪基真实,我们提出了一种两模块迭代训练算法来完善伪标签并逐步监督更好的对象探测器。在定位适应模块中,我们提出正规化损失,以减少原始伪基真实性中判别零件的比例,从而获得更好的伪基真实性以进行进一步培训。在半监督的模块中,我们提出了两个任务实例级拆分方法,以选择用于训练半监督检测器的高质量标签。不同基准的实验结果验证了W2N的有效性,我们的W2N优于所有现有的纯WSOD方法和转移学习方法。我们的代码可在https://github.com/1170300714/w2n_wsod上公开获取。
Weakly-supervised object detection (WSOD) aims to train an object detector only requiring the image-level annotations. Recently, some works have managed to select the accurate boxes generated from a well-trained WSOD network to supervise a semi-supervised detection framework for better performance. However, these approaches simply divide the training set into labeled and unlabeled sets according to the image-level criteria, such that sufficient mislabeled or wrongly localized box predictions are chosen as pseudo ground-truths, resulting in a sub-optimal solution of detection performance. To overcome this issue, we propose a novel WSOD framework with a new paradigm that switches from weak supervision to noisy supervision (W2N). Generally, with given pseudo ground-truths generated from the well-trained WSOD network, we propose a two-module iterative training algorithm to refine pseudo labels and supervise better object detector progressively. In the localization adaptation module, we propose a regularization loss to reduce the proportion of discriminative parts in original pseudo ground-truths, obtaining better pseudo ground-truths for further training. In the semi-supervised module, we propose a two tasks instance-level split method to select high-quality labels for training a semi-supervised detector. Experimental results on different benchmarks verify the effectiveness of W2N, and our W2N outperforms all existing pure WSOD methods and transfer learning methods. Our code is publicly available at https://github.com/1170300714/w2n_wsod.