Paper Title

Smooth Proxy-Anchor Loss for Noisy Metric Learning

Authors

Carlos Roig, David Varas, Issey Masuda, Juan Carlos Riveiro, Elisenda Bou-Balust

Abstract

Many industrial applications use Metric Learning to circumvent scalability issues when designing systems with a large number of classes. As a result, the field is attracting considerable interest from both academic and non-academic communities. Such applications require large-scale datasets, which are usually built from web data and therefore often contain many noisy labels. Although Metric Learning systems are sensitive to noisy labels, the literature rarely addresses the problem, since it relies on manually annotated datasets. In this work, we propose a Metric Learning method that overcomes the presence of noisy labels using our novel Smooth Proxy-Anchor Loss. We also present an architecture that uses this loss in a two-phase learning procedure. First, we train a confidence module that computes sample class confidences. Second, these confidences are used to weight the influence of each sample when training the embeddings. The result is a system that provides robust sample embeddings. We compare the performance of the described method with current state-of-the-art Metric Learning losses (proxy-based and pair-based) when trained on a dataset containing noisy labels. The results show an improvement of 2.63 and 3.29 in Recall@1 with respect to MultiSimilarity and Proxy-Anchor Loss respectively, showing that our method outperforms the state of the art in Metric Learning under noisy labeling conditions.
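The abstract does not give the formula for the Smooth Proxy-Anchor Loss, so the following is only a rough sketch of the second phase under stated assumptions. It starts from the standard Proxy-Anchor loss and replaces the hard positive/negative membership of each sample with a per-class confidence, so that a sample with low confidence in its label contributes less. The function name `confidence_weighted_proxy_anchor`, the weighting scheme (confidence for the positive term, one minus confidence for the negative term), and the values of `alpha` and `delta` are illustrative assumptions, not the paper's definition.

```python
import numpy as np

def confidence_weighted_proxy_anchor(embeddings, proxies, confidences,
                                     alpha=32.0, delta=0.1):
    """Hypothetical confidence-weighted variant of the Proxy-Anchor loss.

    embeddings:  (N, D) sample embeddings
    proxies:     (C, D) one proxy per class
    confidences: (N, C) class confidences in [0, 1] (assumed to come from
                 a separately trained confidence module)
    """
    # Cosine similarity between every sample and every proxy.
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    p = proxies / np.linalg.norm(proxies, axis=1, keepdims=True)
    sim = e @ p.T  # shape (N, C)

    # Assumption: soft memberships replace the hard positive/negative sets.
    pos_w = confidences
    neg_w = 1.0 - confidences

    pos_sum = (pos_w * np.exp(-alpha * (sim - delta))).sum(axis=0)  # (C,)
    neg_sum = (neg_w * np.exp(alpha * (sim + delta))).sum(axis=0)   # (C,)

    # Average the positive term over proxies that have any positive mass,
    # and the negative term over all proxies.
    has_pos = pos_w.sum(axis=0) > 0
    n_pos = max(int(has_pos.sum()), 1)
    pos_term = np.log1p(pos_sum[has_pos]).sum() / n_pos
    neg_term = np.log1p(neg_sum).sum() / proxies.shape[0]
    return float(pos_term + neg_term)

# Toy example: two classes, three samples. Sample 2 sits on the class-0
# proxy but carries the (noisy) label 1.
proxies = np.array([[1.0, 0.0], [0.0, 1.0]])
embeddings = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])

hard_conf = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])  # trusts the noisy label
soft_conf = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])  # distrusts it

hard_loss = confidence_weighted_proxy_anchor(embeddings, proxies, hard_conf)
soft_loss = confidence_weighted_proxy_anchor(embeddings, proxies, soft_conf)
print(hard_loss, soft_loss)
```

With one-hot confidences the sketch reduces to the ordinary Proxy-Anchor loss, while in the toy example down-weighting the suspicious label shrinks the penalty that the mislabeled sample would otherwise impose.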
