Paper Title
Introspective Deep Metric Learning for Image Retrieval
Paper Authors
Paper Abstract
This paper proposes an introspective deep metric learning (IDML) framework for uncertainty-aware comparisons of images. Conventional deep metric learning methods produce confident semantic distances between images regardless of the uncertainty level. However, we argue that a good similarity model should consider the semantic discrepancies with caution to better deal with ambiguous images for more robust training. To achieve this, we propose to represent an image using not only a semantic embedding but also an accompanying uncertainty embedding, which describes the semantic characteristics and ambiguity of an image, respectively. We further propose an introspective similarity metric to make similarity judgments between images considering both their semantic differences and ambiguities. The proposed IDML framework improves the performance of deep metric learning through uncertainty modeling and attains state-of-the-art results on the widely used CUB-200-2011, Cars196, and Stanford Online Products datasets for image retrieval and clustering. We further provide an in-depth analysis of our framework to demonstrate the effectiveness and reliability of IDML. Code is available at: https://github.com/wzzheng/IDML.
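The abstract describes each image as a pair of embeddings (semantic and uncertainty) and a similarity judgment that accounts for both. The sketch below illustrates that idea in its simplest form. The abstract does not state the formula, so the combination rule used here (semantic distance plus the norm of the summed uncertainty embeddings) is an assumption made for illustration. The names `l2_norm` and `introspective_distance` are hypothetical, and the toy vectors are made up. The authors' repository is the reference for the actual definition.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a plain Python sequence of floats."""
    return math.sqrt(sum(v * v for v in vec))


def introspective_distance(sem_a, unc_a, sem_b, unc_b):
    """Illustrative uncertainty-aware distance between two images.

    Each image is given as a semantic embedding (sem_*) and an
    uncertainty embedding (unc_*). The combination rule is an
    ASSUMPTION for illustration, not taken from the abstract.
    """
    # Semantic discrepancy: how far apart the semantic embeddings are.
    semantic_gap = l2_norm([a - b for a, b in zip(sem_a, sem_b)])
    # Ambiguity of the pair: grows as either uncertainty embedding grows.
    ambiguity = l2_norm([a + b for a, b in zip(unc_a, unc_b)])
    return semantic_gap + ambiguity


# Toy example: the same semantic embeddings, with and without ambiguity.
clear = introspective_distance([1.0, 0.0], [0.0, 0.0], [0.0, 1.0], [0.0, 0.0])
ambiguous = introspective_distance([1.0, 0.0], [0.3, 0.4], [0.0, 1.0], [0.0, 0.0])
```

With identical semantic embeddings, the pair that includes an ambiguous image gets a different distance from the unambiguous pair. A metric built only on semantic embeddings would return the same value for both.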