Paper Title

Deep Metric Learning Meets Deep Clustering: A Novel Unsupervised Approach for Feature Embedding

Authors

Nguyen, Binh X., Nguyen, Binh D., Carneiro, Gustavo, Tjiputra, Erman, Tran, Quang D., Do, Thanh-Toan

Abstract


Unsupervised Deep Distance Metric Learning (UDML) aims to learn sample similarities in the embedding space from an unlabeled dataset. Traditional UDML methods usually use the triplet loss or pairwise loss, which require the mining of positive and negative samples w.r.t. anchor data points. This is, however, challenging in an unsupervised setting as the label information is not available. In this paper, we propose a new UDML method that overcomes that challenge. In particular, we propose to use a deep clustering loss to learn centroids, i.e., pseudo labels, that represent semantic classes. During learning, these centroids are also used to reconstruct the input samples. It hence ensures the representativeness of centroids: each centroid represents visually similar samples. Therefore, the centroids give information about positive (visually similar) and negative (visually dissimilar) samples. Based on pseudo labels, we propose a novel unsupervised metric loss which enforces the positive concentration and negative separation of samples in the embedding space. Experimental results on benchmarking datasets show that the proposed approach outperforms other UDML methods.
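The core idea of the metric loss described in the abstract can be illustrated with a minimal sketch: given centroids learned by clustering, each sample is pulled toward its assigned centroid (positive concentration) and pushed at least a margin away from all other centroids (negative separation). This is an assumed NumPy illustration, not the authors' implementation; the function name, the hinge form of the separation term, and the margin value are assumptions.

```python
import numpy as np

def unsupervised_metric_loss(embeddings, centroids, pseudo_labels, margin=0.5):
    """Sketch of a pseudo-label-based metric loss (assumed form).

    embeddings:    (n, d) array of sample embeddings
    centroids:     (k, d) array of cluster centroids (pseudo classes)
    pseudo_labels: length-n assignment of each sample to a centroid
    """
    n = embeddings.shape[0]
    total = 0.0
    for i in range(n):
        c = pseudo_labels[i]
        # Positive concentration: squared distance to the sample's own centroid.
        total += np.sum((embeddings[i] - centroids[c]) ** 2)
        # Negative separation: hinge penalty if the sample is closer than
        # `margin` to any other centroid.
        for k in range(centroids.shape[0]):
            if k != c:
                d = np.linalg.norm(embeddings[i] - centroids[k])
                total += max(0.0, margin - d) ** 2
    return total / n
```

In practice the pseudo labels would come from the deep clustering step (e.g., assigning each sample to its nearest centroid), and the loss would be minimized jointly with the clustering and reconstruction objectives described in the abstract.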
