Paper Title


Beyond Separability: Analyzing the Linear Transferability of Contrastive Representations to Related Subpopulations

Authors

Jeff Z. HaoChen, Colin Wei, Ananya Kumar, Tengyu Ma

Abstract


Contrastive learning is a highly effective method for learning representations from unlabeled data. Recent works show that contrastive representations can transfer across domains, leading to simple state-of-the-art algorithms for unsupervised domain adaptation. In particular, a linear classifier trained to separate the representations on the source domain can also predict classes on the target domain accurately, even though the representations of the two domains are far from each other. We refer to this phenomenon as linear transferability. This paper analyzes when and why contrastive representations exhibit linear transferability in a general unsupervised domain adaptation setting. We prove that linear transferability can occur when data from the same class in different domains (e.g., photo dogs and cartoon dogs) are more related with each other than data from different classes in different domains (e.g., photo dogs and cartoon cats) are. Our analyses are in a realistic regime where the source and target domains can have unbounded density ratios and be weakly related, and they have distant representations across domains.
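The linear-transferability phenomenon described above can be illustrated with a small synthetic sketch (this is not the authors' experimental setup; the data, dimensions, and least-squares probe are illustrative assumptions): a linear classifier is fit on source-domain representations and then evaluated on target-domain representations that lie far away, but whose same-class clusters are aligned in the same direction.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_domain(center_class0, center_class1, n=200):
    """Synthetic 'representations' clustered by class around domain-specific centers."""
    x0 = center_class0 + 0.1 * rng.standard_normal((n, 2))
    x1 = center_class1 + 0.1 * rng.standard_normal((n, 2))
    X = np.vstack([x0, x1])
    y = np.array([0] * n + [1] * n)
    return X, y

# Source and target clusters sit far apart (distant representations across
# domains), but the class-separating direction is shared between domains.
Xs, ys = make_domain(np.array([0.0, 0.0]), np.array([1.0, 0.0]))
Xt, yt = make_domain(np.array([0.0, 5.0]), np.array([1.0, 5.0]))

# Linear probe: least-squares fit of labels on source representations,
# with a bias column appended.
A = np.hstack([Xs, np.ones((len(Xs), 1))])
w, *_ = np.linalg.lstsq(A, ys, rcond=None)

def predict(X):
    scores = np.hstack([X, np.ones((len(X), 1))]) @ w
    return (scores > 0.5).astype(int)

src_acc = (predict(Xs) == ys).mean()
tgt_acc = (predict(Xt) == yt).mean()
print(f"source accuracy: {src_acc:.2f}, target accuracy: {tgt_acc:.2f}")
```

Because the probe learns (essentially) only the class-separating direction and ignores the domain-offset direction, it remains accurate on the shifted target domain, mirroring the paper's claim that transfer occurs when same-class data across domains are more related than different-class data.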
