Modeling Hierarchical Structural Distance for Unsupervised Domain Adaptation

Paper Title

Modeling Hierarchical Structural Distance for Unsupervised Domain Adaptation

Authors

Yingxue Xu, Guihua Wen, Yang Hu, Pei Yang

Abstract

Unsupervised domain adaptation (UDA) aims to estimate a transferable model for unlabeled target domains by exploiting labeled source data. Optimal Transport (OT) based methods have recently been shown to be a promising solution for UDA, with a solid theoretical foundation and competitive performance. However, most of these methods focus solely on domain-level OT alignment: they leverage the geometry of domains to learn domain-invariant features from the global embeddings of images. Global representations may destroy image structure, leading to the loss of local details that carry category-discriminative information. This study proposes an end-to-end Deep Hierarchical Optimal Transport method (DeepHOT), which aims to learn both domain-invariant and category-discriminative representations by mining hierarchical structural relations among domains. The main idea is to incorporate a domain-level OT and an image-level OT into a unified OT framework, hierarchical optimal transport, to model the underlying geometry in both domain space and image space. In the DeepHOT framework, the image-level OT serves as the ground distance metric for the domain-level OT, leading to the hierarchical structural distance. Compared with the ground distance of conventional domain-level OT, the image-level OT captures structural associations among local regions of images that are beneficial to classification. In this way, DeepHOT not only aligns domains through domain-level OT, but also enhances discriminative power through image-level OT. Moreover, to overcome the limitation of high computational complexity, we propose a robust and efficient implementation of DeepHOT that approximates the original OT with the sliced Wasserstein distance in the image-level OT and performs mini-batch unbalanced OT at the domain level.
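
The two-level idea in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the function names (`sliced_wasserstein`, `sinkhorn`, `hierarchical_distance`) are hypothetical, each image is assumed to be a plain list of local feature vectors with the same number of regions, and the domain-level step uses ordinary balanced entropic OT (Sinkhorn) as a stand-in for the mini-batch unbalanced OT that the abstract mentions.

```python
import math
import random


def sliced_wasserstein(x, y, n_proj=64, seed=0):
    """Approximate the squared 2-Wasserstein distance between two sets of
    local features with the same number of points (uniform weights)."""
    rng = random.Random(seed)
    dim = len(x[0])
    total = 0.0
    for _ in range(n_proj):
        # Draw a random unit direction and project both point sets onto it.
        d = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in d))
        d = [v / norm for v in d]
        px = sorted(sum(a * b for a, b in zip(p, d)) for p in x)
        py = sorted(sum(a * b for a, b in zip(p, d)) for p in y)
        # In one dimension, optimal transport matches sorted samples.
        total += sum((a - b) ** 2 for a, b in zip(px, py)) / len(px)
    return total / n_proj


def sinkhorn(cost, reg=0.1, n_iter=200):
    """Balanced entropic OT with uniform marginals (a simplification:
    the abstract describes an unbalanced, mini-batch variant)."""
    n, m = len(cost), len(cost[0])
    a, b = [1.0 / n] * n, [1.0 / m] * m
    K = [[math.exp(-c / reg) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


def hierarchical_distance(source, target, reg=0.1):
    """Domain-level OT whose ground cost is an image-level distance."""
    # Image level: one sliced Wasserstein cost per (source, target) image pair.
    cost = [[sliced_wasserstein(s, t) for t in target] for s in source]
    # Rescale costs to [0, 1] so the Gibbs kernel does not underflow.
    scale = max(max(row) for row in cost) or 1.0
    plan = sinkhorn([[c / scale for c in row] for row in cost], reg)
    # Domain level: transport cost of the plan under the image-level costs.
    value = sum(p * c for pr, cr in zip(plan, cost) for p, c in zip(pr, cr))
    return value, plan


# Toy usage: two "images" per domain, three local regions each, 2-D features.
source = [
    [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1]],
    [[1.0, 1.0], [0.9, 1.1], [1.1, 0.8]],
]
target = [[[p[0] + 5.0, p[1] + 5.0] for p in img] for img in source]
distance, plan = hierarchical_distance(source, target)
```

In a deep model the local features would come from a backbone's spatial feature map and the distance would be used as a training loss; the sketch only shows how an image-level distance can act as the ground cost of a domain-level transport problem.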
