AdaTriplet-RA: Domain Matching via Adaptive Triplet and Reinforced Attention for Unsupervised Domain Adaptation

Paper Title

AdaTriplet-RA: Domain Matching via Adaptive Triplet and Reinforced Attention for Unsupervised Domain Adaptation

Authors

Xinyao Shu, Shiyang Yan, Zhenyu Lu, Xinshao Wang, Yuan Xie

Abstract

Unsupervised domain adaptation (UDA) is a transfer learning task in which the data and annotations of the source domain are available, but only unlabeled target data can be accessed during training. Most previous methods try to minimise the domain gap by performing distribution alignment between the source and target domains, which has a notable limitation: it operates at the domain level and neglects sample-level differences. To mitigate this weakness, we propose to improve unsupervised domain adaptation with an inter-domain sample matching scheme. We apply the widely used and robust Triplet loss to match the inter-domain samples. To reduce the catastrophic effect of the inaccurate pseudo-labels generated during training, we propose a novel uncertainty measurement method that selects reliable pseudo-labels automatically and progressively refines them. We apply the Gumbel-Softmax technique, an advanced discrete relaxation, to realise an adaptive top-k scheme that provides this functionality. In addition, to enable global ranking optimisation within one batch for the domain matching, the whole model is optimised via a novel reinforced attention mechanism supervised by the policy gradient algorithm, using the Average Precision (AP) as the reward. Our model (termed AdaTriplet-RA) achieves state-of-the-art results on several public benchmark datasets, and its effectiveness is validated via comprehensive ablation studies. Our method improves the accuracy of the baseline by 9.7% (ResNet-101) and 6.2% (ResNet-50) on the VisDA dataset and by 4.22% (ResNet-50) on the DomainNet dataset. The source code is publicly available at https://github.com/shuxy0120/AdaTriplet-RA.
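
To make two of the ingredients named in the abstract concrete, the following is a minimal sketch of a standard triplet margin loss and of Average Precision computed over a distance-based ranking of target samples. It is not the authors' implementation: the function names, the Euclidean distance, the margin value, and the toy features are all assumptions chosen for illustration, and the uncertainty-based pseudo-label selection, the Gumbel-Softmax top-k scheme, and the policy-gradient training are deliberately left out.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin)."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


def average_precision(relevance):
    """AP of a ranked list; relevance[i] is 1 if the item at rank i+1 is relevant."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def rank_targets(source_feat, source_label, target_feats, target_pseudo_labels):
    """Rank target samples by distance to one source sample.

    Returns a 0/1 relevance list, where 1 means the target's pseudo-label
    (hypothetical input here) matches the source label.
    """
    order = sorted(
        range(len(target_feats)),
        key=lambda i: euclidean(source_feat, target_feats[i]),
    )
    return [1 if target_pseudo_labels[i] == source_label else 0 for i in order]


# Toy example: one labelled source sample and three pseudo-labelled targets.
source_feat, source_label = [0.0, 0.0], 0
target_feats = [[0.1, 0.0], [1.0, 1.0], [0.2, 0.1]]
target_pseudo_labels = [0, 1, 0]

relevance = rank_targets(source_feat, source_label, target_feats, target_pseudo_labels)
ap = average_precision(relevance)
loss = triplet_loss(source_feat, target_feats[0], target_feats[1])
```

In this toy setting the two same-class targets are ranked ahead of the different-class one, so the AP is at its maximum and the triplet loss is already zero. In a scheme like the one the abstract describes, an AP value of this kind would play the role of the reward signal, while the triplet term pulls same-class source and target samples together.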
