Style-transfer GANs for bridging the domain gap in synthetic pose estimator training

Paper title

Style-transfer GANs for bridging the domain gap in synthetic pose estimator training

Authors

Pavel Rojtberg, Thomas Pöllabauer, Arjan Kuijper

Abstract

Given the dependence of current CNN architectures on large training sets, using synthetic data is alluring because it allows a virtually unlimited amount of labeled training data to be generated. However, producing such data is a non-trivial task, as current CNN architectures are sensitive to the domain gap between real and synthetic data. We propose to adopt general-purpose GAN models for pixel-level image translation, which allows the domain gap itself to be formulated as a learning problem. The resulting models are then used during either training or inference to bridge the domain gap. Here, we focus on training the single-stage YOLO6D object pose estimator on synthetic CAD geometry only, where not even approximate surface information is available. When employing paired GAN models, we use an edge-based intermediate domain and introduce different mappings to represent the unknown surface properties. Our evaluation shows a considerable improvement in model performance compared to a model trained with the same degree of domain randomization, while requiring only very little additional effort.
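
The edge-based intermediate domain mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which edge extractor the authors use, so the Sobel filter, the function name `sobel_edges`, and the `threshold` parameter below are assumptions chosen purely for illustration. The idea shown is only that both a synthetic render and a real photograph can be reduced to an edge map, which a paired image-translation GAN could then take as input.

```python
import numpy as np


def sobel_edges(gray: np.ndarray, threshold: float = 0.25) -> np.ndarray:
    """Return a binary edge map (uint8, values 0 or 1) for a 2-D grayscale image.

    Hypothetical helper: a plain Sobel gradient, not the extractor from the paper.
    """
    img = gray.astype(np.float64)
    # Replicate border pixels so the output keeps the input shape.
    p = np.pad(img, 1, mode="edge")
    # Horizontal and vertical Sobel responses computed from shifted slices.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    magnitude = np.hypot(gx, gy)
    peak = magnitude.max()
    if peak == 0:
        # A perfectly flat image has no edges.
        return np.zeros(img.shape, dtype=np.uint8)
    # Normalize to [0, 1] and keep only the strong responses.
    return (magnitude / peak > threshold).astype(np.uint8)


if __name__ == "__main__":
    # Toy "render": dark left half, bright right half.
    render = np.zeros((8, 8))
    render[:, 4:] = 1.0
    print(sobel_edges(render))
```

Because the edge map discards color and texture, it is one plausible way to obtain a representation that looks similar for CAD renders without surface information and for real images.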
