Paper Title

COCO-FUNIT: Few-Shot Unsupervised Image Translation with a Content Conditioned Style Encoder

Paper Authors

Kuniaki Saito, Kate Saenko, Ming-Yu Liu

Paper Abstract

Unsupervised image-to-image translation aims to learn a mapping from an image in a given domain to an analogous image in a different domain, without explicit supervision of the mapping. Few-shot unsupervised image-to-image translation further attempts to generalize the model to an unseen domain by leveraging example images of the unseen domain provided at inference time. While remarkably successful, existing few-shot image-to-image translation models find it difficult to preserve the structure of the input image while emulating the appearance of the unseen domain, which we refer to as the content loss problem. This is particularly severe when the poses of the objects in the input and example images are very different. To address the issue, we propose a new few-shot image translation model, COCO-FUNIT, which computes the style embedding of the example images conditioned on the input image, and introduces a new module called the constant style bias. Through extensive experimental validation and comparison to the state of the art, our model shows effectiveness in addressing the content loss problem. For code and pretrained models, please check out https://nvlabs.github.io/COCO-FUNIT/.
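
To make the two key ideas in the abstract concrete, below is a minimal PyTorch sketch of a style encoder that is conditioned on the content image and adds a learned "constant style bias". This is an illustrative sketch only: the layer shapes, pooling, mixing rule, and names (ContentConditionedStyleEncoder, style_bias, to_style) are assumptions, not the authors' implementation; refer to the linked repository for the actual code.

# Sketch of a content-conditioned style encoder with a learned constant
# style bias. All architecture details here are illustrative assumptions.
import torch
import torch.nn as nn

class ContentConditionedStyleEncoder(nn.Module):
    def __init__(self, style_dim: int = 64):
        super().__init__()
        # Two small conv trunks, one for the content image and one for the
        # example (style) image; each ends in a global average pool.
        def conv_trunk():
            return nn.Sequential(
                nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.AdaptiveAvgPool2d(1),  # -> (B, 64, 1, 1)
            )
        self.content_enc = conv_trunk()
        self.style_enc = conv_trunk()
        # Learned constant style bias, shared across all inputs
        # (hypothetical size and placement).
        self.style_bias = nn.Parameter(torch.zeros(style_dim))
        # Project the concatenated [content, example] features to a style code.
        self.to_style = nn.Linear(64 + 64, style_dim)

    def forward(self, content_img: torch.Tensor, example_img: torch.Tensor) -> torch.Tensor:
        c = self.content_enc(content_img).flatten(1)  # (B, 64)
        s = self.style_enc(example_img).flatten(1)    # (B, 64)
        # The style code depends on the content image, so the encoder can
        # discount pose/structure cues in the example image.
        style = self.to_style(torch.cat([c, s], dim=1))
        # Add the input-independent constant style bias.
        return style + self.style_bias

# Usage: encode the style of one example ("shot") from an unseen domain,
# conditioned on the content image being translated.
enc = ContentConditionedStyleEncoder()
content = torch.randn(1, 3, 128, 128)
example = torch.randn(1, 3, 128, 128)
style_code = enc(content, example)  # (1, 64); a decoder would consume this
print(style_code.shape)

In a full translation model, this style code would typically modulate the decoder (for example, via AdaIN-style layers) while a separate content encoder preserves the input image's structure; that surrounding machinery is omitted here.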
