Paper Title

Implicit Mesh Reconstruction from Unannotated Image Collections

Authors

Shubham Tulsiani, Nilesh Kulkarni, Abhinav Gupta

Abstract

We present an approach to infer the 3D shape, texture, and camera pose for an object from a single RGB image, using only category-level image collections with foreground masks as supervision. We represent the shape as an image-conditioned implicit function that transforms the surface of a sphere to that of the predicted mesh, while additionally predicting the corresponding texture. To derive a supervisory signal for learning, we enforce that: a) our predictions, when rendered, should explain the available image evidence, and b) the inferred 3D structure should be geometrically consistent with learned pixel-to-surface mappings. We empirically show that our approach improves over prior work that leverages similar supervision, and in fact performs competitively with methods that use stronger supervision. Finally, as our method enables learning with limited supervision, we qualitatively demonstrate its applicability over a set of about 30 object categories.
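The core idea of the abstract, an image-conditioned implicit function that deforms the surface of a sphere into the predicted mesh while emitting a texture value per surface point, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the tiny MLP, its random untrained weights, the global image-feature vector, and all names (`ImplicitSurface`, `sample_sphere`) are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_sphere(n):
    """Uniformly sample n points on the unit sphere (the template surface)."""
    v = rng.normal(size=(n, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

class ImplicitSurface:
    """Toy image-conditioned implicit function: maps a sphere point,
    concatenated with a global image feature, to a deformed 3D surface
    point plus an RGB texture value. Weights here are random stand-ins
    for a trained network."""

    def __init__(self, feat_dim=16, hidden=32):
        self.w1 = rng.normal(scale=0.1, size=(3 + feat_dim, hidden))
        self.w2 = rng.normal(scale=0.1, size=(hidden, 6))  # 3 for xyz, 3 for rgb

    def __call__(self, pts, img_feat):
        # Condition each surface point on the (shared) image feature.
        x = np.concatenate([pts, np.tile(img_feat, (len(pts), 1))], axis=1)
        h = np.tanh(x @ self.w1)
        out = h @ self.w2
        verts = pts + out[:, :3]              # residual deformation of the sphere
        tex = 1.0 / (1.0 + np.exp(-out[:, 3:]))  # rgb squashed into [0, 1]
        return verts, tex

f = ImplicitSurface()
pts = sample_sphere(100)           # query points on the template sphere
img_feat = rng.normal(size=16)     # hypothetical encoder output for one image
verts, tex = f(pts, img_feat)
print(verts.shape, tex.shape)      # (100, 3) (100, 3)
```

Because the function is queried pointwise, the surface can be sampled at any resolution; in the paper's setting the rendered predictions and the pixel-to-surface mappings supply the training signal that such random weights lack here.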
