Paper title
PNeRF: Probabilistic Neural Scene Representations for Uncertain 3D Visual Mapping
Paper authors
Paper abstract
Neural scene representations have recently produced very impressive visual results for 3D scenes. However, their study and progress have mainly been limited to the visualization of virtual models in computer graphics or to scene reconstruction in computer vision, without explicitly accounting for sensor and pose uncertainty. Using this novel scene representation in robotics applications would require accounting for this uncertainty in the neural map. The aim of this paper is therefore to propose a novel method for training probabilistic neural scene representations with uncertain training data, which could enable the inclusion of these representations in robotics applications. Images acquired with cameras or depth sensors contain inherent uncertainty, and the camera poses used for learning a 3D model are also imperfect. If these measurements are used for training without accounting for their uncertainty, the resulting models are non-optimal, and the resulting scene representations are likely to contain artifacts such as blur and uneven geometry. In this work, the problem of integrating uncertainty into the learning process is investigated by focusing on training with uncertain information in a probabilistic manner. The proposed method explicitly augments the training likelihood with an uncertainty term such that the learnt probability distribution of the network is minimized with respect to the training uncertainty. It will be shown that this leads to more accurate image rendering quality, in addition to more precise and consistent geometry. Validation has been carried out on both synthetic and real datasets, showing that the proposed approach outperforms state-of-the-art methods. Notably, the results show that the proposed method is capable of rendering novel high-quality views even when the training data is limited.