Paper Title
Fusing Convolutional Neural Network and Geometric Constraint for Image-based Indoor Localization
Paper Authors
Paper Abstract
This paper proposes a new image-based localization framework that explicitly localizes the camera/robot by fusing a Convolutional Neural Network (CNN) with geometric constraints from sequential images. The camera is localized using a single observed image or a few observed images together with training images that carry 6-degree-of-freedom (6-DoF) pose labels. A Siamese network structure is adopted to train an image descriptor network, and visually similar candidate images are retrieved from the training set to localize the test image geometrically. Meanwhile, a probabilistic motion model predicts the pose under a constant-velocity assumption. The two estimated poses are finally fused according to their uncertainties to yield an accurate pose prediction. The method leverages geometric uncertainty and is applicable in indoor scenes dominated by diffuse illumination. Experiments on simulated and real datasets demonstrate the efficiency of the proposed method. The results further show that combining the CNN-based framework with geometric constraints achieves better accuracy than CNN-only methods, especially when the training data size is small.
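The uncertainty-based fusion described above can be pictured as inverse-covariance (information-form) weighting of the geometric pose estimate and the constant-velocity prediction. The sketch below is only an illustration of that idea, not the paper's implementation: the 6-vector pose parameterization, the function names (`predict_constant_velocity`, `fuse_poses`), and the covariance values are assumptions made for the example.

```python
# Minimal sketch (not the authors' code): constant-velocity pose prediction and
# inverse-covariance fusion of two 6-DoF pose estimates.
# Poses are represented as 6-vectors [tx, ty, tz, rx, ry, rz] (rotation vector);
# this parameterization and all numeric values below are illustrative assumptions.
import numpy as np

def predict_constant_velocity(prev_pose, prev_prev_pose):
    """Extrapolate the last inter-frame motion, assuming uniform time steps."""
    velocity = prev_pose - prev_prev_pose
    return prev_pose + velocity

def fuse_poses(pose_a, cov_a, pose_b, cov_b):
    """Fuse two pose estimates with 6x6 covariances by inverse-covariance weighting."""
    info_a, info_b = np.linalg.inv(cov_a), np.linalg.inv(cov_b)
    fused_cov = np.linalg.inv(info_a + info_b)
    fused_pose = fused_cov @ (info_a @ pose_a + info_b @ pose_b)
    return fused_pose, fused_cov

# Example: combine a geometric estimate with a motion-model prediction.
geometric_pose = np.array([1.02, 0.48, 0.0, 0.0, 0.0, 0.10])
geometric_cov = np.diag([0.01, 0.01, 0.02, 0.001, 0.001, 0.002])

predicted_pose = predict_constant_velocity(
    prev_pose=np.array([0.95, 0.45, 0.0, 0.0, 0.0, 0.08]),
    prev_prev_pose=np.array([0.90, 0.42, 0.0, 0.0, 0.0, 0.06]),
)
prediction_cov = np.diag([0.05, 0.05, 0.05, 0.01, 0.01, 0.01])

fused_pose, fused_cov = fuse_poses(
    geometric_pose, geometric_cov, predicted_pose, prediction_cov
)
print(fused_pose)
```

With this weighting, the estimate with the smaller covariance dominates the fused result, which matches the abstract's claim that the geometric branch helps most when the CNN-only (or motion-only) estimate is uncertain.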