Satellite Image Based Cross-view Localization for Autonomous Vehicle

Paper Title

Satellite Image Based Cross-view Localization for Autonomous Vehicle

Authors

Shan Wang, Yanhao Zhang, Ankit Vora, Akhil Perincherry, Hongdong Li

Abstract

Existing spatial localization techniques for autonomous vehicles mostly use a pre-built 3D-HD map, often constructed using a survey-grade 3D mapping vehicle, which is not only expensive but also laborious. This paper shows that by using an off-the-shelf high-definition satellite image as a ready-to-use map, we are able to achieve cross-view vehicle localization up to a satisfactory accuracy, providing a cheaper and more practical way for localization. While the utilization of satellite imagery for cross-view localization is an established concept, the conventional methodology focuses primarily on image retrieval. This paper introduces a novel approach to cross-view localization that departs from the conventional image retrieval method. Specifically, our method develops (1) a Geometric-align Feature Extractor (GaFE) that leverages measured 3D points to bridge the geometric gap between ground and overhead views, (2) a Pose Aware Branch (PAB) adopting a triplet loss to encourage pose-aware feature extraction, and (3) a Recursive Pose Refine Branch (RPRB) using the Levenberg-Marquardt (LM) algorithm to align the initial pose towards the true vehicle pose iteratively. Our method is validated on KITTI and Ford Multi-AV Seasonal datasets as ground view and Google Maps as the satellite view. The results demonstrate the superiority of our method in cross-view localization with median spatial and angular errors within $1$ meter and $1^\circ$, respectively.
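The abstract names the Levenberg-Marquardt (LM) algorithm as the optimizer that iteratively pulls an initial pose towards the true vehicle pose. As a rough illustration of that idea only, the sketch below refines a planar pose (x, y, yaw) with a hand-written LM loop. It is not the paper's RPRB: the paper aligns learned features between ground and satellite views, whereas this toy example assumes known 2D point correspondences, and the function names (`transform`, `residuals`, `jacobian`, `refine_pose_lm`) and the damping schedule are all hypothetical choices made for the example.

```python
import numpy as np


def transform(pose, pts):
    """Apply a planar pose (tx, ty, yaw) to an (N, 2) array of points."""
    tx, ty, yaw = pose
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, -s], [s, c]])
    return pts @ rot.T + np.array([tx, ty])


def residuals(pose, src, dst):
    """Stacked residual vector [x0, y0, x1, y1, ...] of transformed src vs. dst."""
    return (transform(pose, src) - dst).ravel()


def jacobian(pose, src):
    """Analytic Jacobian of the residuals with respect to (tx, ty, yaw)."""
    _, _, yaw = pose
    c, s = np.cos(yaw), np.sin(yaw)
    jac = np.zeros((2 * src.shape[0], 3))
    jac[0::2, 0] = 1.0                                # d(x') / d(tx)
    jac[1::2, 1] = 1.0                                # d(y') / d(ty)
    jac[0::2, 2] = -s * src[:, 0] - c * src[:, 1]     # d(x') / d(yaw)
    jac[1::2, 2] = c * src[:, 0] - s * src[:, 1]      # d(y') / d(yaw)
    return jac


def refine_pose_lm(init_pose, src, dst, iters=50, damping=1e-2):
    """Refine a pose with damped Gauss-Newton (Levenberg-Marquardt) steps."""
    pose = np.asarray(init_pose, dtype=float)
    cost = np.sum(residuals(pose, src, dst) ** 2)
    for _ in range(iters):
        r = residuals(pose, src, dst)
        jac = jacobian(pose, src)
        hess = jac.T @ jac                            # Gauss-Newton Hessian approximation
        grad = jac.T @ r
        damped = hess + damping * np.diag(np.diag(hess))
        step = np.linalg.solve(damped, -grad)
        candidate = pose + step
        new_cost = np.sum(residuals(candidate, src, dst) ** 2)
        if new_cost < cost:
            # Accept the step and trust the quadratic model more.
            pose, cost = candidate, new_cost
            damping *= 0.5
        else:
            # Reject the step and lean towards gradient descent.
            damping *= 2.0
        if np.linalg.norm(step) < 1e-10:
            break
    return pose


# Synthetic check: recover a known pose from noise-free correspondences.
rng = np.random.default_rng(0)
src = rng.uniform(-10.0, 10.0, size=(30, 2))
true_pose = np.array([1.5, -0.8, 0.2])
dst = transform(true_pose, src)
estimate = refine_pose_lm([0.0, 0.0, 0.0], src, dst)
```

The damping term is what distinguishes LM from plain Gauss-Newton: a rejected step raises it, shrinking the update, while an accepted step lowers it so that convergence speeds up near the solution.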
