BEVUDA: Multi-geometric Space Alignments for Domain Adaptive BEV 3D Object Detection

Paper title

BEVUDA: Multi-geometric Space Alignments for Domain Adaptive BEV 3D Object Detection

Authors

Liu, Jiaming; Zhang, Rongyu; Li, Xiaoqi; Chi, Xiaowei; Chen, Zehui; Lu, Ming; Guo, Yandong; Zhang, Shanghang

Abstract

Vision-centric bird's-eye-view (BEV) perception has shown promising potential in autonomous driving. Recent works mainly focus on improving efficiency or accuracy but neglect the challenges posed by environmental changes, which severely degrade transfer performance. For BEV perception, we identify the significant domain gaps that exist in typical real-world cross-domain scenarios and comprehensively address the Domain Adaptation (DA) problem for multi-view 3D object detection. Because BEV perception approaches are complicated and contain several components, domain shift accumulates across multiple geometric spaces (i.e., 2D, 3D voxel, and BEV), which makes BEV DA even more challenging. In this paper, we propose a Multi-space Alignment Teacher-Student (MATS) framework to ease the domain shift accumulation. It consists of a Depth-Aware Teacher (DAT) and a Geometric-space Aligned Student (GAS) model. DAT combines target lidar and reliable depth prediction to construct depth-aware information, extracting target domain-specific knowledge in the voxel and BEV feature spaces. It then transfers this multi-space domain knowledge to the student model. To jointly alleviate the domain shift, GAS projects the multi-geometric space features into a shared geometric embedding space and decreases the data distribution distance between the two domains. To verify the effectiveness of our method, we conduct BEV 3D object detection experiments on three cross-domain scenarios and achieve state-of-the-art performance.
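
The alignment idea in the abstract (project features from several geometric spaces into one shared embedding, then reduce the distance between source and target distributions) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the function names, the plain linear projections, the averaging step, and the mean-embedding distance (a linear-kernel MMD estimate) are stand-ins, not the method or code of the paper.

```python
# Illustrative sketch only: names, shapes, and the distance measure are
# assumptions, not the BEVUDA implementation.
from typing import Dict, List

Vector = List[float]
Matrix = List[Vector]  # rows = output dims, cols = input dims


def project(feature: Vector, weight: Matrix) -> Vector:
    """Linearly project one feature vector into the shared space."""
    return [sum(w * x for w, x in zip(row, feature)) for row in weight]


def shared_embedding(features: Dict[str, Vector],
                     weights: Dict[str, Matrix]) -> Vector:
    """Average the projected features of all geometric spaces for one sample."""
    projected = [project(features[name], weights[name]) for name in weights]
    dim = len(projected[0])
    return [sum(p[i] for p in projected) / len(projected) for i in range(dim)]


def mean_embedding_distance(source: List[Vector],
                            target: List[Vector]) -> float:
    """Squared distance between the two domain means (linear-kernel MMD)."""
    dim = len(source[0])
    mu_s = [sum(v[i] for v in source) / len(source) for i in range(dim)]
    mu_t = [sum(v[i] for v in target) / len(target) for i in range(dim)]
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical projection weights
    weights = {"2d": identity, "voxel": identity, "bev": identity}
    sample = {"2d": [1.0, 2.0], "voxel": [3.0, 4.0], "bev": [5.0, 6.0]}
    print(shared_embedding(sample, weights))
    print(mean_embedding_distance([[0.0, 0.0]], [[3.0, 4.0]]))
```

In a training setup, a distance of this kind would typically be added to the detection loss so that the student is pushed toward domain-invariant embeddings. The abstract does not specify which distance or loss the authors use.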
