Paper Title

Cross-Modality Knowledge Distillation Network for Monocular 3D Object Detection

Authors

Hong, Yu, Dai, Hang, Ding, Yong

Abstract

Leveraging LiDAR-based detectors or real LiDAR point data to guide monocular 3D detection has brought significant improvement, e.g., Pseudo-LiDAR methods. However, the existing methods usually apply non-end-to-end training strategies and insufficiently leverage the LiDAR information, where the rich potential of the LiDAR data has not been well exploited. In this paper, we propose the Cross-Modality Knowledge Distillation (CMKD) network for monocular 3D detection to efficiently and directly transfer the knowledge from LiDAR modality to image modality on both features and responses. Moreover, we further extend CMKD as a semi-supervised training framework by distilling knowledge from large-scale unlabeled data and significantly boost the performance. As of submission, CMKD ranks $1^{st}$ among published monocular 3D detectors on both the KITTI $test$ set and the Waymo $val$ set, with significant performance gains over previous state-of-the-art methods.
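The abstract describes distillation at two levels: aligning the image branch's features with the LiDAR branch's features, and matching the two branches' predicted responses. The paper's exact loss formulation is not given here, so the following is only a minimal sketch of that general feature-plus-response distillation pattern; the function name, tensor shapes, and the choice of MSE and temperature-scaled KL losses are illustrative assumptions, not the authors' actual implementation.

```python
import torch
import torch.nn.functional as F

def cross_modality_distill_losses(student_feat, teacher_feat,
                                  student_logits, teacher_logits, T=2.0):
    """Sketch of two-level distillation (assumed shapes, not the paper's code).

    student_feat / teacher_feat: image- and LiDAR-branch feature maps, (B, C, H, W).
    student_logits / teacher_logits: per-anchor class responses, (B, N, K).
    T: softmax temperature for softening the teacher's responses.
    """
    # Feature-level distillation: pull image-branch features toward the
    # (frozen) LiDAR-branch features with a simple regression loss.
    feat_loss = F.mse_loss(student_feat, teacher_feat.detach())

    # Response-level distillation: match softened class distributions with
    # KL divergence; the T*T factor keeps gradient magnitudes comparable.
    resp_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits.detach() / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)

    return feat_loss, resp_loss
```

In a semi-supervised setup like the one the abstract mentions, these two terms need no ground-truth labels, so they can also be computed on large-scale unlabeled data with the LiDAR branch acting as the teacher.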
