1st Place Solution for Waymo Open Dataset Challenge -- 3D Detection and Domain Adaptation

Paper Title

1st Place Solution for Waymo Open Dataset Challenge -- 3D Detection and Domain Adaptation

Authors

Zhuangzhuang Ding, Yihan Hu, Runzhou Ge, Li Huang, Sijia Chen, Yu Wang, Jie Liao

Abstract

In this technical report, we introduce our winning solution "HorizonLiDAR3D" for the 3D detection track and the domain adaptation track in Waymo Open Dataset Challenge at CVPR 2020. Many existing 3D object detectors include prior-based anchor box design to account for different scales and aspect ratios and classes of objects, which limits its capability of generalization to a different dataset or domain and requires post-processing (e.g. Non-Maximum Suppression (NMS)). We proposed a one-stage, anchor-free and NMS-free 3D point cloud object detector AFDet, using object key-points to encode the 3D attributes, and to learn an end-to-end point cloud object detection without the need of hand-engineering or learning the anchors. AFDet serves as a strong baseline in our winning solution and significant improvements are made over this baseline during the challenges. Specifically, we design stronger networks and enhance the point cloud data using densification and point painting. To leverage camera information, we append/paint additional attributes to each point by projecting them to camera space and gathering image-based perception information. The final detection performance also benefits from model ensemble and Test-Time Augmentation (TTA) in both the 3D detection track and the domain adaptation track. Our solution achieves the 1st place with 77.11% mAPH/L2 and 69.49% mAPH/L2 respectively on the 3D detection track and the domain adaptation track.
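
The point-painting step mentioned in the abstract (projecting LiDAR points into camera space and appending image-based perception information) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `paint_points`, the argument layout, and the pinhole model with the camera z-axis pointing forward are all assumptions made for illustration, and the per-pixel class scores are assumed to come from some image segmentation network.

```python
import numpy as np


def paint_points(points, seg_scores, cam_from_lidar, intrinsics):
    """Append per-pixel class scores to LiDAR points (hypothetical helper).

    points:         (N, 3) xyz coordinates in the LiDAR frame
    seg_scores:     (H, W, C) per-pixel class scores from an image network
    cam_from_lidar: (4, 4) rigid transform, LiDAR frame -> camera frame
    intrinsics:     (3, 3) pinhole matrix (assumes camera z points forward)

    Returns an (N, 3 + C) array. Points that do not project into the
    image keep zeros in the appended columns.
    """
    n = points.shape[0]
    h, w, c = seg_scores.shape

    # Move the points into the camera frame using homogeneous coordinates.
    homo = np.hstack([points, np.ones((n, 1))])
    cam = (cam_from_lidar @ homo.T).T[:, :3]

    # Only points in front of the camera can be projected.
    depth = cam[:, 2]
    in_front = depth > 1e-6
    safe_depth = np.where(in_front, depth, 1.0)  # avoid division by zero

    # Perspective projection to integer pixel coordinates.
    pix = (intrinsics @ cam.T).T
    u = np.floor(pix[:, 0] / safe_depth).astype(int)
    v = np.floor(pix[:, 1] / safe_depth).astype(int)
    valid = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)

    # Gather the image scores for the points that land inside the image.
    painted = np.zeros((n, c), dtype=np.float64)
    painted[valid] = seg_scores[v[valid], u[valid]]
    return np.hstack([points, painted])
```

The painted array can then be fed to a point cloud detector in place of the raw points, with the input feature dimension enlarged by C. With several cameras, one plausible choice is to call the function once per camera and merge the results, but how overlapping views are handled is not specified in the abstract.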
