MoNet3D: Towards Accurate Monocular 3D Object Localization in Real Time

Paper title

MoNet3D: Towards Accurate Monocular 3D Object Localization in Real Time

Authors

Xichuan Zhou, Yicong Peng, Chunqiao Long, Fengbo Ren, Cong Shi

Abstract

Monocular multi-object detection and localization in 3D space has proven to be a challenging task. The MoNet3D algorithm is a novel and effective framework that predicts the 3D position of each object in a monocular image and draws a 3D bounding box for each object. The MoNet3D method incorporates prior knowledge of the spatial geometric correlation of neighbouring objects into the deep neural network training process to improve the accuracy of 3D object localization. Experiments on the KITTI dataset show that the accuracy for predicting the depth and horizontal coordinates of objects in 3D space can reach 96.25% and 94.74%, respectively. Moreover, the method achieves real-time image processing at 27.85 FPS, showing promising potential for embedded advanced driving-assistance system applications. Our code is publicly available at https://github.com/CQUlearningsystemgroup/YicongPeng.
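The abstract says that a prior on the spatial geometric correlation of neighbouring objects is used during training, but does not give the formula. As a rough illustration only, one common way to encode such a prior is a pairwise penalty that discourages objects close together in the image from receiving very different predicted depths. The sketch below is an assumption and not the authors' loss: the function name `neighbour_consistency_penalty`, the Gaussian weighting, and the `sigma` parameter are all hypothetical.

```python
import math


def neighbour_consistency_penalty(centers_2d, depths, sigma=50.0):
    """Hypothetical regularizer: sum of w_ij * (z_i - z_j)^2 over object pairs.

    centers_2d: list of (u, v) pixel coordinates of each object's 2D centre.
    depths:     list of predicted depths z_i, one per object.
    sigma:      assumed bandwidth (in pixels) controlling what counts as "near".
    """
    penalty = 0.0
    n = len(depths)
    for i in range(n):
        for j in range(i + 1, n):
            du = centers_2d[i][0] - centers_2d[j][0]
            dv = centers_2d[i][1] - centers_2d[j][1]
            # Gaussian weight: close to 1 for nearby objects, near 0 for distant ones.
            weight = math.exp(-(du * du + dv * dv) / (2.0 * sigma ** 2))
            penalty += weight * (depths[i] - depths[j]) ** 2
    return penalty


# Two objects with the same depth gap: the pair that is adjacent in the
# image is penalized more than the pair that is far apart.
near_pair = neighbour_consistency_penalty([(100, 200), (110, 200)], [10.0, 12.0])
far_pair = neighbour_consistency_penalty([(100, 200), (900, 200)], [10.0, 12.0])
```

In a training setup, a term like this would typically be added to the main localization loss with a small weighting factor, so that it nudges predictions for neighbouring objects toward consistency without overriding the supervised signal.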
