Expandable YOLO: 3D Object Detection from RGB-D Images

Paper title

Expandable YOLO: 3D Object Detection from RGB-D Images

Authors

Takahashi, Masahiro, Moro, Alessandro, Ji, Yonghoon, Umeda, Kazunori

Abstract

This paper aims to construct a lightweight object detector that takes a depth image and a color image from a stereo camera as input. Specifically, extending the network architecture of YOLOv3 to 3D partway through the network makes it possible to produce output in the depth direction. In addition, Intersection over Union (IoU) in 3D space is introduced to confirm the accuracy of the region extraction results. In deep learning, object detectors that use distance information as input are being actively studied for use in automated driving. However, conventional detectors have large network structures, which impairs real-time performance. The effectiveness of the proposed detector is verified using datasets. In the experiments, the proposed model was able to output 3D bounding boxes and to detect people whose bodies are partly hidden. The processing speed of the model is 44.35 fps.
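The abstract does not say how the 3D boxes are parameterized or how the 3D IoU is computed. As a minimal sketch only, assuming axis-aligned boxes stored as (x_min, y_min, z_min, x_max, y_max, z_max), IoU in 3D space can be computed as the overlap volume divided by the union volume. The function name `iou_3d` and the box layout are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def iou_3d(box_a: Sequence[float], box_b: Sequence[float]) -> float:
    """IoU of two axis-aligned 3D boxes.

    Assumed layout (hypothetical, not from the paper):
    (x_min, y_min, z_min, x_max, y_max, z_max).
    """
    # Overlap length along each axis, clamped to zero when the boxes are disjoint.
    overlap = [
        max(0.0, min(box_a[i + 3], box_b[i + 3]) - max(box_a[i], box_b[i]))
        for i in range(3)
    ]
    intersection = overlap[0] * overlap[1] * overlap[2]

    def volume(box: Sequence[float]) -> float:
        return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])

    union = volume(box_a) + volume(box_b) - intersection
    # Guard against degenerate (zero-volume) boxes.
    return intersection / union if union > 0 else 0.0
```

For example, two cubes of side 2 offset by 1 along every axis share a unit cube, so `iou_3d((0, 0, 0, 2, 2, 2), (1, 1, 1, 3, 3, 3))` evaluates to 1 / (8 + 8 - 1) = 1/15. Boxes that are rotated in 3D would need a different intersection routine than this axis-aligned one.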
