3D Semantic Scene Perception using Distributed Smart Edge Sensors

Paper title

3D Semantic Scene Perception using Distributed Smart Edge Sensors

Authors

Simon Bultmann, Sven Behnke

Abstract

We present a system for 3D semantic scene perception consisting of a network of distributed smart edge sensors. The sensor nodes are based on an embedded CNN inference accelerator and RGB-D and thermal cameras. Efficient vision CNN models for object detection, semantic segmentation, and human pose estimation run on-device in real time. 2D human keypoint estimations, augmented with the RGB-D depth estimate, as well as semantically annotated point clouds are streamed from the sensors to a central backend, where multiple viewpoints are fused into an allocentric 3D semantic scene model. As the image interpretation is computed locally, only semantic information is sent over the network. The raw images remain on the sensor boards, significantly reducing the required bandwidth, and mitigating privacy risks for the observed persons. We evaluate the proposed system in challenging real-world multi-person scenes in our lab. The proposed perception system provides a complete scene view containing semantically annotated 3D geometry and estimates 3D poses of multiple persons in real time.
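
The fusion step described in the abstract (2D keypoints augmented with depth, combined across viewpoints in an allocentric frame) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Camera` class, the `backproject` and `fuse` helpers, and the confidence-weighted averaging are all assumptions chosen for illustration, and the abstract does not specify the actual fusion method. The sketch assumes a pinhole camera model with known intrinsics and a known camera-to-world pose for each sensor.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Camera:
    """Hypothetical pinhole camera: intrinsics plus camera-to-world pose."""
    fx: float
    fy: float
    cx: float
    cy: float
    rotation: Sequence[Sequence[float]]  # 3x3 camera-to-world rotation
    translation: Vec3                    # camera position in the world frame


def backproject(cam: Camera, u: float, v: float, depth: float) -> Vec3:
    """Lift pixel (u, v) with metric depth into the shared world frame."""
    # Pinhole model: coordinates of the point in the camera frame.
    pc = ((u - cam.cx) * depth / cam.fx, (v - cam.cy) * depth / cam.fy, depth)
    # Rigid transform from the camera frame into the world frame.
    return tuple(
        sum(cam.rotation[i][j] * pc[j] for j in range(3)) + cam.translation[i]
        for i in range(3)
    )


def fuse(observations: List[Tuple[Vec3, float]]) -> Vec3:
    """Confidence-weighted mean of per-view 3D estimates of one keypoint.

    Assumed fusion rule for illustration only.
    """
    total = sum(w for _, w in observations)
    if total <= 0.0:
        raise ValueError("no observation with positive confidence")
    return tuple(sum(p[i] * w for p, w in observations) / total for i in range(3))
```

In this sketch each sensor would only need to transmit `(u, v, depth, confidence)` per keypoint rather than raw images, which matches the bandwidth and privacy argument made in the abstract.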
