Paper title
Multi-modal Experts Network for Autonomous Driving
Paper authors
Paper abstract
End-to-end learning from sensory data has shown promising results in autonomous driving. Employing many sensors enhances perception of the world and should make autonomous vehicles behave more robustly and reliably, but such networks are challenging to train and deploy, and at least two problems arise in the considered setting. The first is that computational complexity increases with the number of sensing devices. The second is the phenomenon of the network overfitting to the simplest and most informative input. We address both challenges with a novel, carefully tailored multi-modal experts network architecture and propose a multi-stage training procedure. The network contains a gating mechanism that selects the most relevant input at each inference time step using a mixed discrete-continuous policy. We demonstrate the plausibility of the proposed approach on our 1/6 scale truck equipped with three cameras and one LiDAR.
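
To make the gating idea concrete, here is a minimal sketch of how selecting a single input per time step could keep inference cost independent of the number of sensors. It is an illustration, not the authors' implementation. The abstract does not say how the gate is computed or what exactly "mixed discrete-continuous" refers to, so the sketch assumes the discrete part is the choice of sensor and the continuous part is a steering command. The names `softmax`, `select_expert`, `drive_step`, and the toy experts are all hypothetical.

```python
import math
import random


def softmax(scores):
    """Convert raw gate scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_expert(gate_scores, greedy=True, rng=None):
    """Discrete decision (assumed): pick one sensor index from the gate scores.

    greedy=True takes the most probable sensor; greedy=False samples from
    the distribution, which is one common way to explore during training.
    """
    probs = softmax(gate_scores)
    indices = range(len(probs))
    if greedy:
        index = max(indices, key=lambda i: probs[i])
    else:
        rng = rng or random
        index = rng.choices(indices, weights=probs, k=1)[0]
    return index, probs


def drive_step(sensor_inputs, gate_scores, experts):
    """Evaluate only the selected expert on its own sensor input.

    Because the other experts are never called, the per-step cost does not
    grow with the number of sensors (ignoring the cost of the gate itself).
    """
    index, _ = select_expert(gate_scores)
    steering = experts[index](sensor_inputs[index])  # continuous output (assumed)
    return index, steering


# Toy stand-ins for three camera experts and one LiDAR expert.
toy_experts = [
    lambda features: 0.1 * sum(features),
    lambda features: -0.2 * sum(features),
    lambda features: 0.0,
    lambda features: 0.05 * sum(features),
]
toy_inputs = [[1.0, 2.0], [1.0, 1.0], [0.0], [4.0]]
toy_gate_scores = [0.1, 2.0, -1.0, 0.3]  # would come from a gating network

chosen, command = drive_step(toy_inputs, toy_gate_scores, toy_experts)
print(chosen, command)
```

In this sketch the gate scores are passed in directly. In a real system they would be produced by a learned gating network, and the multi-stage training procedure mentioned in the abstract would determine how the experts and the gate are fitted.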