Paper Title
Long-Term Planning with Deep Reinforcement Learning on Autonomous Drones
Authors
Abstract
In this paper, we study a long-term planning scenario based on real-life drone racing competitions. We conducted the experiment on a framework created for the "Game of Drones: Drone Racing Competition" at NeurIPS 2019. The racing environment was created using Microsoft's AirSim Drone Racing Lab. A reinforcement learning agent, a simulated quadrotor in our case, was trained with the Proximal Policy Optimization (PPO) algorithm and was able to compete successfully against another simulated quadrotor running a classical path planning algorithm. Agent observations consist of data from IMU sensors, the drone's GPS coordinates obtained through simulation, and the opponent drone's GPS information. Using the opponent drone's GPS information during training helps the agent deal with complex state spaces and, serving as expert guidance, allows for an efficient and stable training process. All experiments performed in this paper can be found and reproduced with the code in our GitHub repository.
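For readers unfamiliar with PPO, the following is a minimal sketch of its standard clipped surrogate objective, which limits how far a single update can move the policy. It is not taken from the paper's code: the function name `ppo_clipped_objective` and the clipping range `clip_eps=0.2` are illustrative assumptions, and the abstract does not state the hyperparameters actually used.

```python
import math


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    clip_eps is an assumed value, not one reported in the abstract.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(ppo_clipped_objective([math.log(2.0)], [0.0], [1.0]))
```

With a positive advantage, the clipping removes any gain from raising the probability ratio above `1 + clip_eps`, which is one reason PPO training tends to be stable.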