Dynamic Bipedal Maneuvers through Sim-to-Real Reinforcement Learning

Paper title

Dynamic Bipedal Maneuvers through Sim-to-Real Reinforcement Learning

Authors

Fangzhou Yu, Ryan Batke, Jeremy Dao, Jonathan Hurst, Kevin Green, Alan Fern

Abstract

For legged robots to match the athletic capabilities of humans and animals, they must not only produce robust periodic walking and running, but also seamlessly switch between nominal locomotion gaits and more specialized transient maneuvers. Despite recent advancements in the control of bipedal robots, there has been little focus on producing highly dynamic behaviors. Recent work utilizing reinforcement learning to produce policies for control of legged robots has demonstrated success in producing robust walking behaviors. However, these learned policies have difficulty expressing a multitude of different behaviors on a single network. Inspired by conventional optimization-based control techniques for legged robots, this work applies a recurrent policy, trained using reference data generated from optimized single rigid body model trajectories, to execute four-step, 90-degree turns. We present a novel training framework using epilogue terminal rewards for learning specific behaviors from pre-computed trajectory data and demonstrate a successful transfer to hardware on the bipedal robot Cassie.
