Paper Title

Residual Skill Policies: Learning an Adaptable Skill-based Action Space for Reinforcement Learning for Robotics

Paper Authors

Krishan Rana, Ming Xu, Brendan Tidd, Michael Milford, Niko Sünderhauf

Paper Abstract


Skill-based reinforcement learning (RL) has emerged as a promising strategy to leverage prior knowledge for accelerated robot learning. Skills are typically extracted from expert demonstrations and are embedded into a latent space from which they can be sampled as actions by a high-level RL agent. However, this skill space is expansive, and not all skills are relevant for a given robot state, making exploration difficult. Furthermore, the downstream RL agent is limited to learning structurally similar tasks to those used to construct the skill space. We firstly propose accelerating exploration in the skill space using state-conditioned generative models to directly bias the high-level agent towards only sampling skills relevant to a given state based on prior experience. Next, we propose a low-level residual policy for fine-grained skill adaptation enabling downstream RL agents to adapt to unseen task variations. Finally, we validate our approach across four challenging manipulation tasks that differ from those used to build the skill space, demonstrating our ability to learn across task variations while significantly accelerating exploration, outperforming prior works. Code and videos are available on our project website: https://krishanrana.github.io/reskill.
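The abstract describes a three-part structure: a high-level RL agent that acts in a latent skill space, a state-conditioned generative prior that biases which skills get sampled, and a low-level residual policy that corrects the decoded actions for unseen task variations. The following is a minimal sketch of how those pieces could compose at rollout time; it is not the authors' implementation, and all class names, dimensions, and the placeholder random outputs are hypothetical stand-ins for learned models.

```python
# Hypothetical sketch of the skill-space rollout loop described in the abstract.
# SkillPrior, HighLevelAgent, SkillDecoder, ResidualPolicy and all dimensions
# are illustrative placeholders, not the ReSkill codebase.
import numpy as np

STATE_DIM, SKILL_DIM, ACTION_DIM, SKILL_HORIZON = 10, 4, 7, 10

class SkillPrior:
    """State-conditioned generative model p(z | s): proposes skills relevant to s."""
    def sample(self, state: np.ndarray) -> np.ndarray:
        return np.random.randn(SKILL_DIM)  # placeholder for a learned conditional prior

class HighLevelAgent:
    """RL agent acting in the skill space; its exploration is biased toward the prior."""
    def act(self, state: np.ndarray, prior: SkillPrior) -> np.ndarray:
        z_prior = prior.sample(state)
        return z_prior + 0.1 * np.random.randn(SKILL_DIM)  # placeholder learned offset

class SkillDecoder:
    """Decodes a latent skill z into a short sequence of low-level actions."""
    def decode(self, z: np.ndarray, state: np.ndarray) -> np.ndarray:
        return np.tanh(np.random.randn(SKILL_HORIZON, ACTION_DIM))  # placeholder rollout

class ResidualPolicy:
    """Low-level policy adding fine-grained corrections for unseen task variations."""
    def correction(self, state: np.ndarray, skill_action: np.ndarray) -> np.ndarray:
        return 0.05 * np.random.randn(ACTION_DIM)  # placeholder residual

def rollout_one_skill(state, agent, prior, decoder, residual):
    """One high-level decision: sample a skill, decode it, apply per-step residuals."""
    z = agent.act(state, prior)
    planned_actions = decoder.decode(z, state)
    executed = []
    for a_skill in planned_actions:
        a = a_skill + residual.correction(state, a_skill)  # final action sent to the robot
        executed.append(a)
        # state = env.step(a)  # environment interaction omitted in this sketch
    return np.stack(executed)

if __name__ == "__main__":
    s0 = np.zeros(STATE_DIM)
    actions = rollout_one_skill(s0, HighLevelAgent(), SkillPrior(), SkillDecoder(), ResidualPolicy())
    print(actions.shape)  # (SKILL_HORIZON, ACTION_DIM)
```

The key composition is the last loop: the skill decoder provides a coarse action plan from the sampled latent, and the residual policy adds a small per-step correction, which is how the approach can adapt skills to task variations not seen when the skill space was built.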
