Paper Title

HMRL: Hyper-Meta Learning for Sparse Reward Reinforcement Learning Problem

Paper Authors

Yun Hua, Xiangfeng Wang, Bo Jin, Wenhao Li, Junchi Yan, Xiaofeng He, Hongyuan Zha

Paper Abstract

In spite of the success of existing meta reinforcement learning methods, they still have difficulty learning a meta policy effectively for RL problems with sparse reward. In this respect, we develop a novel meta reinforcement learning framework called Hyper-Meta RL (HMRL) for sparse-reward RL problems. It consists of three modules: a cross-environment meta state embedding module, which constructs a common meta state space to adapt to different environments; meta-state-based, environment-specific meta reward shaping, which effectively extends the original sparse reward trajectories through cross-environment knowledge complementarity; and, as a consequence, the meta policy, which achieves better generalization and efficiency with the shaped meta reward. Experiments in sparse-reward environments show the superiority of HMRL in both transferability and policy learning efficiency.
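
To make the two core ideas in the abstract concrete, here is a minimal PyTorch sketch (not the authors' code): a shared embedding network maps raw states from different environments into a common meta state space, and an environment-specific potential network over those meta states shapes the sparse extrinsic reward. All module names, layer sizes, and the potential-based shaping form are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch, assuming a shared state encoder and potential-based reward
# shaping over meta states; names and architecture are hypothetical.
import torch
import torch.nn as nn


class MetaStateEmbedding(nn.Module):
    """Maps raw states from different environments into a common meta state space."""

    def __init__(self, state_dim: int, meta_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, meta_dim)
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)


class MetaRewardShaper(nn.Module):
    """Environment-specific potential over meta states, used to densify a sparse reward."""

    def __init__(self, meta_dim: int = 32):
        super().__init__()
        self.potential = nn.Sequential(
            nn.Linear(meta_dim, 64), nn.ReLU(), nn.Linear(64, 1)
        )

    def shaped_reward(self, r_sparse, z, z_next, gamma: float = 0.99):
        # Potential-based shaping: r + gamma * phi(z') - phi(z).
        return r_sparse + gamma * self.potential(z_next) - self.potential(z)


# Toy usage: embed a transition and shape a zero (sparse) reward.
embed = MetaStateEmbedding(state_dim=8)
shaper = MetaRewardShaper()
s, s_next = torch.randn(1, 8), torch.randn(1, 8)
r = shaper.shaped_reward(torch.zeros(1, 1), embed(s), embed(s_next))
print(r.shape)  # torch.Size([1, 1])
```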
