Paper Title

Curriculum-based Asymmetric Multi-task Reinforcement Learning

Paper Authors

Hanchi Huang, Deheng Ye, Li Shen, Wei Liu

Paper Abstract

We introduce CAMRL, the first curriculum-based asymmetric multi-task learning (AMTL) algorithm for dealing with multiple reinforcement learning (RL) tasks altogether. To mitigate the negative influence of customizing the one-off training order in curriculum-based AMTL, CAMRL switches its training mode between parallel single-task RL and asymmetric multi-task RL (MTRL), according to an indicator regarding the training time, the overall performance, and the performance gap among tasks. To leverage the multi-sourced prior knowledge flexibly and to reduce negative transfer in AMTL, we customize a composite loss with multiple differentiable ranking functions and optimize the loss through alternating optimization and the Frank-Wolfe algorithm. The uncertainty-based automatic adjustment of hyper-parameters is also applied to eliminate the need for laborious hyper-parameter analysis during optimization. By optimizing the composite loss, CAMRL predicts the next training task and continuously revisits the transfer matrix and network weights. We have conducted experiments on a wide range of benchmarks in multi-task RL, covering Gym-minigrid, Meta-world, Atari video games, vision-based PyBullet tasks, and RLBench, to show the improvements of CAMRL over the corresponding single-task RL algorithm and state-of-the-art MTRL algorithms. The code is available at: https://github.com/huanghanchi/CAMRL
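The abstract packs several mechanisms into one paragraph; the sketch below unpacks two of them in simplified form. It is a minimal Python/PyTorch illustration, not CAMRL's implementation (see the linked repository for that): `should_switch_to_amtrl` is a hypothetical indicator combining training time, overall performance, and the performance gap among tasks, with made-up names and thresholds, and `UncertaintyWeightedLoss` shows generic uncertainty-based loss weighting in the style of Kendall et al. (2018), standing in for the paper's more elaborate composite ranking loss.

```python
import torch
import torch.nn as nn


def should_switch_to_amtrl(step, returns, *, min_steps=10_000,
                           perf_floor=0.3, gap_ceiling=0.5):
    """Hypothetical mode-switch indicator (thresholds are illustrative):
    move from parallel single-task RL to asymmetric MTRL once
    (i) enough training time has elapsed, (ii) overall (normalized)
    performance is high enough to transfer from, and (iii) the
    performance gap among tasks is not too large."""
    mean_return = sum(returns) / len(returns)
    gap = max(returns) - min(returns)
    return step >= min_steps and mean_return >= perf_floor and gap <= gap_ceiling


class UncertaintyWeightedLoss(nn.Module):
    """Generic uncertainty-based weighting of K loss terms: each term i
    gets a learnable log-variance s_i, and the combined loss is
    sum_i exp(-s_i) * L_i + s_i, so the weights are optimized jointly
    with the network instead of being tuned by hand."""

    def __init__(self, num_terms):
        super().__init__()
        self.log_vars = nn.Parameter(torch.zeros(num_terms))

    def forward(self, losses):
        losses = torch.stack(list(losses))
        return (torch.exp(-self.log_vars) * losses + self.log_vars).sum()


if __name__ == "__main__":
    # Toy check: three scalar loss terms, combined and back-propagated.
    combiner = UncertaintyWeightedLoss(num_terms=3)
    terms = [torch.tensor(1.2), torch.tensor(0.4), torch.tensor(0.9)]
    total = combiner(terms)
    total.backward()  # gradients flow into the learnable log-variances
    print(total.item(), combiner.log_vars.grad)

    print(should_switch_to_amtrl(20_000, returns=[0.6, 0.4, 0.5]))
```

In practice one would feed per-task RL losses into the combiner and gate the asymmetric-transfer updates on the indicator; CAMRL's actual composite loss additionally includes the differentiable ranking terms optimized with alternating optimization and Frank-Wolfe, which this sketch omits.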
