Paper Title
Mingling Foresight with Imagination: Model-Based Cooperative Multi-Agent Reinforcement Learning
Paper Authors
Paper Abstract
Recently, model-based agents have outperformed model-free ones under the same computational budget and training time in single-agent environments. However, due to the complexity of multi-agent systems, learning a model of the environment is difficult, and the resulting compounding error can hinder the learning process when model-based methods are applied to multi-agent tasks. This paper proposes an implicit model-based multi-agent reinforcement learning method built on value decomposition. Under this method, agents interact with a learned virtual environment and evaluate the current state value according to imagined future states in the latent space, giving them foresight. Our approach can be combined with any multi-agent value decomposition method. Experimental results show that our method improves sample efficiency across different partially observable Markov decision process domains.
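A minimal sketch of the idea described in the abstract, not the authors' implementation: each agent encodes its observation into a latent state, rolls it forward with a learned dynamics model to "imagine" future states, and conditions its utility on both the current and imagined latents; a QMIX-style monotonic mixer (one concrete value decomposition method) then combines per-agent utilities. All module names, layer sizes, and the rollout horizon here are illustrative assumptions.

```python
import torch
import torch.nn as nn

LATENT, HORIZON, N_ACTIONS = 32, 3, 5  # assumed sizes for illustration

class LatentDynamics(nn.Module):
    """Predicts the next latent state from the current one."""
    def __init__(self):
        super().__init__()
        self.step = nn.Sequential(nn.Linear(LATENT, 64), nn.ReLU(),
                                  nn.Linear(64, LATENT))

    def imagine(self, z, horizon=HORIZON):
        futures = []
        for _ in range(horizon):
            z = self.step(z)               # one imagined transition in latent space
            futures.append(z)
        return torch.cat(futures, dim=-1)  # (batch, LATENT * horizon)

class ForesightAgent(nn.Module):
    """Per-agent utility conditioned on current and imagined latents."""
    def __init__(self, obs_dim):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, LATENT)
        self.dynamics = LatentDynamics()
        self.q_head = nn.Linear(LATENT * (1 + HORIZON), N_ACTIONS)

    def forward(self, obs):
        z = torch.relu(self.encoder(obs))
        imagined = self.dynamics.imagine(z)  # foresight via latent imagination
        return self.q_head(torch.cat([z, imagined], dim=-1))

class Mixer(nn.Module):
    """Monotonic mixing of chosen per-agent utilities, as in QMIX."""
    def __init__(self, n_agents, state_dim):
        super().__init__()
        self.w = nn.Linear(state_dim, n_agents)  # state-conditioned weights
        self.b = nn.Linear(state_dim, 1)

    def forward(self, agent_qs, state):
        w = torch.abs(self.w(state))             # non-negative weights enforce monotonicity
        return (w * agent_qs).sum(-1, keepdim=True) + self.b(state)

if __name__ == "__main__":
    n_agents, obs_dim, state_dim = 2, 10, 20
    agents = [ForesightAgent(obs_dim) for _ in range(n_agents)]
    mixer = Mixer(n_agents, state_dim)
    obs = torch.randn(4, n_agents, obs_dim)      # batch of local observations
    state = torch.randn(4, state_dim)            # global state used by the mixer
    qs = torch.stack([a(obs[:, i]) for i, a in enumerate(agents)], dim=1)
    chosen = qs.max(dim=-1).values               # greedy per-agent utilities
    print(mixer(chosen, state).shape)            # -> torch.Size([4, 1])
```

In this sketch the dynamics model is "implicit" only in the sense that rollouts stay in latent space and are never decoded back to observations, which is what lets value estimation draw on imagined futures without accumulating reconstruction error.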