Bayesian Meta-reinforcement Learning for Traffic Signal Control

Paper Title

Bayesian Meta-reinforcement Learning for Traffic Signal Control

Authors

Yayi Zou, Zhiwei Qin

Abstract

In recent years, there has been growing interest in meta-reinforcement learning methods for traffic signal control, which have achieved better performance than traditional control methods. However, previous methods lack robustness in adaptation and stability during training in complex situations, which largely limits their application to real-world traffic signal control. In this paper, we propose BM-DQN, a novel value-based Bayesian meta-reinforcement learning framework that robustly speeds up learning in new scenarios by using well-trained prior knowledge learned from existing scenarios. The framework builds on our proposed fast-adaptation variant of Gradient-EM Bayesian meta-learning and on the fast-update advantage of DQN, which together allow fast adaptation to new scenarios with continual learning ability and robustness to uncertainty. Experiments on restricted 2D navigation and traffic signal control show that the proposed framework adapts more quickly and robustly to new scenarios than previous methods and, in particular, has much better continual learning ability in heterogeneous scenarios.
