Paper Title

Graph Neural Network based Agent in Google Research Football

Authors

Yizhan Niu, Jinglong Liu, Yuhao Shi, Jiren Zhu

Abstract

Deep neural networks (DNNs) can approximate value functions or policies for reinforcement learning, which makes reinforcement learning algorithms more powerful. However, some DNNs, such as convolutional neural networks (CNNs), either cannot extract enough information or take too long to obtain enough features from the inputs under the specific circumstances of reinforcement learning. For example, the input data of Google Research Football, a reinforcement learning environment that trains agents to play football, is a small map of the players' locations. The information is contained not only in the players' coordinates but also in the relationships between different players. CNNs either fail to extract enough of this information or take too long to train. To address this issue, this paper proposes a deep Q-learning network (DQN) with a graph neural network (GNN) as its model. The GNN transforms the input data into a graph that better represents the football players' locations, so that it extracts more information about the interactions between different players. With two GNNs approximating its local and target value functions, this DQN allows players to learn from their experience by using the value functions to see the prospective value of each intended action. The proposed model demonstrates the power of GNNs in the football game by outperforming other DRL models with significantly fewer steps.
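The abstract's core idea, turning player positions into a graph, aggregating neighbour information with a GNN, and reading out Q-values while keeping a local/target network pair, can be sketched minimally in NumPy. This is an illustrative assumption rather than the paper's actual architecture: `build_graph`, `GraphQNetwork`, `soft_update`, and the distance-weighted adjacency are all hypothetical choices, and the 19-action output merely mirrors Google Research Football's default discrete action set.

```python
import numpy as np

def build_graph(positions):
    """Fully connect players; edge weight = inverse distance (illustrative choice)."""
    n = len(positions)
    adj = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                adj[i, j] = 1.0 / (1.0 + np.linalg.norm(positions[i] - positions[j]))
    # Row-normalize so each node averages over its neighbours.
    return adj / adj.sum(axis=1, keepdims=True)

class GraphQNetwork:
    """One round of message passing, mean-pooled into a linear Q-value head."""
    def __init__(self, in_dim, hidden_dim, n_actions, seed=0):
        rng = np.random.default_rng(seed)
        self.W_msg = rng.normal(0.0, 0.1, (in_dim, hidden_dim))
        self.W_q = rng.normal(0.0, 0.1, (hidden_dim, n_actions))

    def forward(self, node_feats, adj):
        h = np.tanh(adj @ node_feats @ self.W_msg)  # aggregate neighbour features
        pooled = h.mean(axis=0)                     # graph-level readout
        return pooled @ self.W_q                    # one Q-value per action

def soft_update(local, target, tau=0.01):
    """Nudge the target network's weights toward the local network's weights."""
    target.W_msg = (1 - tau) * target.W_msg + tau * local.W_msg
    target.W_q = (1 - tau) * target.W_q + tau * local.W_q

# Example: 22 players with 2-D coordinates, scored over 19 candidate actions.
positions = np.random.default_rng(1).uniform(-1.0, 1.0, (22, 2))
q_values = GraphQNetwork(in_dim=2, hidden_dim=8, n_actions=19).forward(
    positions, build_graph(positions))
```

In a full DQN loop the agent would pick `q_values.argmax()` (or explore), store the transition in a replay buffer, and periodically call `soft_update` so the target value estimates change slowly.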
