Paper title
Leveraging class abstraction for commonsense reinforcement learning via residual policy gradient methods
Paper authors
Paper abstract
Enabling reinforcement learning (RL) agents to leverage a knowledge base while learning from experience promises to advance RL in knowledge-intensive domains. However, it has proven difficult to leverage knowledge that is not manually tailored to the environment. We propose to use the subclass relationships present in open-source knowledge graphs to abstract away from specific objects. We develop a residual policy gradient method that is able to integrate knowledge across different abstraction levels in the class hierarchy. Our method results in improved sample efficiency and generalisation to unseen objects in commonsense games, but we also investigate failure modes, such as excessive noise in the extracted class knowledge or environments with little class structure.
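To make the residual-policy idea concrete, here is a minimal sketch, assuming the common formulation in which a knowledge-derived prior policy and a learned residual are combined in logit space; the function names are illustrative and do not reflect the paper's actual implementation.

```python
import numpy as np

def residual_policy(prior_logits, residual_logits):
    """Combine prior logits (e.g. derived from class-level knowledge)
    with learned residual logits into a single softmax policy.

    With a zero residual, the policy equals the prior; the residual
    learns corrections where the prior is wrong or too coarse.
    """
    logits = np.asarray(prior_logits, dtype=float) + np.asarray(residual_logits, dtype=float)
    logits -= logits.max()          # shift for numerical stability
    exp = np.exp(logits)
    return exp / exp.sum()

# Example: a class-level prior that favours action 0; an untrained
# (zero) residual leaves the prior policy unchanged.
prior = [2.0, 0.0, 0.0]
probs = residual_policy(prior, [0.0, 0.0, 0.0])
```

In such a scheme, the policy gradient is taken only with respect to the residual parameters, so learning starts from the knowledge-informed prior rather than from a uniform policy, which is one plausible route to the sample-efficiency gains the abstract describes.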