Systematic Generalisation through Task Temporal Logic and Deep Reinforcement Learning

Paper title

Systematic Generalisation through Task Temporal Logic and Deep Reinforcement Learning

Authors

Borja G. León, Murray Shanahan, Francesco Belardinelli

Abstract

This work introduces a neuro-symbolic agent that combines deep reinforcement learning (DRL) with temporal logic (TL) to achieve systematic zero-shot, i.e., never-seen-before, generalisation of formally specified instructions. In particular, we present a neuro-symbolic framework where a symbolic module transforms TL specifications into a form that helps the training of a DRL agent targeting generalisation, while a neural module learns systematically to solve the given tasks. We study the emergence of systematic learning in different settings and find that the architecture of the convolutional layers is key when generalising to new instructions. We also provide evidence that systematic learning can emerge with abstract operators such as negation when learning from a few training examples, which previous research has struggled with.
