Paper Title

Deep Active Inference for Partially Observable MDPs

Authors

Otto van der Himst, Pablo Lanillos

Abstract

Deep active inference has been proposed as a scalable approach to perception and action that deals with large policy and state spaces. However, current models are limited to fully observable domains. In this paper, we describe a deep active inference model that can learn successful policies directly from high-dimensional sensory inputs. The deep learning architecture optimizes a variant of the expected free energy and encodes the continuous state representation by means of a variational autoencoder. We show, in the OpenAI benchmark, that our approach has comparable or better performance than deep Q-learning, a state-of-the-art deep reinforcement learning algorithm.
