Paper Title
Improved Memories Learning
Paper Authors
Paper Abstract
We propose Improved Memories Learning (IMeL), a novel algorithm that turns reinforcement learning (RL) into a supervised learning (SL) problem and delimits the role of neural networks (NN) to interpolation. IMeL consists of two components. The first is a reservoir of experiences. Each experience is updated based on a non-parametric procedural improvement of the policy, computed as a bounded one-sample Monte Carlo estimate. The second is an NN regressor, which receives as input improved experiences from the reservoir (context points) and computes the policy by interpolation. The NN learns to measure the similarity between states in order to compute long-term forecasts by averaging experiences, rather than by encoding the problem structure in the NN parameters. We present preliminary results and propose IMeL as a baseline method for assessing the merits of more complex models and inductive biases.
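The abstract describes two components: a reservoir of experiences improved by a bounded one-sample Monte Carlo estimate, and a regressor that computes the policy by interpolating over context points. The Python sketch below is only an illustration of that structure under stated assumptions, not the authors' implementation: the names (`ExperienceReservoir`, `interpolate_policy`), the form of the improvement step, and the use of a fixed RBF kernel in place of IMeL's learned NN similarity measure are all hypothetical.

```python
import numpy as np


class ExperienceReservoir:
    """Fixed-size store of improved experiences (state, action)."""

    def __init__(self, capacity, step_size=0.1, rng=None):
        self.capacity = capacity
        self.step_size = step_size  # hypothetical improvement step size
        self.items = []
        self.rng = rng or np.random.default_rng(0)

    def add(self, state, action, mc_return, value_baseline, clip=1.0):
        # Hypothetical improvement step: shift the stored action along a
        # bounded one-sample Monte Carlo advantage estimate.
        advantage = np.clip(mc_return - value_baseline, -clip, clip)
        improved_action = np.asarray(action) + self.step_size * advantage
        if len(self.items) < self.capacity:
            self.items.append((np.asarray(state), improved_action))
        else:
            # Reservoir-style replacement keeps the store at a fixed size.
            j = self.rng.integers(0, self.capacity)
            self.items[j] = (np.asarray(state), improved_action)

    def sample_context(self, n):
        # Draw a random subset of stored experiences as context points.
        idx = self.rng.choice(len(self.items), size=min(n, len(self.items)),
                              replace=False)
        states = np.stack([self.items[i][0] for i in idx])
        actions = np.stack([self.items[i][1] for i in idx])
        return states, actions


def interpolate_policy(query_state, ctx_states, ctx_actions, length_scale=1.0):
    """Similarity-weighted average of improved context actions.

    Stands in for the NN regressor: similarity here is a fixed RBF kernel,
    whereas in IMeL the similarity between states is learned.
    """
    sq_dist = np.sum((ctx_states - np.asarray(query_state)) ** 2, axis=1)
    weights = np.exp(-0.5 * sq_dist / length_scale ** 2)
    weights /= weights.sum()
    return weights @ ctx_actions


# Usage: store an improved experience, then query the interpolated policy.
reservoir = ExperienceReservoir(capacity=1000)
reservoir.add(state=np.zeros(4), action=np.zeros(2),
              mc_return=1.3, value_baseline=1.0)
ctx_s, ctx_a = reservoir.sample_context(64)
print(interpolate_policy(np.zeros(4), ctx_s, ctx_a))
```

Replacing the fixed kernel with a trained network over the context points would recover the division of labor the abstract emphasizes: improvement happens non-parametrically in the reservoir, while the network is confined to interpolation.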