Paper Title

A Study of Continual Learning Methods for Q-Learning

Authors

Benedikt Bagus, Alexander Gepperth

Abstract

We present an empirical study on the use of continual learning (CL) methods in a reinforcement learning (RL) scenario, which, to the best of our knowledge, has not been described before. CL is a very active recent research topic concerned with machine learning under non-stationary data distributions. Although this naturally applies to RL, the use of dedicated CL methods is still uncommon. This may be due to the fact that CL methods often assume a decomposition of the CL problem into disjoint sub-tasks of stationary distribution, that the onset of these sub-tasks is known, and that the sub-tasks are non-contradictory. In this study, we perform an empirical comparison of selected CL methods in an RL problem where a physically simulated robot must follow a racetrack by vision. In order to make CL methods applicable, we restrict the RL setting and introduce non-conflicting sub-tasks of known onset, which are however not disjoint and whose distribution, from the learner's point of view, is still non-stationary. Our results show that dedicated CL methods can significantly improve learning when compared to the baseline technique of "experience replay".
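For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of experience replay for Q-learning: a FIFO buffer of transitions sampled uniformly at random to decorrelate consecutive experiences. The buffer capacity, batch size, and the tabular Q-update shown here are illustrative assumptions, not the paper's exact configuration.

```python
import random
from collections import deque

class ReplayBuffer:
    """FIFO store of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)  # oldest transitions evicted first

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling breaks the temporal correlation between
        # consecutive transitions -- the core idea of experience replay.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

# Hypothetical usage with a tabular Q-function, where Q is a dict mapping
# each state to a list of per-action values:
def q_learning_update(Q, batch, alpha=0.1, gamma=0.99):
    for s, a, r, s2, done in batch:
        target = r if done else r + gamma * max(Q[s2])
        Q[s][a] += alpha * (target - Q[s][a])
```

Dedicated CL methods, as compared in the paper, replace or augment this uniform-replay scheme to better cope with the non-stationary distribution the learner observes.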
