Paper Title
Is Q-Learning Provably Efficient? An Extended Analysis
Paper Authors
Paper Abstract
This work extends the analysis of the theoretical results presented in the paper "Is Q-Learning Provably Efficient?" by Jin et al. We survey related research to put in context the need for stronger theoretical guarantees for what is perhaps one of the most important threads of model-free reinforcement learning. We also expand on the reasoning used in the proofs to highlight the critical steps leading to the main result: Q-learning with UCB exploration achieves a sample efficiency that matches the optimal regret achievable by any model-based approach, up to a single √H factor, where H is the number of steps per episode.