Paper Title
Analyzing Reinforcement Learning Benchmarks with Random Weight Guessing
Paper Authors
Paper Abstract
We propose a novel method for analyzing and visualizing the complexity of standard reinforcement learning (RL) benchmarks based on score distributions. A large number of policy networks are generated by randomly guessing their parameters and then evaluated on the benchmark task; the study of their aggregated results provides insights into the benchmark complexity. Our method guarantees objectivity of evaluation by sidestepping learning altogether: the policy network parameters are generated using Random Weight Guessing (RWG), making our method agnostic to (i) the classic RL setup, (ii) any learning algorithm, and (iii) hyperparameter tuning. We show that this approach isolates the environment complexity, highlights specific types of challenges, and provides a proper foundation for the statistical analysis of the task's difficulty. We test our approach on a variety of classic control benchmarks from the OpenAI Gym, where we show that small untrained networks can provide a robust baseline for a variety of tasks. The generated networks often show good performance even without gradual learning, incidentally highlighting the triviality of a few popular benchmarks.
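The RWG procedure described in the abstract can be sketched in a few lines: sample network parameters at random, evaluate each fixed (never trained) network for a few episodes, and study the resulting score distribution. The sketch below is illustrative only; it uses a hypothetical CartPole-like toy environment (`run_episode`) in place of an actual OpenAI Gym benchmark, and a single linear layer as the policy network.

```python
import numpy as np

# Hypothetical stand-in for an OpenAI Gym classic control task:
# a 4-dimensional state drifts under the chosen action, and the agent
# earns +1 reward per step while the state stays within bounds.
def run_episode(policy_weights, rng, max_steps=200):
    state = rng.uniform(-0.05, 0.05, size=4)
    total_reward = 0.0
    for _ in range(max_steps):
        # Linear policy: action is the sign of a weighted sum of observations.
        action = 1.0 if state @ policy_weights > 0 else -1.0
        state = state + 0.02 * np.array([state[1], action, state[3], -action])
        if np.abs(state).max() > 1.0:  # episode ends when state leaves bounds
            break
        total_reward += 1.0
    return total_reward

def rwg_score_distribution(n_networks=1000, episodes_per_network=5, seed=0):
    """Random Weight Guessing: sample parameters once per network, no learning."""
    rng = np.random.default_rng(seed)
    scores = np.empty(n_networks)
    for i in range(n_networks):
        w = rng.normal(size=4)  # guess the weights; they are never updated
        scores[i] = np.mean(
            [run_episode(w, rng) for _ in range(episodes_per_network)]
        )
    return scores

scores = rwg_score_distribution()
print(f"mean score: {scores.mean():.1f}, best score: {scores.max():.1f}")
```

The shape of `scores` (its spread, modes, and upper tail) is the object of study: a distribution whose tail already reaches the maximum score suggests the benchmark is trivial under random search, which is exactly the kind of diagnosis the paper draws for some popular tasks.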