Paper Title
Look where you look! Saliency-guided Q-networks for generalization in visual Reinforcement Learning
Paper Authors
Paper Abstract
Deep reinforcement learning policies, despite their outstanding efficiency in simulated visual control tasks, have shown disappointing ability to generalize across disturbances in the input training images. Changes in image statistics or distracting background elements are pitfalls that prevent generalization and real-world applicability of such control policies. We elaborate on the intuition that a good visual policy should be able to identify which pixels are important for its decision, and preserve this identification of important sources of information across images. This implies that training a policy with a small generalization gap should focus on such important pixels and ignore the others. This leads to the introduction of saliency-guided Q-networks (SGQN), a generic method for visual reinforcement learning that is compatible with any value function learning method. SGQN vastly improves the generalization capability of Soft Actor-Critic agents and outperforms existing state-of-the-art methods on the DeepMind Control Generalization benchmark, setting a new reference in terms of training efficiency, generalization gap, and policy interpretability.
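
A minimal sketch of the idea the abstract describes, assuming a PyTorch setting: gradient-based saliency of the critic's value with respect to input pixels, plus one plausible regularizer that concentrates attribution on a few pixels. The network (SimpleQNet), the discrete-action head, and the sparsity loss are illustrative assumptions, not the paper's reference implementation (SGQN itself builds on Soft Actor-Critic with continuous actions).

import torch
import torch.nn as nn

class SimpleQNet(nn.Module):
    """Toy convolutional critic: image observation -> one Q-value per action.
    (Hypothetical stand-in for the paper's critic, for illustration only.)"""
    def __init__(self, num_actions: int = 4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
        )
        self.head = nn.LazyLinear(num_actions)  # infers input size on first call

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.head(self.conv(obs).flatten(start_dim=1))

def saliency_map(q_net: nn.Module, obs: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
    """Per-pixel attribution |dQ(s,a)/ds| of the chosen action's value."""
    obs = obs.clone().requires_grad_(True)
    q_sa = q_net(obs).gather(1, action.unsqueeze(1)).sum()
    (grad,) = torch.autograd.grad(q_sa, obs)
    return grad.abs().sum(dim=1, keepdim=True)  # aggregate over color channels

def saliency_sparsity_loss(sal: torch.Tensor, keep_quantile: float = 0.95) -> torch.Tensor:
    """Penalize attribution mass falling outside the top-quantile pixels,
    nudging the critic to rely on a sparse set of important pixels."""
    thresh = torch.quantile(sal.flatten(start_dim=1), keep_quantile, dim=1)
    off_mask = (sal < thresh.view(-1, 1, 1, 1)).float()
    return (sal * off_mask).mean()

if __name__ == "__main__":
    q_net = SimpleQNet()
    obs = torch.rand(2, 3, 84, 84)           # two fake 84x84 RGB observations
    action = torch.tensor([0, 2])            # one action index per observation
    sal = saliency_map(q_net, obs, action)   # shape (2, 1, 84, 84)
    print(sal.shape, saliency_sparsity_loss(sal).item())

In practice such a term would be added to the usual critic loss with a small weight, so the value estimate and the attribution objective are trained jointly.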