Paper Title

Reconstructing Actions To Explain Deep Reinforcement Learning

Paper Authors

Xuan Chen, Zifan Wang, Yucai Fan, Bonan Jin, Piotr Mardziel, Carlee Joe-Wong, Anupam Datta

Paper Abstract

Feature attribution has been a foundational building block for explaining input feature importance in supervised learning with Deep Neural Networks (DNNs), but faces new challenges when applied to deep Reinforcement Learning (RL). We propose a new approach to explaining deep RL actions by defining a class of \emph{action reconstruction} functions that mimic the behavior of a network in deep RL. This approach allows us to answer more complex explainability questions than direct application of DNN attribution methods, which we adapt to \emph{behavior-level attributions} in building our action reconstructions. It also allows us to define \emph{agreement}, a metric for quantitatively evaluating the explainability of our methods. Our experiments on a variety of Atari games suggest that perturbation-based attribution methods are significantly more suitable than alternative attribution methods for reconstructing actions to explain the deep RL agent, and show greater \emph{agreement} than existing explainability work utilizing attention. We further show that action reconstruction allows us to demonstrate how a deep agent learns to play Pac-Man.
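The abstract introduces \emph{agreement} as a quantitative metric but does not spell out its definition in this excerpt. Below is a minimal sketch of one plausible reading, assuming agreement is the fraction of visited states on which the action implied by the reconstruction matches the action the trained agent actually took; the function name and the sample rollout values are hypothetical, not taken from the paper.

```python
import numpy as np

def agreement(agent_actions, reconstructed_actions):
    """Fraction of states on which the reconstructed action
    matches the action actually taken by the deep RL agent."""
    agent_actions = np.asarray(agent_actions)
    reconstructed_actions = np.asarray(reconstructed_actions)
    return float(np.mean(agent_actions == reconstructed_actions))

# Hypothetical usage: discrete Atari actions collected over one rollout.
agent_actions = [2, 2, 3, 0, 1, 2]   # actions chosen by the trained policy
reconstructed = [2, 2, 3, 1, 1, 2]   # actions recovered from attributions
print(agreement(agent_actions, reconstructed))  # 0.833...
```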
