Paper Title

Shortcomings of Top-Down Randomization-Based Sanity Checks for Evaluations of Deep Neural Network Explanations

Paper Authors

Alexander Binder, Leander Weber, Sebastian Lapuschkin, Grégoire Montavon, Klaus-Robert Müller, Wojciech Samek

Paper Abstract

While the evaluation of explanations is an important step towards trustworthy models, it needs to be done carefully, and the employed metrics need to be well-understood. Specifically, model randomization testing is often overestimated and regarded as a sole criterion for selecting or discarding certain explanation methods. To address shortcomings of this test, we start by observing an experimental gap in the ranking of explanation methods between randomization-based sanity checks [1] and model output faithfulness measures (e.g. [25]). We identify limitations of model-randomization-based sanity checks for the purpose of evaluating explanations. Firstly, we show that uninformative attribution maps created with zero pixel-wise covariance easily achieve high scores in this type of check. Secondly, we show that top-down model randomization preserves scales of forward pass activations with high probability. That is, channels with large activations have a high probability to contribute strongly to the output, even after randomization of the network on top of them. Hence, explanations after randomization can only be expected to differ to a certain extent. This explains the observed experimental gap. In summary, these results demonstrate the inadequacy of model-randomization-based sanity checks as a criterion to rank attribution methods.
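
The two observations in the abstract can be illustrated with a small NumPy sketch. This is a toy setup of my own, not the paper's experiments: part (1) shows that two independent noise "attribution maps" with zero pixel-wise covariance are nearly uncorrelated, so a criterion of the form "similarity should drop under randomization" is trivially satisfied by uninformative maps; part (2) shows that after re-drawing random weights for the layer on top, a channel with a large activation still dominates the output contribution on average.

```python
import numpy as np

rng = np.random.default_rng(0)

# (1) Uninformative attributions: two i.i.d. noise maps have near-zero
# correlation, so a "similarity drops after randomization" check is
# trivially passed even though neither map explains anything.
noise_map_a = rng.normal(size=10_000)
noise_map_b = rng.normal(size=10_000)
print(abs(np.corrcoef(noise_map_a, noise_map_b)[0, 1]))  # close to 0

# (2) Scale preservation under top-down randomization: channel 0 has a
# much larger activation than the others; averaging the contribution
# magnitude |w_i * a_i| over many random re-initializations of the layer
# on top, channel 0 still contributes roughly 10x more to the output.
activations = np.array([10.0, 1.0, 1.0, 1.0])
random_weights = rng.normal(size=(100_000, 4))
mean_abs_contrib = np.abs(random_weights * activations).mean(axis=0)
print(mean_abs_contrib)
```

Part (2) follows because for weights drawn from a zero-mean Gaussian, the expected contribution magnitude of a channel scales linearly with its activation, so the ranking of channel contributions survives the randomization in expectation.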
