Title
Explaining Explanations: Axiomatic Feature Interactions for Deep Networks
Authors
Abstract
Recent work has shown great promise in explaining neural network behavior. In particular, feature attribution methods explain which features were most important to a model's prediction on a given input. However, for many tasks, simply knowing which features were important may not provide enough insight to understand model behavior. The interactions between features within the model may better help us understand not only the model, but also why certain features are more important than others. In this work, we present Integrated Hessians, an extension of Integrated Gradients that explains pairwise feature interactions in neural networks. Integrated Hessians overcomes several theoretical limitations of previous methods for explaining interactions and, unlike such previous methods, is not limited to a specific architecture or class of neural network. Additionally, we find that our method is faster than existing methods when the number of features is large, and outperforms previous methods on existing quantitative benchmarks. Code is available at https://github.com/suinleelab/path_explain
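The pairwise interaction values the abstract describes can be illustrated with a small numerical sketch. This is not the authors' implementation (see the linked repository for that); it is a toy estimator, assuming the published formulation in which the off-diagonal interaction Γ_ij is a double path integral of the mixed second derivative along a scaled straight-line path from a baseline to the input. The function name and helper below are hypothetical.

```python
def integrated_hessian_offdiag(f, x, i, j, baseline=None, steps=50, h=1e-4):
    """Toy estimate of the off-diagonal (i != j) pairwise interaction:

        Gamma_ij(x) = (x_i - x'_i)(x_j - x'_j) *
                      integral over alpha, beta in [0,1] of
                      alpha * beta * d2f/dx_i dx_j (x' + alpha*beta*(x - x'))

    approximated with a midpoint double Riemann sum; the mixed second
    derivative is approximated with central finite differences.
    """
    if baseline is None:
        baseline = [0.0] * len(x)  # common choice: an all-zeros baseline

    def mixed_partial(z):
        # Central-difference estimate of d2f / dx_i dx_j at point z.
        zpp = list(z); zpp[i] += h; zpp[j] += h
        zpm = list(z); zpm[i] += h; zpm[j] -= h
        zmp = list(z); zmp[i] -= h; zmp[j] += h
        zmm = list(z); zmm[i] -= h; zmm[j] -= h
        return (f(zpp) - f(zpm) - f(zmp) + f(zmm)) / (4.0 * h * h)

    total = 0.0
    for a in range(steps):
        alpha = (a + 0.5) / steps        # midpoint of each alpha cell
        for b in range(steps):
            beta = (b + 0.5) / steps     # midpoint of each beta cell
            point = [bl + alpha * beta * (xi - bl)
                     for xi, bl in zip(x, baseline)]
            total += alpha * beta * mixed_partial(point)
    total /= steps * steps               # multiply by the cell area 1/steps^2
    return (x[i] - baseline[i]) * (x[j] - baseline[j]) * total

# Toy model f(x) = x0 * x1: its only interaction is between features 0 and 1.
gamma = integrated_hessian_offdiag(lambda v: v[0] * v[1], [2.0, 3.0], 0, 1)
print(gamma)  # analytically x0 * x1 / 4 = 1.5 for this quadratic model
```

In practice one would compute the Hessian with automatic differentiation (e.g. nested gradient tapes) rather than finite differences, and sample the path far more efficiently; this sketch only makes the double path integral concrete.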