Paper Title
High Dimensional Model Explanations: an Axiomatic Approach
Authors
Abstract
Complex black-box machine learning models are regularly used in critical decision-making domains, which has prompted several calls for algorithmic explainability. Many explanation algorithms proposed in the literature assign importance to each feature individually. However, such explanations fail to capture the joint effects of sets of features. Indeed, few works so far formally analyze high-dimensional model explanations. In this paper, we propose a novel high-dimensional model explanation method that captures the joint effect of feature subsets. We propose a new axiomatization for a generalization of the Banzhaf index; our method can also be thought of as an approximation of a black-box model by a higher-order polynomial. In other words, this work justifies the use of the generalized Banzhaf index as a model explanation by showing that it uniquely satisfies a set of natural desiderata and that it is the optimal local approximation of a black-box model. Our empirical evaluation highlights how the measure captures desirable behavior, whereas other measures that do not satisfy our axioms behave unpredictably.
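To make the idea of a joint effect of a feature subset concrete, here is a minimal sketch of the standard Banzhaf interaction index from cooperative game theory, computed by brute force. It assumes that this is the kind of generalization of the Banzhaf index the abstract refers to; the abstract itself gives no formula. The names banzhaf_interaction, subsets, and toy_value are hypothetical, and the toy model f(x) = x0*x1 + x2 with an all-zeros baseline is an illustration only, not a model from the paper.

```python
from itertools import chain, combinations


def subsets(items):
    """Yield every subset of `items` as a tuple (including the empty one)."""
    items = list(items)
    return chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )


def banzhaf_interaction(value, n, S):
    """Brute-force Banzhaf interaction index of feature set S.

    Averages the discrete derivative of `value` with respect to S over
    all coalitions T of the remaining features. Exponential in n, so
    this is only practical for small toy examples.
    """
    S = frozenset(S)
    rest = [i for i in range(n) if i not in S]
    total = 0.0
    for T in subsets(rest):
        T = frozenset(T)
        for L in subsets(S):
            # Inclusion-exclusion sign of the discrete derivative.
            sign = (-1) ** (len(S) - len(L))
            total += sign * value(T | frozenset(L))
    return total / 2 ** len(rest)


def toy_value(coalition):
    """Hypothetical model f(x) = x0*x1 + x2 evaluated at the all-ones
    point, with features outside `coalition` reset to a baseline of 0."""
    x = [1.0 if i in coalition else 0.0 for i in range(3)]
    return x[0] * x[1] + x[2]


if __name__ == "__main__":
    for S in ({0}, {2}, {0, 1}, {0, 2}):
        print(sorted(S), banzhaf_interaction(toy_value, 3, S))
```

In this toy setting the pair {0, 1} receives a nonzero interaction score because the two features only matter together, while the pair {0, 2} scores zero because their contributions are purely additive. A per-feature attribution alone cannot express that distinction.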