Paper Title
Calibrate to Interpret
Paper Authors
Paper Abstract
Trustworthy machine learning is driving a large number of ML community works aimed at improving ML acceptance and adoption. The main aspects of trustworthy machine learning are the following: fairness, uncertainty, robustness, explainability, and formal guarantees. Each of these individual domains has gained the ML community's interest, as is visible from the number of related publications. However, few works tackle the interconnections between these fields. In this paper, we show a first link between uncertainty and explainability by studying the relation between calibration and interpretation. As the calibration of a given model changes the way it scores samples, and interpretation approaches often rely on these scores, it seems safe to assume that the confidence-calibration of a model interacts with our ability to interpret such a model. In this paper, we show, in the context of networks trained on image classification tasks, to what extent interpretations are sensitive to confidence-calibration. This leads us to suggest a simple practice to improve interpretation outcomes: Calibrate to Interpret.
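The abstract's core observation, that calibration rescales a model's output scores while interpretation methods consume those same scores, can be sketched with temperature scaling, a common post-hoc calibration technique. This is an illustrative example, not the paper's method; the logits and temperature value are made up for the demonstration.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def calibrate_logits(logits, temperature):
    """Temperature scaling: divide logits by T > 1 to soften
    over-confident predictions before computing probabilities."""
    return softmax(logits / temperature)

# Hypothetical over-confident logits for a 3-class classifier.
logits = np.array([4.0, 1.0, 0.5])

uncalibrated = softmax(logits)
calibrated = calibrate_logits(logits, temperature=2.0)

# The argmax (predicted class) is unchanged, but the confidence drops;
# any interpretation method that reads these scores sees different values.
print(uncalibrated.round(3))
print(calibrated.round(3))
```

Because the ranking of classes is preserved while the score magnitudes change, score-based interpretation methods can produce different explanations before and after calibration, which is the interaction the paper studies.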