Paper title
Shapley value-based approaches to explain the robustness of classifiers in machine learning
Authors
Abstract
The use of algorithm-agnostic approaches is an emerging area of research for explaining the contribution of individual features towards the predicted outcome. Whilst the focus has been on explaining the prediction itself, little has been done to explain the robustness of these models, that is, how each feature contributes towards achieving that robustness. In this paper, we propose the use of Shapley values to explain the contribution of each feature towards the model's robustness, measured in terms of the Receiver Operating Characteristic (ROC) curve and the Area Under the ROC Curve (AUC). With the help of an illustrative example, we demonstrate the proposed idea of explaining the ROC curve and visualising the uncertainties in these curves. For imbalanced datasets, the Precision-Recall Curve (PRC) is considered more appropriate, so we also demonstrate how to explain PRCs with the help of Shapley values. The explanation of robustness can help analysts in a number of ways. For example, it can support feature selection by identifying irrelevant features that can be removed to reduce computational complexity. It can also help in identifying the features that make critical or negative contributions towards robustness.
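The idea of attributing AUC to individual features can be illustrated with a minimal sketch. It is not the paper's implementation: the value function, the toy scorer (which simply sums the selected features), the synthetic data, and the names `auc`, `subset_auc`, and `shapley_auc` are all assumptions made for illustration. The sketch computes exact Shapley values by enumerating every feature coalition, which is only feasible for a handful of features.

```python
import itertools
import math
import random


def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random
    negative, with ties counted as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def subset_auc(X, y, subset):
    """Hypothetical value function v(S): AUC of a toy scorer that sums the
    features in S. The empty coalition is assigned chance level, 0.5."""
    if not subset:
        return 0.5
    scores = [sum(row[j] for j in subset) for row in X]
    return auc(y, scores)


def shapley_auc(X, y):
    """Exact Shapley value of each feature with respect to v(S) = AUC."""
    n = len(X[0])
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # k = size of the coalition S that excludes i
            weight = (math.factorial(k) * math.factorial(n - k - 1)
                      / math.factorial(n))
            for S in itertools.combinations(others, k):
                gain = subset_auc(X, y, S + (i,)) - subset_auc(X, y, S)
                phi[i] += weight * gain
    return phi


# Synthetic data (an assumption): feature 0 is a strong signal,
# feature 1 a weak, noisy signal, and feature 2 pure noise.
rng = random.Random(0)
y = [rng.randint(0, 1) for _ in range(200)]
X = [[label + rng.gauss(0, 0.5), label + rng.gauss(0, 2.0), rng.gauss(0, 1.0)]
     for label in y]

phi = shapley_auc(X, y)
full_auc = subset_auc(X, y, (0, 1, 2))
print("Shapley contributions to AUC:", [round(p, 3) for p in phi])
print("Full-model AUC:", round(full_auc, 3))
```

By the efficiency property of Shapley values, the contributions sum to the full-model AUC minus the chance-level baseline of 0.5. A feature with a near-zero or negative value in this sketch is a candidate for removal, which mirrors the feature-selection use described in the abstract.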