Paper Title

On the Safety of Interpretable Machine Learning: A Maximum Deviation Approach

Paper Authors

Wei, Dennis; Nair, Rahul; Dhurandhar, Amit; Varshney, Kush R.; Daly, Elizabeth M.; Singh, Moninder

Paper Abstract

Interpretable and explainable machine learning has seen a recent surge of interest. We focus on safety as a key motivation behind the surge and make the relationship between interpretability and safety more quantitative. Toward assessing safety, we introduce the concept of maximum deviation via an optimization problem to find the largest deviation of a supervised learning model from a reference model regarded as safe. We then show how interpretability facilitates this safety assessment. For models including decision trees, generalized linear and additive models, the maximum deviation can be computed exactly and efficiently. For tree ensembles, which are not regarded as interpretable, discrete optimization techniques can still provide informative bounds. For a broader class of piecewise Lipschitz functions, we leverage the multi-armed bandit literature to show that interpretability produces tighter (regret) bounds on the maximum deviation. We present case studies, including one on mortgage approval, to illustrate our methods and the insights about models that may be obtained from deviation maximization.
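To make the central quantity concrete: for a model f and a reference model f0 regarded as safe, the maximum deviation is the largest value of |f(x) - f0(x)| over the feature domain. The sketch below illustrates the simplest exactly solvable case mentioned in the abstract, two linear models over a box-shaped feature domain. It is a minimal illustration under those assumptions, not the paper's implementation: since the difference of two linear models is itself linear, its maximum and minimum over a box are attained at corners and can be found coordinate-wise.

```python
import numpy as np

def max_deviation_linear(w, b, v, c, lower, upper):
    """Exact maximum deviation between a linear model f(x) = w.x + b
    and a reference model g(x) = v.x + c over the box [lower, upper].

    The difference d(x) = (w - v).x + (b - c) is itself linear, so each
    coordinate independently picks the box endpoint that maximizes
    (or minimizes) its contribution; the absolute value is handled by
    taking the larger of |max d| and |min d|.
    """
    dw = np.asarray(w, dtype=float) - np.asarray(v, dtype=float)
    db = b - c
    d_max = db + np.sum(np.maximum(dw * lower, dw * upper))
    d_min = db + np.sum(np.minimum(dw * lower, dw * upper))
    return max(abs(d_max), abs(d_min))

# Example: the deviation 1.5 is attained at the corner x = (0, 1).
w, v = np.array([1.0, -2.0]), np.array([0.5, -1.0])
print(max_deviation_linear(w, 0.0, v, 0.5,
                           lower=np.zeros(2), upper=np.ones(2)))  # 1.5
```

The exact computations the abstract claims for decision trees and additive models plausibly rest on similar structure, maximizing the deviation piece by piece (per leaf or per feature), whereas tree ensembles, as the abstract notes, only admit bounds obtained via discrete optimization.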
