Paper Title
ALEX: Active Learning based Enhancement of a Model's Explainability
Paper Authors
Paper Abstract
An active learning (AL) algorithm seeks to construct an effective classifier with a minimal number of labeled examples in a bootstrapping manner. While standard AL heuristics exist, such as selecting for annotation those points for which a classification model yields its least confident predictions, there has been no empirical investigation into whether these heuristics lead to models that are more interpretable to humans. In the era of data-driven learning, this is an important research direction to pursue. This paper describes our work-in-progress towards developing an AL selection function that, in addition to model effectiveness, also seeks to improve the interpretability of the model during the bootstrapping steps. Concretely speaking, our proposed selection function trains an `explainer' model in addition to the classifier model, and favours those instances where, on average, a different part of the data is used to explain the predicted class. Initial experiments exhibit encouraging trends, showing that such a heuristic can lead to more effective and more explainable end-to-end data-driven classifiers.
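To make the selection heuristic more concrete, below is a minimal sketch of how an explanation-aware acquisition step could look. The `classifier` and `explainer` objects, the `explain` call, the fixed-length explanation vectors, and the distance-based scoring rule are illustrative assumptions, not the paper's exact formulation.

import numpy as np

def select_for_annotation(pool_X, classifier, explainer, batch_size=10):
    # Hedged sketch of an explanation-aware AL selection step (assumed interfaces).
    predicted = classifier.predict(pool_X)                # predicted class per unlabeled instance
    expl = np.array([explainer.explain(x, c)              # importance weights over parts of the input
                     for x, c in zip(pool_X, predicted)])
    expl = expl / (expl.sum(axis=1, keepdims=True) + 1e-12)   # normalise each explanation

    mean_expl = expl.mean(axis=0)                         # average explanation over the pool
    # Score each instance by how far its explanation is from the average one,
    # i.e. how much a *different* part of the data explains its predicted class.
    scores = np.abs(expl - mean_expl).sum(axis=1)

    return np.argsort(-scores)[:batch_size]               # indices whose explanations deviate most

In an AL loop, the returned indices would be sent for human annotation, the classifier and explainer retrained on the enlarged labeled set, and the step repeated; other divergence measures (e.g., KL divergence between explanation distributions) could replace the L1 distance used here.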