Paper Title
Towards Clear Expectations for Uncertainty Estimation
Paper Authors
Paper Abstract
While Uncertainty Quantification (UQ) is crucial to achieving trustworthy Machine Learning (ML), most UQ methods suffer from disparate and inconsistent evaluation protocols. We claim this inconsistency results from the community's unclear requirements for UQ. This opinion paper offers a new perspective by specifying those requirements through five downstream tasks in which we expect uncertainty scores to have substantial predictive power. We carefully design these downstream tasks to reflect real-life usage of ML models. On an example benchmark of 7 classification datasets, we did not observe any statistical superiority of state-of-the-art intrinsic UQ methods over simple baselines. We believe our findings question the very rationale for quantifying uncertainty and call for a standardized UQ evaluation protocol based on metrics proven to be relevant to ML practitioners.