Paper title
A general framework for ensemble distribution distillation
Paper authors
Paper abstract
Ensembles of neural networks have been shown to give better performance than single networks, both in terms of predictions and uncertainty estimation. Additionally, ensembles allow the uncertainty to be decomposed into aleatoric (data) and epistemic (model) components, giving a more complete picture of the predictive uncertainty. Ensemble distillation is the process of compressing an ensemble into a single model, often resulting in a leaner model that still outperforms the individual ensemble members. Unfortunately, standard distillation erases the natural uncertainty decomposition of the ensemble. We present a general framework for distilling both regression and classification ensembles in a way that preserves the decomposition. We demonstrate the desired behaviour of our framework and show that its predictive performance is on par with standard distillation.
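The uncertainty decomposition mentioned in the abstract can be made concrete for a classification ensemble: the entropy of the averaged predictive distribution (total uncertainty) splits into the average entropy of the members (aleatoric) plus the mutual information between the prediction and the choice of member (epistemic). The sketch below illustrates this standard decomposition with hypothetical random ensemble outputs; it is not the paper's distillation method, and all array shapes and names are illustrative assumptions.

```python
import numpy as np

def entropy(p, axis=-1):
    """Shannon entropy of categorical distributions along the given axis."""
    return -np.sum(p * np.log(np.clip(p, 1e-12, None)), axis=axis)

# Hypothetical ensemble output: M members, N inputs, K classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(5, 3, 4))  # (M, N, K)
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)

mean_probs = probs.mean(axis=0)          # ensemble predictive distribution
total = entropy(mean_probs)              # total uncertainty: H[E_m p_m]
aleatoric = entropy(probs).mean(axis=0)  # expected data uncertainty: E_m H[p_m]
epistemic = total - aleatoric            # mutual information (model disagreement)

print("total:", total)
print("aleatoric:", aleatoric)
print("epistemic:", epistemic)
```

Standard distillation fits a single network to `mean_probs` alone, so `aleatoric` and `epistemic` can no longer be separated; the framework described in the abstract distills the ensemble in a way that keeps this split recoverable.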