Paper Title
Agreement-on-the-Line: Predicting the Performance of Neural Networks under Distribution Shift
Authors
Abstract
Recently, Miller et al. showed that a model's in-distribution (ID) accuracy has a strong linear correlation with its out-of-distribution (OOD) accuracy on several OOD benchmarks, a phenomenon they dubbed "accuracy-on-the-line". While this is a useful tool for model selection (i.e., the model most likely to perform best OOD is the one with the highest ID accuracy), it does not help estimate the actual OOD performance of models without access to a labeled OOD validation set. In this paper, we show that a similar but surprising phenomenon also holds for the agreement between pairs of neural network classifiers: whenever accuracy-on-the-line holds, we observe that the OOD agreement between the predictions of any pair of neural networks (with potentially different architectures) is also strongly linearly correlated with their ID agreement. Furthermore, we observe that the slope and bias of OOD vs. ID agreement closely match those of OOD vs. ID accuracy. This phenomenon, which we call agreement-on-the-line, has important practical applications: without any labeled data, we can predict the OOD accuracy of classifiers, since OOD agreement can be estimated with just unlabeled data. Our prediction algorithm outperforms previous methods both in shifts where agreement-on-the-line holds and, surprisingly, when accuracy is not on the line. This phenomenon also provides new insights into deep neural networks: unlike accuracy-on-the-line, agreement-on-the-line appears to hold only for neural network classifiers.
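The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes hard label predictions from several models, labeled ID validation data, and unlabeled OOD data, and it fits an ordinary least-squares line on raw agreement rates (the abstract does not say what scale the fit uses). All function names here (`agreement`, `accuracy`, `fit_line`, `predict_ood_accuracy`) are hypothetical.

```python
from itertools import combinations
from statistics import mean


def agreement(preds_a, preds_b):
    """Fraction of inputs on which two models predict the same label."""
    return mean(1.0 if a == b else 0.0 for a, b in zip(preds_a, preds_b))


def accuracy(preds, labels):
    """Fraction of inputs on which a model predicts the true label."""
    return mean(1.0 if p == y else 0.0 for p, y in zip(preds, labels))


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + bias."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx  # needs at least two distinct ID agreement values
    return slope, my - slope * mx


def predict_ood_accuracy(id_preds, ood_preds, id_labels):
    """Estimate each model's OOD accuracy without OOD labels.

    id_preds / ood_preds hold one list of predicted labels per model.
    Assumption: the agreement line's slope and bias carry over to accuracy.
    """
    pairs = list(combinations(range(len(id_preds)), 2))
    id_agree = [agreement(id_preds[i], id_preds[j]) for i, j in pairs]
    ood_agree = [agreement(ood_preds[i], ood_preds[j]) for i, j in pairs]
    slope, bias = fit_line(id_agree, ood_agree)  # uses unlabeled OOD data only
    return [slope * accuracy(p, id_labels) + bias for p in id_preds]
```

Only `accuracy` touches labels, and only ID labels. The OOD side of the fit uses model predictions alone, which is why OOD agreement can be measured on unlabeled data.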