Paper Title
Stability for the Training of Deep Neural Networks and Other Classifiers
Paper Authors
Paper Abstract
We examine the stability of loss-minimizing training processes that are used for deep neural networks (DNNs) and other classifiers. While a classifier is optimized during training through a so-called loss function, the performance of classifiers is usually evaluated by some measure of accuracy, such as the overall accuracy, which quantifies the proportion of objects that are correctly classified. This leads to the guiding question of stability: does decreasing loss through training always result in increased accuracy? We formalize the notion of stability, and provide examples of instability. Our main result consists of two novel conditions on the classifier which, if either is satisfied, ensure stability of training, that is, we derive tight bounds on accuracy as loss decreases. We also derive a sufficient condition for stability on the training set alone, identifying flat portions of the data manifold as potential sources of instability. The latter condition is explicitly verifiable on the training dataset. Our results do not depend on the algorithm used for training, as long as loss decreases with training.
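To make the guiding question concrete, the following minimal numerical sketch (our own illustration, not taken from the paper) shows how average cross-entropy loss can decrease while overall accuracy drops, which is the kind of instability the abstract refers to. The probabilities, labels, and the 0.5 decision threshold are assumptions chosen purely for illustration.

import math

def avg_cross_entropy(probs, labels):
    # probs[i] = predicted probability of class 1 for sample i
    return -sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for p, y in zip(probs, labels)
    ) / len(labels)

def accuracy(probs, labels):
    # Overall accuracy: predict class 1 iff p > 0.5, count correct predictions
    preds = [1 if p > 0.5 else 0 for p in probs]
    return sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)

labels = [1, 0]          # two samples, one from each class

# "Before": both samples barely on the correct side of the decision boundary
before = [0.51, 0.49]
# "After": the first sample is classified very confidently,
# but the second has crossed to the wrong side
after = [0.99, 0.60]

print(avg_cross_entropy(before, labels), accuracy(before, labels))  # ~0.673, 1.0
print(avg_cross_entropy(after, labels), accuracy(after, labels))    # ~0.463, 0.5

Here the average loss decreases from about 0.673 to about 0.463, yet the overall accuracy falls from 100% to 50%: decreasing loss alone does not guarantee increased accuracy, which is why the paper's stability conditions are needed.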