Paper title
How does Weight Correlation Affect the Generalisation Ability of Deep Neural Networks?
Paper authors
Paper abstract
This paper studies the novel concept of weight correlation in deep neural networks and discusses its impact on the networks' generalisation ability. For fully-connected layers, the weight correlation is defined as the average cosine similarity between the weight vectors of neurons; for convolutional layers, it is defined as the cosine similarity between filter matrices. Theoretically, we show that weight correlation can, and should, be incorporated into the PAC-Bayesian framework for the generalisation of neural networks, and that the resulting generalisation bound is monotonic with respect to the weight correlation. We formulate a new complexity measure, which lifts the PAC-Bayes measure with weight correlation, and experimentally confirm that it ranks the generalisation errors of a set of networks more precisely than existing measures. More importantly, we develop a new regulariser for training, and provide extensive experiments showing that the generalisation error can be greatly reduced with our novel approach.
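To make the definition in the abstract concrete, the following is a minimal sketch of weight correlation as an average pairwise cosine similarity. It is an illustration under stated assumptions, not the paper's implementation: the function names (`cosine_similarity`, `weight_correlation`, `conv_weight_correlation`) are hypothetical, each neuron's weights are assumed to be a non-zero vector, convolutional filters are assumed to be compared after flattening each filter matrix into a vector, and the `absolute` flag (averaging the magnitudes of the similarities instead of the signed values) is an optional variant that the abstract does not specify.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def weight_correlation(weight_vectors, absolute=False):
    """Average cosine similarity over all distinct pairs of weight vectors.

    weight_vectors: one list of floats per neuron in the layer.
    absolute: assumption for illustration; if True, average |cos| instead
              of the signed cosine similarity.
    """
    pairs = list(combinations(weight_vectors, 2))
    if not pairs:
        raise ValueError("need at least two weight vectors")
    sims = [cosine_similarity(u, v) for u, v in pairs]
    if absolute:
        sims = [abs(s) for s in sims]
    return sum(sims) / len(sims)


def conv_weight_correlation(filters, absolute=False):
    """Same measure for convolutional filters given as 2-D nested lists.

    Assumption: each filter matrix is flattened row by row before comparison.
    """
    flat = [[x for row in f for x in row] for f in filters]
    return weight_correlation(flat, absolute)
```

For example, `weight_correlation([[1.0, 0.0], [0.0, 1.0]])` evaluates to zero because the two weight vectors are orthogonal, whereas two parallel vectors give a value of one up to floating-point rounding.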