Paper Title
Exploiting the Full Capacity of Deep Neural Networks while Avoiding Overfitting by Targeted Sparsity Regularization
Paper Authors
Paper Abstract
Overfitting is one of the most common problems when training deep neural networks on comparatively small datasets. Here, we demonstrate that neural network activation sparsity is a reliable indicator of overfitting, which we utilize to propose novel targeted sparsity visualization and regularization strategies. Based on these strategies, we are able to understand and counteract overfitting caused by activation sparsity and filter correlation in a targeted, layer-by-layer manner. Our results demonstrate that targeted sparsity regularization can be used efficiently to regularize well-known architectures on standard datasets, yielding a significant increase in image classification performance while outperforming both dropout and batch normalization. Ultimately, our study reveals novel insights into the contradicting concepts of activation sparsity and network capacity by demonstrating that targeted sparsity regularization enables salient and discriminative feature learning while exploiting the full capacity of deep models without suffering from overfitting, even when trained excessively.
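To make the idea of monitoring activation sparsity per layer and penalizing it in a targeted way more concrete, here is a minimal sketch in PyTorch. It is not the authors' formulation: the Hoyer-style sparsity proxy, the toy MLP, the threshold, and the penalty weight are all illustrative assumptions chosen so that the measure stays differentiable and only overly sparse layers are penalized.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def hoyer_sparsity(a, eps=1e-12):
    """Differentiable sparsity proxy in [0, 1]: ~0 for dense activations, ~1 for maximally sparse ones."""
    n = a.numel()
    l1 = a.abs().sum()
    l2 = a.pow(2).sum().sqrt() + eps
    return (n ** 0.5 - l1 / l2) / (n ** 0.5 - 1)


class MonitoredMLP(nn.Module):
    """Toy MLP that records the activation sparsity of each hidden layer on every forward pass."""

    def __init__(self, in_dim=784, hidden=256, out_dim=10):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, hidden)
        self.head = nn.Linear(hidden, out_dim)
        self.layer_sparsities = []  # per-layer sparsity of the last forward pass

    def forward(self, x):
        self.layer_sparsities = []
        h = F.relu(self.fc1(x))
        self.layer_sparsities.append(hoyer_sparsity(h))
        h = F.relu(self.fc2(h))
        self.layer_sparsities.append(hoyer_sparsity(h))
        return self.head(h)


def targeted_sparsity_penalty(sparsities, threshold=0.7, weight=1e-2):
    """Penalize only those layers whose sparsity exceeds the chosen threshold (illustrative values)."""
    return weight * sum(F.relu(s - threshold) for s in sparsities)


# Usage: track per-layer sparsity as an overfitting indicator and add the penalty to the task loss.
model = MonitoredMLP()
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
logits = model(x)
loss = F.cross_entropy(logits, y) + targeted_sparsity_penalty(model.layer_sparsities)
loss.backward()
```

In this sketch, the recorded per-layer sparsities can also be logged during training as the overfitting indicator described in the abstract, while the penalty term acts only on layers whose activations collapse toward excessive sparsity.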