Paper Title
Regularization-based Pruning of Irrelevant Weights in Deep Neural Architectures
Paper Authors
Paper Abstract
Deep neural networks exploiting millions of parameters are nowadays the norm in deep learning applications. This is a potential issue because of the great amount of computational resources needed for training, and because of the possible loss of generalization performance of overparametrized networks. We propose in this paper a method for learning sparse neural topologies via a regularization technique which identifies non-relevant weights and selectively shrinks their norm, while performing a classic update for relevant ones. This technique, which is an improvement of classical weight decay, is based on the definition of a regularization term which can be added to any loss function regardless of its form, resulting in a unified general framework exploitable in many different contexts. The actual elimination of parameters identified as irrelevant is handled by an iterative pruning algorithm. We tested the proposed technique on different image classification and natural language generation tasks, obtaining results on par with or better than competitors in terms of sparsity and metrics, while achieving strong model compression.
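The abstract outlines two ingredients: a regularization step that shrinks only the weights deemed irrelevant while relevant ones receive a classic update, and an iterative pruning step that removes the weights identified as irrelevant. The following is a minimal PyTorch-style sketch of that general idea, not the authors' implementation: the magnitude-based relevance criterion, the function names `selective_decay_step` and `prune_irrelevant`, and the parameters `relevance_threshold` and `decay_strength` are all illustrative assumptions.

```python
# Minimal sketch (assumptions, not the paper's method): a simple magnitude
# threshold stands in for the paper's relevance criterion.
import torch
import torch.nn as nn


def selective_decay_step(model, loss, lr=0.1, decay_strength=1e-3,
                         relevance_threshold=1e-2):
    """One training step: classic gradient update for every weight, plus an
    extra weight-decay-like shrinkage applied only to weights whose magnitude
    falls below the (assumed) relevance threshold."""
    model.zero_grad()
    loss.backward()
    with torch.no_grad():
        for p in model.parameters():
            p -= lr * p.grad                           # classic update for all weights
            irrelevant = p.abs() < relevance_threshold # crude stand-in for "non-relevant"
            p[irrelevant] -= lr * decay_strength * p[irrelevant]  # shrink only those


def prune_irrelevant(model, relevance_threshold=1e-2):
    """Iterative pruning step: zero out weights still below the threshold."""
    with torch.no_grad():
        for p in model.parameters():
            p[p.abs() < relevance_threshold] = 0.0


# Toy usage example on a small regression problem.
model = nn.Linear(10, 1)
x, y = torch.randn(32, 10), torch.randn(32, 1)
for epoch in range(100):
    loss = nn.functional.mse_loss(model(x), y)
    selective_decay_step(model, loss)
    if epoch % 20 == 19:   # prune periodically, mimicking the iterative scheme
        prune_irrelevant(model)
```

The sketch only illustrates the separation between "update relevant weights as usual" and "shrink, then remove, irrelevant ones"; the paper's actual relevance measure and regularization term are defined in the full text.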