Paper Title

Deep Neural Networks pruning via the Structured Perspective Regularization

Paper Authors

Matteo Cacciola, Antonio Frangioni, Xinlin Li, Andrea Lodi

Paper Abstract

In Machine Learning, Artificial Neural Networks (ANNs) are a very powerful tool, broadly used in many applications. Often, the selected (deep) architectures include many layers, and therefore a large number of parameters, which makes training, storage, and inference expensive. This has motivated a stream of research on compressing the original networks into smaller ones without excessively sacrificing performance. Among the many proposed compression approaches, one of the most popular is pruning, whereby entire elements of the ANN (links, nodes, channels, ...) and the corresponding weights are deleted. Since the nature of the problem is inherently combinatorial (which elements to prune and which not), we propose a new pruning method based on Operational Research tools. We start from a natural Mixed-Integer Programming model for the problem, and we use the Perspective Reformulation technique to strengthen its continuous relaxation. Projecting away the indicator variables from this reformulation yields a new regularization term, which we call the Structured Perspective Regularization, that leads to structured pruning of the initial architecture. We test our method on some ResNet architectures applied to the CIFAR-10, CIFAR-100, and ImageNet datasets, obtaining competitive performance with respect to the state of the art for structured pruning.
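The abstract names the perspective-reformulation-and-projection step without spelling it out, so the following is a minimal sketch of how such a regularizer can arise in the simplest per-group setting, not the paper's exact formulation: the symbols alpha, beta, M and the big-M constraint below are generic placeholders, not necessarily the paper's notation.

```latex
% Sketch (assumed notation): one prunable group g (e.g., a channel) with
% weight vector w_g and an indicator y_g, where y_g = 0 prunes the group:
\min_{y_g \in \{0,1\}} \; \alpha\, y_g + \beta\, \|w_g\|_2^2
  \quad \text{s.t.} \quad \|w_g\|_\infty \le M y_g
% Perspective reformulation: tighten the continuous relaxation
% y_g \in (0,1] by replacing \beta\|w_g\|_2^2 with its perspective
% \beta\|w_g\|_2^2 / y_g. Minimizing over y_g in closed form
% (stationary point y_g^\star = \|w_g\|_2 \sqrt{\beta/\alpha}, with the
% big-M bound assumed slack) projects the indicator away and leaves
R(w_g) =
\begin{cases}
  2\sqrt{\alpha\beta}\, \|w_g\|_2   & \text{if } \|w_g\|_2 \le \sqrt{\alpha/\beta},\\
  \alpha + \beta\, \|w_g\|_2^2      & \text{otherwise.}
\end{cases}
```

Near zero this penalty is nonsmooth in the group norm (group-lasso-like), so gradient-based training can drive entire groups exactly to zero, which is precisely structured pruning; far from zero it reverts to a smooth ridge term. A correspondingly hedged numerical sketch of that piecewise penalty follows; spr_penalty and the channel loop are illustrative names, not the paper's code.

```python
import numpy as np

def spr_penalty(w_group, alpha=1.0, beta=1.0):
    """Projected perspective penalty for one prunable group (sketch).

    Piecewise form obtained by minimizing alpha*y + beta*||w||^2/y over
    y in (0, 1], assuming the big-M bound never binds.
    """
    norm = np.linalg.norm(w_group)       # group norm ||w||_2
    threshold = np.sqrt(alpha / beta)    # where y* = ||w|| sqrt(beta/alpha) hits 1
    if norm <= threshold:
        return 2.0 * np.sqrt(alpha * beta) * norm  # group-lasso-like branch
    return alpha + beta * norm ** 2                # ridge-like branch

# Example: apply the penalty channel-wise to a (hypothetical) conv kernel,
# so each output channel is one prunable group.
kernel = np.random.randn(16, 3, 3, 3)    # (out_channels, in_channels, k, k)
total = sum(spr_penalty(kernel[c].ravel()) for c in range(kernel.shape[0]))
print(total)
```

The two branches meet continuously at the threshold (both evaluate to 2*alpha there), so the penalty can be added to a training loss without introducing jumps.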
