Paper Title

Growing Efficient Deep Networks by Structured Continuous Sparsification

Paper Authors

Xin Yuan, Pedro Savarese, Michael Maire

Paper Abstract

We develop an approach to growing deep network architectures over the course of training, driven by a principled combination of accuracy and sparsity objectives. Unlike existing pruning or architecture search techniques that operate on full-sized models or supernet architectures, our method can start from a small, simple seed architecture and dynamically grow and prune both layers and filters. By combining a continuous relaxation of discrete network structure optimization with a scheme for sampling sparse subnetworks, we produce compact, pruned networks, while also drastically reducing the computational expense of training. For example, we achieve $49.7\%$ inference FLOPs and $47.4\%$ training FLOPs savings compared to a baseline ResNet-50 on ImageNet, while maintaining $75.2\%$ top-1 accuracy -- all without any dedicated fine-tuning stage. Experiments across CIFAR, ImageNet, PASCAL VOC, and Penn Treebank, with convolutional networks for image classification and semantic segmentation, and recurrent networks for language modeling, demonstrate that we both train faster and produce more efficient networks than competing architecture pruning or search methods.
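To make the core idea concrete, here is a minimal, hypothetical PyTorch sketch of the kind of construction the abstract describes: per-filter gates given a continuous (sigmoid) relaxation of the discrete keep/prune decision, a sampling path that yields sparse subnetworks during training, and an L1 sparsity penalty on the gates. The names (`GatedConv`, `gate_logits`, `temperature`, the penalty weight) and the straight-through sampling trick are illustrative assumptions, not the paper's exact formulation, and the layer-growing schedule is not reproduced here.

```python
import torch
import torch.nn as nn

class GatedConv(nn.Module):
    """Convolution whose output filters are masked by learnable gates.
    Gates are relaxed to (0, 1) via a temperature-scaled sigmoid; sampling
    binarizes them to obtain a sparse subnetwork, and gates driven to zero
    correspond to pruned filters."""

    def __init__(self, in_ch, out_ch, kernel_size, temperature=1.0):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, padding=kernel_size // 2)
        # One learnable gate logit per output filter (architecture parameter).
        self.gate_logits = nn.Parameter(torch.zeros(out_ch))
        self.temperature = temperature

    def gates(self, sample=False):
        # Continuous relaxation of the discrete keep/prune decision.
        probs = torch.sigmoid(self.gate_logits / self.temperature)
        if sample:
            # Sample a sparse subnetwork; straight-through estimator keeps gradients.
            hard = torch.bernoulli(probs)
            return hard + probs - probs.detach()
        return probs

    def forward(self, x, sample=False):
        g = self.gates(sample)
        return self.conv(x) * g.view(1, -1, 1, 1)

def sparsity_penalty(model, weight=1e-4):
    # L1 penalty on gate values pushes filters toward removal (sparsity objective).
    return weight * sum(m.gates().sum()
                        for m in model.modules() if isinstance(m, GatedConv))
```

In this sketch, the total loss would be the task loss plus `sparsity_penalty(model)`, so training jointly optimizes accuracy and the number of active filters, in the spirit of the combined objective the abstract mentions.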
