Paper Title
FlipOut: Uncovering Redundant Weights via Sign Flipping
Paper Authors
Paper Abstract
Modern neural networks, although achieving state-of-the-art results on many tasks, tend to have a large number of parameters, which increases training time and resource usage. This problem can be alleviated by pruning. Existing methods, however, often require extensive parameter tuning or multiple cycles of pruning and retraining to convergence in order to obtain a favorable accuracy-sparsity trade-off. To address these issues, we propose a novel pruning method which uses the oscillations around $0$ (i.e. sign flips) that a weight has undergone during training in order to determine its saliency. Our method can perform pruning before the network has converged, requires little tuning effort due to having good default values for its hyperparameters, and can directly target the level of sparsity desired by the user. Our experiments, performed on a variety of object classification architectures, show that it is competitive with existing methods and achieves state-of-the-art performance for levels of sparsity of $99.6\%$ and above for most of the architectures tested. For reproducibility, we release our code publicly at https://github.com/AndreiXYZ/flipout.
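The abstract describes the core mechanism only at a high level: count how often each weight flips sign (oscillates around $0$) during training, and use that count as a saliency signal when deciding which weights to remove. Below is a minimal, hypothetical PyTorch sketch of that idea, not the authors' released implementation; the helper names (`update_flip_counts`, `prune_by_flips`) and the assumption that more sign flips indicates a more redundant weight are illustrative choices made here, not details confirmed by the abstract.

```python
import torch
import torch.nn as nn

# Toy model; any nn.Module with weight tensors would do.
model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))

# One flip counter per parameter tensor, plus the signs seen at the previous step.
flip_counts = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
prev_signs = {n: torch.sign(p.detach()) for n, p in model.named_parameters()}

def update_flip_counts(model):
    """Call after each optimizer step: increment counters where a weight changed sign."""
    for n, p in model.named_parameters():
        signs = torch.sign(p.detach())
        flip_counts[n] += (signs != prev_signs[n]).float()
        prev_signs[n] = signs

def prune_by_flips(model, sparsity):
    """Zero out the fraction `sparsity` of weights with the most recorded sign flips,
    assuming frequent oscillation around 0 marks a weight as redundant."""
    all_flips = torch.cat([flip_counts[n].flatten() for n, _ in model.named_parameters()])
    num_prune = int(sparsity * all_flips.numel())
    # Threshold chosen so that the (1 - sparsity) least-flipping weights are kept.
    threshold = torch.kthvalue(all_flips, all_flips.numel() - num_prune).values
    with torch.no_grad():
        for n, p in model.named_parameters():
            p.mul_((flip_counts[n] <= threshold).float())
```

In a full training loop, `update_flip_counts` would run after every optimizer step and `prune_by_flips` would be invoked once (or gradually) to reach the user-specified sparsity target, which mirrors the abstract's claim that the method can prune before convergence and target a desired sparsity directly.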