Paper Title


NeuroFabric: Identifying Ideal Topologies for Training A Priori Sparse Networks

Paper Authors

Mihailo Isakov, Michel A. Kinsy

Paper Abstract


Long training times of deep neural networks are a bottleneck in machine learning research. The major impediment to fast training is the quadratic growth of both the memory and compute requirements of dense and convolutional layers with respect to their information bandwidth. Recently, training 'a priori' sparse networks has been proposed as a method for allowing layers to retain high information bandwidth while keeping memory and compute costs low. However, the choice of which sparse topology should be used in these networks is unclear. In this work, we provide a theoretical foundation for the choice of intra-layer topology. First, we derive a new sparse neural network initialization scheme that allows us to explore the space of very deep sparse networks. Next, we evaluate several topologies and show that seemingly similar topologies can often differ significantly in attainable accuracy. To explain these differences, we develop a data-free heuristic that can evaluate a topology independently of the dataset the network will be trained on. We then derive a set of requirements that a good topology must satisfy, and arrive at a single topology that satisfies all of them.
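To make the setup concrete, below is a minimal NumPy sketch of an a priori sparse layer: a binary connectivity mask fixed before training, with each unit's weights scaled by its actual fan-in under that mask. The uniformly random mask and the He-style fan-in scaling are illustrative assumptions only; the paper derives its own initialization scheme and argues that the mask's topology, not just its density, determines attainable accuracy.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_sparse_linear(n_in, n_out, density=0.1, rng=rng):
    """Create a fixed (a priori) sparse linear layer: a weight matrix
    plus a binary connectivity mask chosen before training begins.

    The mask here is sampled uniformly at random; the paper shows that
    the choice of topology matters, so treat this as a naive baseline.
    """
    mask = (rng.random((n_out, n_in)) < density).astype(np.float64)
    # Scale each unit's weights by its actual fan-in under the mask,
    # a sparsity-aware analogue of He initialization (hypothetical:
    # the paper's own initialization scheme is not reproduced here).
    fan_in = np.maximum(mask.sum(axis=1, keepdims=True), 1.0)
    weights = rng.standard_normal((n_out, n_in)) * np.sqrt(2.0 / fan_in)
    return weights * mask, mask

def forward(x, weights):
    # Dense matmul shown for clarity; a real implementation would use
    # a sparse kernel so compute scales with edge count, not width^2.
    return np.maximum(weights @ x, 0.0)  # ReLU activation

if __name__ == "__main__":
    # A dense 1024x1024 layer stores 1024**2 ~= 1.05M weights,
    # quadratic in its width (information bandwidth); at 10% density
    # the same bandwidth costs only ~105K edges.
    w, mask = make_sparse_linear(1024, 1024, density=0.1)
    y = forward(rng.standard_normal(1024), w)
    print(f"edges: {int(mask.sum())} / {mask.size}, output dim: {y.size}")
```

Because the mask never changes, gradients flow only through the surviving edges, so both memory and compute grow with the number of edges rather than quadratically with layer width.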
