Paper Title
An Analysis of Alternating Direction Method of Multipliers for Feed-forward Neural Networks
Paper Authors
Paper Abstract
In this work, we present a hardware-compatible neural network training algorithm that combines the alternating direction method of multipliers (ADMM) with iterative least-squares methods. The motivation behind this approach is to devise a training method for neural networks that is scalable and can be parallelised; these characteristics make the algorithm suitable for hardware implementation. We achieve 6.9\% and 6.8\% better accuracy than SGD and Adam, respectively, with a four-layer neural network with a hidden size of 28 on the HIGGS dataset. Likewise, we observe 21.0\% and 2.2\% accuracy improvements over SGD and Adam, respectively, on the IRIS dataset with a three-layer neural network with a hidden size of 8. At the same time, the method avoids matrix inversion, which is challenging to implement in hardware. We assessed the impact of avoiding matrix inversion on ADMM accuracy and observed that matrix inversion can safely be replaced with iterative least-squares methods while maintaining the desired performance. Moreover, the computational complexity of the implemented method is polynomial in the dimensions of the input dataset and the hidden size of the network.
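To make the central idea concrete, below is a minimal sketch (not the authors' implementation) of an ADMM-style layer weight update in which the closed-form solution requiring an explicit matrix inverse is replaced by an iterative least-squares solve. The conjugate-gradient routine, the ridge parameter rho, and the names weight_update_direct / weight_update_cgls are illustrative assumptions; the paper's iterative least-squares method may differ in detail.

```python
import numpy as np

def weight_update_direct(A, Z, rho=1e-3):
    # Closed-form ridge-regularised least-squares update for one layer:
    #   W = argmin_W ||W A - Z||_F^2 + rho ||W||_F^2
    # Requires an explicit matrix inverse, which is costly in hardware.
    return Z @ A.T @ np.linalg.inv(A @ A.T + rho * np.eye(A.shape[0]))

def weight_update_cgls(A, Z, rho=1e-3, iters=50):
    # Iterative alternative (illustrative): conjugate gradient on the
    # normal equations (A A^T + rho I) W^T = A Z^T, with no explicit inverse.
    M = A @ A.T + rho * np.eye(A.shape[0])
    B = A @ Z.T
    X = np.zeros_like(B)
    R = B - M @ X          # residual, one column per output unit
    P = R.copy()
    for _ in range(iters):
        MP = M @ P
        alpha = np.sum(R * R, axis=0) / (np.sum(P * MP, axis=0) + 1e-12)
        X = X + P * alpha
        R_new = R - MP * alpha
        beta = np.sum(R_new * R_new, axis=0) / (np.sum(R * R, axis=0) + 1e-12)
        P = R_new + P * beta
        R = R_new
    return X.T

# Tiny usage check: both updates should give (nearly) the same weights.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 100))   # layer activations (hidden size 8, 100 samples)
Z = rng.standard_normal((3, 100))   # ADMM auxiliary targets for the next layer
W_direct = weight_update_direct(A, Z)
W_iter = weight_update_cgls(A, Z)
print(np.max(np.abs(W_direct - W_iter)))  # should be small
```

Per the abstract, the iterative variant is the hardware-friendly choice: its cost per iteration is a handful of matrix-vector products, polynomial in the dataset dimensions and hidden size, and it requires no matrix-inversion unit.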