Paper Title
An FPGA Accelerated Method for Training Feed-forward Neural Networks Using Alternating Direction Method of Multipliers and LSMR
Paper Authors
Paper Abstract
In this project, we successfully designed, implemented, deployed and tested a novel FPGA-accelerated algorithm for neural network training. The algorithm itself was developed in an independent study option. The training method is based on the Alternating Direction Method of Multipliers (ADMM), which has strong parallel characteristics and, by employing LSMR, avoids procedures such as matrix inversion that are problematic in hardware designs. As an intermediate stage, we fully implemented the ADMM-LSMR method in C for feed-forward neural networks with a flexible number of layers and hidden sizes. We demonstrated that the method can operate with fixed-point arithmetic without compromising accuracy. Next, we devised an FPGA-accelerated version of the algorithm using the Intel FPGA SDK for OpenCL and performed extensive optimisation stages, followed by successful deployment of the program on an Intel Arria 10 GX FPGA. The FPGA-accelerated program showed up to a 6x speed-up compared to an equivalent CPU implementation while achieving promising accuracy.
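As a rough sketch of the idea the abstract highlights (not the authors' C/OpenCL implementation), the per-layer ADMM weight-update subproblem reduces to a least-squares solve, and LSMR solves it iteratively using only matrix-vector products, so no explicit inverse of A^T A is ever formed. A minimal Python illustration using SciPy, with hypothetical shapes standing in for a layer's activation matrix and targets:

```python
import numpy as np
from scipy.sparse.linalg import lsmr

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 10))  # hypothetical layer activations
b = rng.standard_normal(100)        # hypothetical regression targets

# LSMR minimises ||Ax - b||_2 iteratively, touching A only through
# A @ v and A.T @ u products -- it never forms or inverts A.T @ A,
# which is the hardware-unfriendly step the abstract says is avoided.
x = lsmr(A, b)[0]

# Sanity check against a direct least-squares solve.
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)
```

The same inversion-free property is what makes the update amenable to a fixed-point FPGA datapath: each iteration is dominated by multiply-accumulate work rather than a factorisation.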