Paper Title

On the Resilience of Deep Learning for Reduced-voltage FPGAs

Authors

Kamyar Givaki, Behzad Salami, Reza Hojabr, S. M. Reza Tayaranian, Ahmad Khonsari, Dara Rahmati, Saeid Gorgin, Adrian Cristal, Osman S. Unsal

Abstract

Deep Neural Networks (DNNs) are inherently computation-intensive and power-hungry. Hardware accelerators such as Field-Programmable Gate Arrays (FPGAs) are a promising solution that can satisfy these requirements for both embedded and High-Performance Computing (HPC) systems. In FPGAs, as in CPUs and GPUs, aggressive voltage scaling below the nominal level is an effective technique for minimizing power dissipation. Unfortunately, as the voltage is scaled down toward the transistor threshold, bit-flip faults start to appear due to timing issues, creating a resilience problem. This paper experimentally evaluates the resilience of the training phase of DNNs in the presence of voltage-underscaling-related faults in FPGAs, especially in on-chip memories. Toward this goal, we have experimentally evaluated the resilience of LeNet-5 and of a specially designed network for the CIFAR-10 dataset, each with two different activation functions: the Rectified Linear Unit (ReLU) and the Hyperbolic Tangent (Tanh). We have found that modern FPGAs are robust enough at extremely low voltage levels and that low-voltage-related faults can be automatically masked within the training iterations, so there is no need for costly software- or hardware-oriented fault-mitigation techniques such as ECC. Approximately 10% more training iterations are needed to close the accuracy gap. This observation is the result of the relatively low rate of undervolting faults, i.e., <0.1%, measured on real FPGA fabrics. We have also increased the fault rate significantly for the LeNet-5 network through randomly generated fault-injection campaigns and observed that the training accuracy starts to degrade. As the fault rate increases, the network with the Tanh activation function outperforms the one with ReLU in terms of accuracy; e.g., at a 30% fault rate the accuracy difference is 4.92%.
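
The randomly generated fault-injection campaigns mentioned in the abstract can be approximated in software by flipping random bits in the stored network parameters. Below is a minimal sketch, assuming single-bit flips uniformly distributed over float32 weight words at a given fault rate; the function name `inject_bit_flips` and the NumPy-based fault model are illustrative assumptions, not the authors' actual FPGA measurement setup or fault distribution.

```python
import numpy as np

def inject_bit_flips(weights, fault_rate, rng=None):
    """Hypothetical fault model: flip one random bit in a `fault_rate`
    fraction of float32 weight words, emulating undervolting-induced
    bit-flip faults in on-chip memory."""
    rng = np.random.default_rng() if rng is None else rng
    flat = weights.astype(np.float32).ravel()   # astype copies, so the caller's array is untouched
    n_faulty = int(fault_rate * flat.size)
    # Pick distinct words to corrupt, with one random bit position each.
    idx = rng.choice(flat.size, size=n_faulty, replace=False)
    positions = rng.integers(0, 32, size=n_faulty, dtype=np.uint32)
    bits = flat.view(np.uint32)                 # reinterpret floats as raw 32-bit words
    bits[idx] ^= np.uint32(1) << positions      # flip one bit per selected word
    return flat.reshape(weights.shape)

# Example: corrupt 30% of the weights, the highest fault rate reported above.
w = np.random.randn(6, 5, 5).astype(np.float32)   # e.g., a LeNet-5 conv kernel
w_faulty = inject_bit_flips(w, fault_rate=0.30)
```

In an experiment like the one described, such corruption would be applied between training iterations, which is how a fault rate of 30% could be emulated for the Tanh-vs-ReLU accuracy comparison.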
