Paper Title
enpheeph: A Fault Injection Framework for Spiking and Compressed Deep Neural Networks
Paper Authors
Paper Abstract
Research on Deep Neural Networks (DNNs) has focused on improving performance and accuracy for real-world deployments, leading to new models, such as Spiking Neural Networks (SNNs), and to optimization techniques, e.g., quantization and pruning for compressed networks. However, deploying these innovative models and optimization techniques introduces possible reliability issues, and reliability is a pillar for the wide adoption of DNNs in safety-critical applications, e.g., autonomous driving. Moreover, scaling technology nodes carry the associated risk of multiple faults occurring at the same time, a possibility not addressed in state-of-the-art resiliency analyses. Towards better reliability analysis for DNNs, we present enpheeph, a Fault Injection Framework for Spiking and Compressed DNNs. The enpheeph framework enables optimized execution on specialized hardware devices, e.g., GPUs, while providing complete customizability to investigate different fault models, emulating various reliability constraints and use cases. Hence, faults can be injected into SNNs as well as compressed networks with minimal-to-no modifications to the underlying code, a feat not achievable with other state-of-the-art tools. To evaluate our enpheeph framework, we analyze the resiliency of different DNN and SNN models under different compression techniques. By injecting a random and increasing number of faults, we show that DNNs can suffer an accuracy drop higher than 40% at a fault rate as low as 7 × 10⁻⁷ faults per parameter. The run-time overhead of enpheeph is less than 20% of the baseline execution time when injecting 100 000 faults concurrently, at least 10× lower than state-of-the-art frameworks, making enpheeph future-proof for complex fault injection scenarios. We release enpheeph at https://github.com/Alexei95/enpheeph.
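For scale, a fault rate of 7 × 10⁻⁷ faults per parameter corresponds to roughly 70 simultaneous faults in a model with 10⁸ parameters, so an accuracy drop of this size can follow from a handful of well-placed bit flips. The sketch below shows the general idea behind this kind of fault model: a single-bit flip injected into a network weight in PyTorch. It is a minimal, hypothetical illustration of the technique, not enpheeph's actual API; the names flip_bit and inject_weight_fault are assumptions made for this example (see the repository for the real interface).

# Minimal, hypothetical sketch of single-bit weight fault injection in
# PyTorch -- illustrative of the general technique, NOT the enpheeph API.
import struct

import torch
import torch.nn as nn


def flip_bit(value: float, bit: int) -> float:
    # Flip one bit of the IEEE-754 float32 representation of `value`.
    (as_int,) = struct.unpack("I", struct.pack("f", value))
    as_int ^= 1 << bit
    (flipped,) = struct.unpack("f", struct.pack("I", as_int))
    return flipped


def inject_weight_fault(module: nn.Module, index: tuple, bit: int) -> None:
    # Corrupt one weight in place, emulating a transient or stuck-at fault.
    with torch.no_grad():
        module.weight[index] = flip_bit(float(module.weight[index]), bit)


# Usage: flip a high exponent bit (bit 30) of one weight in a toy layer;
# the next forward pass then runs on the faulty parameters.
layer = nn.Linear(4, 2)
inject_weight_fault(layer, (0, 0), bit=30)
out = layer(torch.randn(1, 4))

Injecting faults through the parameter tensors, rather than by editing the model's code, is what makes it possible for a framework of this kind to target quantized or pruned networks with minimal changes to the user's code.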