Paper Title
STONNE: A Detailed Architectural Simulator for Flexible Neural Network Accelerators
Paper Authors
Paper Abstract
The design of specialized architectures for accelerating the inference procedure of Deep Neural Networks (DNNs) is a booming area of research nowadays. First-generation rigid proposals have been rapidly replaced by more advanced flexible accelerator architectures able to efficiently support a variety of layer types and dimensions. As the complexity of the designs grows, it is increasingly appealing for researchers to have cycle-accurate simulation tools at their disposal to allow for fast and accurate design-space exploration, and rapid quantification of the efficacy of architectural enhancements during the early stages of a design. To this end, we present STONNE (Simulation TOol of Neural Network Engines), a cycle-accurate, highly modular, and highly extensible simulation framework that enables end-to-end evaluation of flexible accelerator architectures running complete contemporary DNN models. We use STONNE to model the recently proposed MAERI architecture and show how it can closely approach the performance results of the publicly available BSV-coded MAERI implementation. Then, we conduct a comprehensive evaluation and demonstrate that the folding strategy implemented for MAERI results in very low compute unit utilization (25% on average across 5 DNN models), which ultimately translates into poor performance.