Title


Dataflow-Architecture Co-Design for 2.5D DNN Accelerators using Wireless Network-on-Package

Authors

Robert Guirado, Hyoukjun Kwon, Sergi Abadal, Eduard Alarcón, Tushar Krishna

Abstract


Deep neural network (DNN) models continue to grow in size and complexity, demanding higher computational power to enable real-time inference. To efficiently deliver such computational demands, hardware accelerators are being developed and deployed across scales. This naturally requires an efficient scale-out mechanism for increasing compute density as required by the application. 2.5D integration over interposer has emerged as a promising solution, but as we show in this work, the limited interposer bandwidth and multiple hops in the Network-on-Package (NoP) can diminish the benefits of the approach. To cope with this challenge, we propose WIENNA, a wireless NoP-based 2.5D DNN accelerator. In WIENNA, the wireless NoP connects an array of DNN accelerator chiplets to the global buffer chiplet, providing high-bandwidth multicasting capabilities. Here, we also identify the dataflow style that most efficiently exploits the wireless NoP's high-bandwidth multicasting capability on each layer. With modest area and power overheads, WIENNA achieves 2.2X--5.1X higher throughput and 38.2% lower energy than an interposer-based NoP design.
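To illustrate why single-hop wireless multicast can beat a wired interposer NoP, here is a toy first-order cost model (not from the paper: the mesh topology, link widths, and radio bandwidth are all illustrative assumptions). A wired mesh must serialize a multicast into per-chiplet unicasts that each cross several hops, while a wireless broadcast reaches every chiplet at once, so its cost is independent of chiplet count.

```python
def interposer_multicast_cycles(tile_bits, n_chiplets, link_bw_bits=128):
    """Wired NoP (assumed mesh): the multicast is serialized into one unicast
    per chiplet, each traversing ~sqrt(n_chiplets) hops at one flit/cycle/link."""
    avg_hops = int(n_chiplets ** 0.5)
    flits = -(-tile_bits // link_bw_bits)  # ceil division
    return n_chiplets * flits * avg_hops

def wireless_multicast_cycles(tile_bits, n_chiplets, radio_bw_bits=64):
    """Wireless NoP: one broadcast reaches all chiplets in a single 'hop',
    so the cycle count does not grow with n_chiplets."""
    return -(-tile_bits // radio_bw_bits)

# Distribute a 16 KiB weight tile to growing chiplet arrays.
tile = 16 * 1024 * 8  # bits
for n in (4, 16, 36):
    wired = interposer_multicast_cycles(tile, n)
    wireless = wireless_multicast_cycles(tile, n)
    print(f"{n:2d} chiplets: wired={wired:7d} cycles, wireless={wireless:5d} cycles")
```

Even with the wired link assumed twice as wide as the radio, the wired cost scales with both chiplet count and hop distance, while the broadcast cost stays flat — the gap that WIENNA's multicast-friendly dataflows are designed to exploit.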
