Paper Title

GRIP: A Graph Neural Network Accelerator Architecture

Paper Authors

Kiningham, Kevin, Re, Christopher, Levis, Philip

Paper Abstract

We present GRIP, a graph neural network accelerator architecture designed for low-latency inference. Accelerating GNNs is challenging because they combine two distinct types of computation: arithmetic-intensive vertex-centric operations and memory-intensive edge-centric operations. GRIP splits GNN inference into a fixed set of edge- and vertex-centric execution phases that can be implemented in hardware. We then specialize each unit for the unique computational structure found in each phase. For vertex-centric phases, GRIP uses a high-performance matrix multiply engine coupled with a dedicated memory subsystem for weights to improve reuse. For edge-centric phases, GRIP uses multiple parallel prefetch and reduction engines to alleviate the irregularity in memory accesses. Finally, GRIP supports several GNN optimizations, including a novel optimization called vertex-tiling, which increases the reuse of weight data. We evaluate GRIP by performing synthesis and place and route for a 28nm implementation capable of executing inference for several widely-used GNN models (GCN, GraphSAGE, G-GCN, and GIN). Across several benchmark graphs, it reduces 99th percentile latency by a geometric mean of 17x and 23x compared to a CPU and GPU baseline, respectively, while drawing only 5W.
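To make the edge-/vertex-centric split concrete, the following is a minimal pure-Python sketch of one message-passing GNN layer factored into the two phase types the abstract describes. This is an illustrative assumption about the computation pattern, not GRIP's actual hardware pipeline; all names (`edge_phase`, `vertex_phase`) and the toy graph are hypothetical.

```python
def edge_phase(features, edges, num_vertices):
    """Edge-centric phase: gather each neighbor's feature vector and
    reduce (here, sum) into the destination vertex. The per-edge access
    pattern is irregular, which is why this phase is memory-bound."""
    dim = len(features[0])
    accum = [[0.0] * dim for _ in range(num_vertices)]
    for src, dst in edges:          # irregular, data-dependent accesses
        for k in range(dim):
            accum[dst][k] += features[src][k]
    return accum

def vertex_phase(accum, weights):
    """Vertex-centric phase: a dense matrix multiply of the reduced
    features by a weight matrix. Regular and arithmetic-intensive, so
    it maps naturally onto a matrix multiply engine."""
    rows, cols = len(weights), len(weights[0])
    return [[sum(vec[i] * weights[i][j] for i in range(rows))
             for j in range(cols)]
            for vec in accum]

# Toy graph: 3 vertices, edges 0->1, 2->1, 1->0.
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
edges = [(0, 1), (2, 1), (1, 0)]
weights = [[1.0, 0.0], [0.0, 1.0]]  # identity, to make the result easy to check

h = vertex_phase(edge_phase(features, edges, 3), weights)
# h == [[3.0, 4.0], [6.0, 8.0], [0.0, 0.0]]
```

Because the two phases touch memory so differently (irregular per-edge gathers versus a dense, reusable weight matrix), specializing separate hardware units for each, as GRIP does, is the natural design choice.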
