Exploring Memory Access Patterns for Graph Processing Accelerators

Paper Title

Exploring Memory Access Patterns for Graph Processing Accelerators

Authors

Jonas Dann, Daniel Ritter, Holger Fröning

Abstract

Recent trends in business and technology (e.g., machine learning, social network analysis) benefit from storing and processing growing amounts of graph-structured data in databases and data science platforms. FPGAs, used as graph processing accelerators with a customizable memory hierarchy, promise to solve the performance problems that inherently irregular memory access patterns cause on traditional hardware (e.g., CPUs). However, developing such hardware accelerators is still time-consuming and difficult, and benchmarking is not standardized, which hinders both comprehension of the impact of memory access pattern changes and systematic engineering of graph processing accelerators. In this work, we propose a simulation environment for analyzing graph processing accelerators by simulating their memory access patterns. Further, we evaluate our approach on two state-of-the-art FPGA graph processing accelerators and show reproducibility, comparability, and a shortened development process by example. Not implementing the cycle-accurate internal data flow on accelerator hardware such as FPGAs significantly reduces implementation time, increases benchmark parameter transparency, and allows comparison of graph processing approaches.
