Paper Title
Design Space Exploration of Algorithmic Multi-Port Memories in High-Performance Application-Specific Accelerators
Paper Authors
Paper Abstract
Memory load/store instructions account for a significant share of execution time and energy consumption in domain-specific accelerators. To design highly parallel systems, the available parallelism is extracted from the workloads at each granularity. Exploiting that parallelism fully in these high-performance designs requires multi-port memories. Currently, true multi-port designs are less popular because EDA tools have no inherent support for memories with more than two ports; using more ports requires a circuit-level implementation and hence a long design time. In this work, we present a framework for Design Space Exploration of Algorithmic Multi-Port Memories (AMMs) in ASICs. We study different AMM designs from the literature and discuss how we incorporate them into the pre-RTL Aladdin framework with different memory depths, port configurations, and banking structures. From our analysis of selected applications from MachSuite (an accelerator benchmark suite), we understand and quantify the potential of AMMs (used as true multi-port memories) for high performance in applications with low spatial locality in their memory access patterns.
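To make the contrast between banked and multi-port memories concrete, the following is a minimal sketch of an idealized cycle-count model. It is not the paper's framework and does not use Aladdin. The function names, the cyclic banking rule (`bank = address % num_banks`), and the assumption that conflicting requests simply serialize are all illustrative assumptions. The model also ignores the area, energy, and latency overheads that a real AMM would add.

```python
from collections import Counter


def cycles_banked(trace, num_banks, group):
    """Cycles to serve `trace` on a cyclically banked memory (hypothetical model).

    `group` requests are issued together; each bank has one port, so requests
    that map to the same bank (address % num_banks) are served one per cycle.
    """
    total = 0
    for i in range(0, len(trace), group):
        batch = trace[i:i + group]
        per_bank = Counter(addr % num_banks for addr in batch)
        total += max(per_bank.values())  # the busiest bank sets the batch latency
    return total


def cycles_multiport(trace, num_ports):
    """Cycles on an idealized memory that serves any `num_ports` addresses per cycle."""
    return -(-len(trace) // num_ports)  # ceiling division


# Two illustrative traces of 16 accesses each (assumed, not taken from MachSuite).
sequential = list(range(16))          # neighbouring addresses spread across banks
strided = [4 * i for i in range(16)]  # every address maps to bank 0 when num_banks = 4

banked_seq = cycles_banked(sequential, num_banks=4, group=4)
banked_str = cycles_banked(strided, num_banks=4, group=4)
ported_seq = cycles_multiport(sequential, num_ports=4)
ported_str = cycles_multiport(strided, num_ports=4)
```

Under these assumptions, the banked memory matches the 4-port memory on the sequential trace but needs four times as many cycles on the strided trace, where every request in a batch lands in the same bank. This is the kind of access pattern for which a multi-port memory would be expected to help.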