Paper Title
PartitionPIM: Practical Memristive Partitions for Fast Processing-in-Memory
Paper Authors
Abstract
Digital memristive processing-in-memory overcomes the memory wall through a fundamental storage device capable of stateful logic within crossbar arrays. Dynamically dividing the crossbar arrays by adding memristive partitions further increases parallelism, thereby overcoming an inherent trade-off in memristive processing-in-memory. The algorithmic topology of partitions is unique, and was recently exploited to accelerate multiplication (11x with 32 partitions) and sorting (14x with 16 partitions). Yet the physical implementation of memristive partitions, including the peripheral decoders and the control message, has never been considered, and its overhead may render partitions impractical. This paper overcomes that challenge with several novel techniques, presenting efficient practical designs of memristive partitions. We begin by formalizing the algorithmic properties of memristive partitions into serial, parallel, and semi-parallel operations. Peripheral overhead is addressed via a novel technique of half-gates that enables efficient decoding with negligible overhead. Control overhead is addressed by carefully reducing the operation set of memristive partitions, at negligible performance cost, using techniques such as shared indices and pattern generators. Ultimately, these efficient practical solutions, combined with the vast algorithmic potential, may revolutionize digital memristive processing-in-memory.
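The distinction between serial, parallel, and semi-parallel operations can be illustrated with a toy functional model. The sketch below is not the paper's formalization; it rests on several assumptions introduced only for illustration: a crossbar row is a flat list of bits split into equal-width partitions, every gate is a two-input NOR (a common stateful-logic primitive), and a set of gates may share one cycle only if the partition ranges they touch do not overlap. All names (`Gate`, `classify`, `apply_nor`, `PART_SIZE`) are hypothetical.

```python
from typing import List, Tuple

# A gate is (input column A, input column B, output column) within one row.
Gate = Tuple[int, int, int]

PART_SIZE = 4  # assumed partition width in columns (illustrative only)


def span(gate: Gate, part_size: int) -> Tuple[int, int]:
    """Return the (lowest, highest) partition index the gate touches."""
    parts = [col // part_size for col in gate]
    return min(parts), max(parts)


def classify(gates: List[Gate], part_size: int) -> str:
    """Label a set of gates meant to run in one cycle.

    Assumed rule: gates conflict if their partition ranges overlap.
    """
    spans = sorted(span(g, part_size) for g in gates)
    # With ranges sorted by start, checking neighbours finds any overlap.
    for (_, prev_hi), (next_lo, _) in zip(spans, spans[1:]):
        if next_lo <= prev_hi:
            return "invalid"
    if len(gates) == 1:
        return "serial"          # one gate, free to span the whole row
    if all(lo == hi for lo, hi in spans):
        return "parallel"        # every gate stays inside its own partition
    return "semi-parallel"       # some gates cross partition boundaries


def apply_nor(row: List[int], gates: List[Gate]) -> None:
    """Functionally apply each NOR gate to the row (timing is not modelled)."""
    for a, b, out in gates:
        row[out] = int(not (row[a] or row[b]))


# Four partitions of four columns each (16 columns in total).
parallel_op = [(4 * p, 4 * p + 1, 4 * p + 2) for p in range(4)]
serial_op = [(0, 5, 14)]
semi_parallel_op = [(0, 1, 6), (8, 9, 14)]
conflicting_op = [(0, 1, 6), (5, 9, 10)]

for op in (parallel_op, serial_op, semi_parallel_op, conflicting_op):
    print(classify(op, PART_SIZE))
```

Under these assumptions, the four in-partition NOR gates of `parallel_op` share a single cycle, whereas a row without partitions would need one cycle per gate. This is the kind of parallelism gain the abstract attributes to partitions.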