Crescent: Taming Memory Irregularities for Accelerating Deep Point Cloud Analytics

Paper Title

Crescent: Taming Memory Irregularities for Accelerating Deep Point Cloud Analytics

Authors

Yu Feng, Gunnar Hammonds, Yiming Gan, Yuhao Zhu

Abstract

3D perception in point clouds is transforming the perception ability of future intelligent machines. Point cloud algorithms, however, are plagued by irregular memory accesses, leading to massive inefficiencies in the memory sub-system, which bottlenecks the overall efficiency. This paper proposes Crescent, an algorithm-hardware co-design system that tames the irregularities in deep point cloud analytics while achieving high accuracy. To that end, we introduce two approximation techniques, approximate neighbor search and selective bank conflict elision, that "regularize" the DRAM and SRAM memory accesses. Doing so, however, necessarily introduces accuracy loss, which we mitigate with a new network training procedure that integrates approximation into the network training process. In essence, our training procedure trains models that are conditioned upon a specific approximation setting and thus retain high accuracy. Experiments show that Crescent doubles the performance and halves the energy consumption compared to an optimized baseline accelerator, with < 1% accuracy loss. The code of our paper is available at: https://github.com/horizon-research/crescent.
