Paper Title

Learnable Mixed-precision and Dimension Reduction Co-design for Low-storage Activation

Authors

Tai, Yu-Shan; Chang, Cheng-Yang; Teng, Chieh-Fang; Wu, An-Yeu

Abstract

Recently, deep convolutional neural networks (CNNs) have achieved many eye-catching results. However, deploying CNNs on resource-constrained edge devices is hindered by the limited memory bandwidth available for transferring the large intermediate data (i.e., activations) during inference. Existing research utilizes mixed-precision and dimension reduction to reduce computational complexity, but pays less attention to their application to activation compression. To further exploit the redundancy in activations, we propose a learnable mixed-precision and dimension reduction co-design system, which separates channels into groups and allocates a specific compression policy to each group according to its importance. In addition, the proposed dynamic searching technique enlarges the search space and automatically finds the optimal bit-width allocation. Our experimental results show that the proposed methods improve accuracy by 3.54%/1.27% and save 0.18/2.02 bits per value over existing mixed-precision methods on ResNet18 and MobileNetV2, respectively.
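The abstract describes the core idea of grouping activation channels by importance and assigning each group its own bit-width. The sketch below is only a rough Python/PyTorch illustration of that general idea, under the assumption of a simple importance proxy (mean absolute activation) and fixed per-group bit-widths; it does not implement the paper's learnable policy search or its dimension-reduction branch, and all function names and the heuristic are hypothetical.

import torch

def uniform_quantize(x, num_bits):
    # Uniform affine quantization to `num_bits`, returned in dequantized (float) form.
    qmax = 2 ** num_bits - 1
    x_min, x_max = x.min(), x.max()
    scale = (x_max - x_min).clamp(min=1e-8) / qmax
    q = torch.round((x - x_min) / scale).clamp(0, qmax)
    return q * scale + x_min

def grouped_activation_compression(act, group_bits=(8, 4, 2)):
    # act: feature map of shape (N, C, H, W).
    # Rank channels by mean absolute activation (an assumed importance proxy),
    # split them into len(group_bits) groups, and quantize the most important
    # group with the highest bit-width.
    importance = act.abs().mean(dim=(0, 2, 3))
    order = torch.argsort(importance, descending=True)
    groups = torch.chunk(order, len(group_bits))
    out = torch.empty_like(act)
    for channel_idx, bits in zip(groups, group_bits):
        out[:, channel_idx] = uniform_quantize(act[:, channel_idx], bits)
    return out

if __name__ == "__main__":
    fmap = torch.relu(torch.randn(1, 64, 14, 14))  # fake post-ReLU activation
    compressed = grouped_activation_compression(fmap)
    print("mean reconstruction error:", (fmap - compressed).abs().mean().item())

In the paper, the group assignment and bit-width allocation are learned rather than fixed as above; the sketch only makes the storage trade-off concrete: important channels keep more bits, unimportant ones are compressed more aggressively.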
