Paper Title

HMQ: Hardware Friendly Mixed Precision Quantization Block for CNNs

Authors

Hai Victor Habi, Roy H. Jennings, Arnon Netzer

Abstract

Recent work in network quantization produced state-of-the-art results using mixed precision quantization. An imperative requirement for many efficient edge device hardware implementations is that their quantizers are uniform and with power-of-two thresholds. In this work, we introduce the Hardware Friendly Mixed Precision Quantization Block (HMQ) in order to meet this requirement. The HMQ is a mixed precision quantization block that repurposes the Gumbel-Softmax estimator into a smooth estimator of a pair of quantization parameters, namely, bit-width and threshold. HMQs use this to search over a finite space of quantization schemes. Empirically, we apply HMQs to quantize classification models trained on CIFAR10 and ImageNet. For ImageNet, we quantize four different architectures and show that, in spite of the added restrictions to our quantization scheme, we achieve competitive and, in some cases, state-of-the-art results.
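
To make the two constraints in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a symmetric signed uniform quantizer whose threshold is 2 raised to an integer exponent, and it relaxes the choice among candidate (bit-width, threshold) pairs with Gumbel-Softmax weights. The function names, the candidate list, and the way the candidates are blended are all illustrative assumptions; the exact HMQ formulation in the paper may differ.

```python
import math
import random


def uniform_quantize(x, bits, threshold_exp):
    """Symmetric uniform quantizer with threshold t = 2**threshold_exp (assumed form)."""
    t = 2.0 ** threshold_exp
    levels = 2 ** (bits - 1)
    step = t / levels                      # a power of two, since t and levels both are
    q = round(x / step)                    # nearest integer grid index
    q = max(-levels, min(levels - 1, q))   # clip to the signed integer range
    return q * step


def gumbel_softmax(logits, temperature, rng):
    """Draw one relaxed one-hot sample over the candidates."""
    noisy = []
    for logit in logits:
        # Clamp the uniform sample away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)
    peak = max(noisy)                      # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def soft_quantize(x, candidates, logits, temperature, rng):
    """Blend the outputs of all candidate quantizers with Gumbel-Softmax weights."""
    weights = gumbel_softmax(logits, temperature, rng)
    return sum(w * uniform_quantize(x, bits, exp)
               for w, (bits, exp) in zip(weights, candidates))


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical search space: (bit-width, threshold exponent) pairs.
    candidates = [(2, 0), (4, 0), (8, 0)]
    logits = [0.0, 0.0, 0.0]               # these would be learned in training
    print(soft_quantize(0.3, candidates, logits, temperature=1.0, rng=rng))
```

In this sketch, lowering the temperature pushes the weights toward a one-hot vector, so the blended output approaches that of a single uniform quantizer with a power-of-two threshold, which is the hardware-friendly form the abstract asks for.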
