Paper Title
Widening and Squeezing: Towards Accurate and Efficient QNNs
Paper Authors
Paper Abstract
Quantized neural networks (QNNs) are very attractive to industry because of their extremely cheap computation and storage overhead, but their performance is still worse than that of networks with full-precision parameters. Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques. However, we find through experiments that the representation capability of quantized features is far weaker than that of full-precision features. We address this problem by projecting features in the original full-precision networks onto high-dimensional quantized features. Simultaneously, redundant quantized features are eliminated to avoid unrestricted growth of the feature dimensionality on some datasets. A compact quantized neural network with sufficient representation capability can then be established. Experimental results on benchmark datasets demonstrate that the proposed method is able to establish QNNs with far fewer parameters and computations, yet almost the same performance as full-precision baseline models, e.g., a $29.9\%$ top-1 error for a binary ResNet-18 on the ImageNet ILSVRC 2012 dataset.
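To make the widen-then-squeeze idea concrete, below is a minimal PyTorch sketch: a convolution widens the channel dimension, the widened features are binarized with a straight-through estimator, and a 1x1 convolution squeezes them back to a compact dimension. `WidenSqueezeBlock`, the `expansion` factor, and the STE binarizer are illustrative assumptions rather than the authors' exact architecture; in particular, the paper eliminates redundant quantized channels, which the fixed 1x1 squeeze here only approximates.

```python
import torch
import torch.nn as nn


class BinarizeSTE(torch.autograd.Function):
    """Sign binarization with a straight-through gradient estimator."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Pass gradients through only where |x| <= 1 (hard-tanh clip).
        return grad_output * (x.abs() <= 1).float()


class WidenSqueezeBlock(nn.Module):
    """Illustrative sketch (not the paper's exact block): widen the
    channel dimension by `expansion`, binarize the widened features,
    then squeeze back to a compact dimension with a 1x1 convolution."""

    def __init__(self, in_ch, out_ch, expansion=4):
        super().__init__()
        wide_ch = in_ch * expansion
        self.widen = nn.Conv2d(in_ch, wide_ch, kernel_size=3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(wide_ch)
        self.squeeze = nn.Conv2d(wide_ch, out_ch, kernel_size=1, bias=False)

    def forward(self, x):
        x = self.bn(self.widen(x))
        x = BinarizeSTE.apply(x)  # quantize the widened, high-dimensional features
        return self.squeeze(x)    # project back to a compact representation


if __name__ == "__main__":
    block = WidenSqueezeBlock(in_ch=64, out_ch=64, expansion=4)
    y = block(torch.randn(1, 64, 32, 32))
    print(y.shape)  # torch.Size([1, 64, 32, 32])
```

The intent of the design is that binarization happens in the widened space, where the quantized features have enough capacity to match the full-precision ones, while the squeeze step keeps the block's output dimension, and hence downstream cost, compact.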