Paper Title
Attention-based Feature Compression for CNN Inference Offloading in Edge Computing
Paper Authors
Paper Abstract
This paper studies the computation offloading of CNN inference in device-edge co-inference systems. Inspired by the emerging paradigm of semantic communication, we propose a novel autoencoder-based CNN architecture (AECNN) for efficient feature extraction at the end device. We design a feature compression module based on the channel attention mechanism in CNNs, which compresses the intermediate data by selecting the most important feature channels. To further reduce communication overhead, we use entropy encoding to remove statistical redundancy in the compressed data. At the receiver, we design a lightweight decoder that reconstructs the intermediate data by learning from the received compressed data, thereby improving accuracy. To accelerate convergence, we use a step-by-step approach to train the neural networks, which are built on the ResNet-50 architecture. Experimental results show that AECNN can compress the intermediate data by more than 256x with only about 4% accuracy loss, outperforming the state-of-the-art work BottleNet++. Compared with offloading the inference task directly to the edge server, AECNN completes the inference task earlier, particularly under poor wireless channel conditions, which highlights its effectiveness in guaranteeing higher accuracy within a time constraint.
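The channel-attention-based selection described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's actual AECNN module: the shapes, the `keep_ratio` parameter, and the random projection standing in for trained gating weights are all assumptions made for the example.

```python
import numpy as np

def channel_attention_compress(features, keep_ratio=0.25, rng=None):
    """Sketch of attention-based channel selection for feature compression.

    features: array of shape (C, H, W), an intermediate CNN feature map.
    keep_ratio: fraction of channels to keep for transmission.
    Returns (kept_features, kept_channel_indices).
    """
    C = features.shape[0]
    # Global average pooling yields one descriptor per channel.
    descriptors = features.mean(axis=(1, 2))            # shape (C,)
    # A trained gating network would score channel importance; here a
    # fixed random projection + sigmoid is a stand-in for learned weights.
    rng = np.random.default_rng(0) if rng is None else rng
    w = rng.standard_normal((C, C))
    scores = 1.0 / (1.0 + np.exp(-(w @ descriptors)))   # sigmoid gate, (C,)
    # Keep the k highest-scoring channels; the rest are dropped.
    k = max(1, int(C * keep_ratio))
    kept = np.sort(np.argsort(scores)[-k:])
    return features[kept], kept
```

With a 64-channel feature map and `keep_ratio=0.25`, only 16 channels (plus their indices) would be entropy-coded and transmitted; the decoder at the edge server would then reconstruct the full map from them.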