Pruned Lightweight Encoders for Computer Vision

Paper Title

Pruned Lightweight Encoders for Computer Vision

Authors

Jakub Žádník, Markku Mäkitalo, Pekka Jääskeläinen

Abstract

Latency-critical computer vision systems, such as autonomous driving or drone control, require fast image or video compression when offloading neural network inference to a remote computer. To ensure low latency on a near-sensor edge device, we propose the use of lightweight encoders with constant bitrate and pruned encoding configurations, namely, ASTC and JPEG XS. Pruning introduces significant distortion which we show can be recovered by retraining the neural network with compressed data after decompression. Such an approach does not modify the network architecture or require coding format modifications. By retraining with compressed datasets, we reduced the classification accuracy and segmentation mean intersection over union (mIoU) degradation due to ASTC compression to 4.9-5.0 percentage points (pp) and 4.4-4.0 pp, respectively. With the same method, the mIoU lost due to JPEG XS compression at the main profile was restored to 2.7-2.3 pp. In terms of encoding speed, our ASTC encoder implementation is 2.3x faster than JPEG. Even though the JPEG XS reference encoder requires optimizations to reach low latency, we showed that disabling significance flag coding saves 22-23% of encoding time at the cost of 0.4-0.3 mIoU after retraining.
