Paper Title

Making DensePose fast and light

Paper Authors

Ruslan Rakhimov, Emil Bogomolov, Alexandr Notchenko, Fung Mao, Alexey Artemov, Denis Zorin, Evgeny Burnaev

Paper Abstract

The DensePose estimation task is a significant step toward enhancing user experience in computer vision applications ranging from augmented reality to cloth fitting. Existing neural network models capable of solving this task are heavily parameterized and a long way from being transferred to an embedded or mobile device. To enable DensePose inference on the end device with current models, one needs to support an expensive server-side infrastructure and have a stable internet connection. To make things worse, mobile and embedded devices do not always have a powerful GPU inside. In this work, we target the problem of redesigning the DensePose R-CNN model's architecture so that the final network retains most of its accuracy but becomes more lightweight and fast. To achieve that, we tested and incorporated many deep learning innovations from recent years, specifically performing an ablation study on 23 efficient backbone architectures, multiple two-stage detection pipeline modifications, and custom model quantization methods. As a result, we achieved a $17\times$ model size reduction and a $2\times$ latency improvement compared to the baseline model.
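The abstract mentions custom model quantization as one source of the size reduction, but does not detail the scheme. As a generic illustration of the underlying idea (not the paper's actual method), the sketch below shows symmetric per-tensor int8 weight quantization, where float32 weights are mapped to 8-bit integers plus a single scale factor, shrinking weight storage roughly 4x; all function names here are illustrative.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map float weights to [-127, 127].

    Returns the quantized integer values and the scale factor needed
    to recover approximate float values later.
    """
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [v * scale for v in quantized]

# Example: the largest-magnitude weight maps to +/-127; the reconstruction
# error per weight is bounded by half a quantization step (scale / 2).
q, scale = quantize_int8([0.5, -1.27, 0.0, 1.27])
recovered = dequantize(q, scale)
```

In practice, frameworks apply this per layer (or per channel) and also quantize activations; the paper's pipeline additionally combines quantization with backbone replacement to reach its reported size reduction.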
