EfficientHRNet: Efficient Scaling for Lightweight High-Resolution Multi-Person Pose Estimation

Paper Title

EfficientHRNet: Efficient Scaling for Lightweight High-Resolution Multi-Person Pose Estimation

Authors

Christopher Neff, Aneri Sheth, Steven Furgurson, Hamed Tabkhi

Abstract

There is an increasing demand for lightweight multi-person pose estimation for many emerging smart IoT applications. However, the existing algorithms tend to have large model sizes and intense computational requirements, making them ill-suited for real-time applications and deployment on resource-constrained hardware. Lightweight and real-time approaches are exceedingly rare and come at the cost of inferior accuracy. In this paper, we present EfficientHRNet, a family of lightweight multi-person human pose estimators that are able to perform in real-time on resource-constrained devices. By unifying recent advances in model scaling with high-resolution feature representations, EfficientHRNet creates highly accurate models while reducing computation enough to achieve real-time performance. The largest model is able to come within 4.4% accuracy of the current state-of-the-art, while having 1/3 the model size and 1/6 the computation, achieving 23 FPS on Nvidia Jetson Xavier. Compared to the top real-time approach, EfficientHRNet increases accuracy by 22% while achieving similar FPS with 1/3 the power. At every level, EfficientHRNet proves to be more computationally efficient than other bottom-up 2D human pose estimation approaches, while achieving highly competitive accuracy.
