Paper Title
Faster VoxelPose: Real-time 3D Human Pose Estimation by Orthographic Projection
Paper Authors
Paper Abstract
While voxel-based methods have achieved promising results for multi-person 3D pose estimation from multiple cameras, they suffer from a heavy computation burden, especially for large scenes. We present Faster VoxelPose to address this challenge by re-projecting the feature volume onto the three two-dimensional coordinate planes and estimating the X, Y, and Z coordinates from them separately. To that end, we first localize each person with a 3D bounding box by estimating a 2D box and its height based on the volume features projected onto the xy-plane and the z-axis, respectively. Then, for each person, we estimate partial joint coordinates from the three coordinate planes separately and fuse them to obtain the final 3D pose. The method is free from costly 3D-CNNs, runs ten times faster than VoxelPose, and achieves accuracy competitive with state-of-the-art methods, demonstrating its potential for real-time applications.
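To illustrate the orthographic-projection idea described in the abstract, here is a minimal sketch, not the authors' code: a 3D feature volume is collapsed onto the three coordinate planes, each plane is processed with lightweight 2D convolutions instead of a 3D-CNN, and the per-axis estimates from the two planes sharing each axis are fused. The tensor shapes, the max-pooling reduction, and the soft-argmax fusion are illustrative assumptions.

```python
import torch
import torch.nn as nn

class OrthographicPoseSketch(nn.Module):
    """Toy sketch of per-plane pose estimation (assumed design, not the paper's code)."""

    def __init__(self, channels=32, num_joints=15):
        super().__init__()
        # One lightweight 2D head per coordinate plane (xy, xz, yz),
        # replacing a costly 3D-CNN over the full volume.
        def head():
            return nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
                nn.Conv2d(channels, num_joints, 1),
            )
        self.head_xy, self.head_xz, self.head_yz = head(), head(), head()

    @staticmethod
    def soft_argmax_1d(heatmap_1d):
        # Expected coordinate along one axis from a softmax-normalized 1D heatmap.
        prob = heatmap_1d.softmax(dim=-1)
        grid = torch.arange(heatmap_1d.shape[-1], dtype=prob.dtype, device=prob.device)
        return (prob * grid).sum(dim=-1)

    def forward(self, volume):
        # volume: (B, C, X, Y, Z) feature volume built from multi-view 2D heatmaps.
        # Orthographic projection: collapse one spatial axis (max over it) to get
        # three 2D feature maps on the xy-, xz- and yz-planes.
        feat_xy = volume.max(dim=4).values          # (B, C, X, Y)
        feat_xz = volume.max(dim=3).values          # (B, C, X, Z)
        feat_yz = volume.max(dim=2).values          # (B, C, Y, Z)

        hm_xy = self.head_xy(feat_xy)               # (B, J, X, Y)
        hm_xz = self.head_xz(feat_xz)               # (B, J, X, Z)
        hm_yz = self.head_yz(feat_yz)               # (B, J, Y, Z)

        # Each axis appears on two planes; estimate it from both and average (fuse).
        x = 0.5 * (self.soft_argmax_1d(hm_xy.max(dim=3).values) +
                   self.soft_argmax_1d(hm_xz.max(dim=3).values))
        y = 0.5 * (self.soft_argmax_1d(hm_xy.max(dim=2).values) +
                   self.soft_argmax_1d(hm_yz.max(dim=3).values))
        z = 0.5 * (self.soft_argmax_1d(hm_xz.max(dim=2).values) +
                   self.soft_argmax_1d(hm_yz.max(dim=2).values))
        return torch.stack([x, y, z], dim=-1)       # (B, J, 3) pose in voxel coordinates

# Usage: a random 32-channel 64^3 volume standing in for one person proposal.
pose = OrthographicPoseSketch()(torch.randn(1, 32, 64, 64, 64))
print(pose.shape)  # torch.Size([1, 15, 3])
```

The point of the sketch is the cost argument: three stacks of 2D convolutions over O(N^2) plane features replace 3D convolutions over O(N^3) voxels, which is where the reported speedup over VoxelPose comes from.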