Paper Title
Quo Vadis, Skeleton Action Recognition?
Paper Authors
Paper Abstract
In this paper, we study current and upcoming frontiers across the landscape of skeleton-based human action recognition. To study skeleton action recognition in the wild, we introduce Skeletics-152, a curated and 3-D pose-annotated subset of RGB videos sourced from Kinetics-700, a large-scale action dataset. We extend our study to include out-of-context actions by introducing Skeleton-Mimetics, a dataset derived from the recently introduced Mimetics dataset. We also introduce Metaphorics, a dataset with caption-style annotated YouTube videos of the popular social game Dumb Charades and interpretative dance performances. We benchmark state-of-the-art models on the NTU-120 dataset and provide a multi-layered assessment of the results. The results from benchmarking the top performers of NTU-120 on the newly introduced datasets reveal the challenges and domain gap induced by actions in the wild. Overall, our work characterizes the strengths and limitations of existing approaches and datasets. Via the introduced datasets, our work enables new frontiers for human action recognition.