Paper Title

Compound Prototype Matching for Few-shot Action Recognition

Paper Authors

Yifei Huang, Lijin Yang, Yoichi Sato

Paper Abstract

Few-shot action recognition aims to recognize novel action classes using only a small number of labeled training samples. In this work, we propose a novel approach that first summarizes each video into compound prototypes consisting of a group of global prototypes and a group of focused prototypes, and then compares video similarity based on the prototypes. Each global prototype is encouraged to summarize a specific aspect from the entire video, for example, the start/evolution of the action. Since no clear annotation is provided for the global prototypes, we use a group of focused prototypes to focus on certain timestamps in the video. We compare video similarity by matching the compound prototypes between the support and query videos. The global prototypes are directly matched to compare videos from the same perspective, for example, to compare whether two actions start similarly. For the focused prototypes, since actions have various temporal variations in the videos, we apply bipartite matching to allow the comparison of actions with different temporal positions and shifts. Experiments demonstrate that our proposed method achieves state-of-the-art results on multiple benchmarks.
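To make the matching scheme described in the abstract more concrete, the sketch below illustrates how a support video and a query video could be compared once each has been summarized into compound prototypes: global prototypes are matched directly (i-th with i-th), while focused prototypes are matched with bipartite matching (Hungarian algorithm). This is a minimal illustration, not the paper's implementation; the use of cosine similarity, the equal weighting of the two scores, and the prototype counts and feature dimension are assumptions made for this sketch.

```python
# Minimal sketch of compound prototype matching, assuming cosine similarity
# as the per-prototype score and equal weighting of the two matching terms.
import numpy as np
from scipy.optimize import linear_sum_assignment


def cosine_sim_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between rows of a (m, d) and b (n, d)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def compound_prototype_similarity(support: dict, query: dict) -> float:
    """Compare two videos via their compound prototypes.

    Each video is summarized as:
      - 'global':  (G, d) array, one prototype per video-level aspect
                   (e.g., how the action starts or evolves)
      - 'focused': (F, d) array, prototypes attending to certain timestamps
    """
    # Global prototypes are matched directly, i-th with i-th, so the two
    # videos are compared from the same perspective (e.g., do they start
    # similarly?).
    g_sim = cosine_sim_matrix(support["global"], query["global"])
    global_score = float(np.mean(np.diag(g_sim)))

    # Focused prototypes: actions occur at different temporal positions and
    # offsets, so bipartite matching finds the one-to-one assignment that
    # maximizes the total similarity before averaging.
    f_sim = cosine_sim_matrix(support["focused"], query["focused"])
    row_idx, col_idx = linear_sum_assignment(-f_sim)  # negate to maximize
    focused_score = float(f_sim[row_idx, col_idx].mean())

    # Equal weighting of the two terms is an assumption of this sketch.
    return 0.5 * global_score + 0.5 * focused_score


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, G, F = 128, 4, 8  # placeholder feature dimension and prototype counts
    support = {"global": rng.normal(size=(G, d)), "focused": rng.normal(size=(F, d))}
    query = {"global": rng.normal(size=(G, d)), "focused": rng.normal(size=(F, d))}
    print(f"compound similarity: {compound_prototype_similarity(support, query):.4f}")
```

In a few-shot episode, this per-pair score would be computed between the query and each support video, and the query assigned to the class of the most similar support; how the prototypes themselves are produced from video features is not covered by this sketch.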
