Paper Title
Action Graphs: Weakly-supervised Action Localization with Graph Convolution Networks
Authors
Abstract
We present a method for weakly-supervised action localization based on graph convolutions. In order to find and classify video time segments that correspond to relevant action classes, a system must be able both to identify discriminative time segments in each video and to determine the full extent of each action. Achieving this with weak video-level labels requires the system to use similarity and dissimilarity between moments across videos in the training data to understand both how an action appears and which sub-actions comprise the action's full extent. However, current methods do not make explicit use of similarity between video moments to inform the localization and classification predictions. We present a novel method that uses graph convolutions to explicitly model similarity between video moments. Our method utilizes similarity graphs that encode appearance and motion, and pushes the state of the art on THUMOS '14, ActivityNet 1.2, and Charades for weakly-supervised action localization.
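To make the core idea concrete, below is a minimal sketch of one graph-convolution step over a similarity graph of video segments. This is not the authors' implementation; the shapes, the cosine-similarity adjacency, and the single-layer GCN-style update are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical setup: T video segments, each with a d-dimensional
# feature (e.g. an appearance or motion descriptor).
rng = np.random.default_rng(0)
T, d, d_out = 6, 8, 4
X = rng.standard_normal((T, d))

# Similarity graph: pairwise cosine similarity between segment features,
# clipped to non-negative affinities, with self-loops added.
Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
A = np.maximum(Xn @ Xn.T, 0.0)
A = A + np.eye(T)

# Symmetrically normalize the adjacency (a standard GCN normalization).
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
A_hat = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# One graph-convolution layer with a ReLU: each segment's feature is
# updated by a similarity-weighted average of its neighbors' features.
W = rng.standard_normal((d, d_out))
H = np.maximum(A_hat @ X @ W, 0.0)

print(H.shape)  # (6, 4)
```

The point of the propagation step is that a segment whose moment resembles other (possibly more discriminative) moments inherits evidence from them, which is how similarity between video moments can inform localization under weak labels.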