Directional Temporal Modeling for Action Recognition

Paper title

Directional Temporal Modeling for Action Recognition

Authors

Xinyu Li, Bing Shuai, Joseph Tighe

Abstract

Many current activity recognition models use 3D convolutional neural networks (e.g., I3D, I3D-NL) to generate local spatial-temporal features. However, such features do not encode clip-level ordered temporal information. In this paper, we introduce a channel-independent directional convolution (CIDC) operation, which learns to model the temporal evolution among local features. By applying multiple CIDC units, we construct a lightweight network that models the clip-level temporal evolution across multiple spatial scales. Our CIDC network can be attached to any activity recognition backbone network. We evaluate our method on four popular activity recognition datasets and consistently improve upon state-of-the-art techniques. We further visualize the activation map of our CIDC network and show that it is able to focus on more meaningful, action-related parts of the frame.
