Res3ATN -- Deep 3D Residual Attention Network for Hand Gesture Recognition in Videos

Paper Title

Res3ATN -- Deep 3D Residual Attention Network for Hand Gesture Recognition in Videos

Authors

Naina Dhingra, Andreas Kunz

Abstract

Hand gesture recognition in videos is a challenging task. In this paper, we use a 3D residual attention network that is trained end to end for hand gesture recognition. By stacking multiple attention blocks, we build a 3D network that generates different features at each attention block. Our 3D attention-based residual network (Res3ATN) can be built and extended to very deep layers. Using this network, we perform an extensive comparison with other 3D networks on three publicly available datasets: the performance of Res3ATN is compared to that of C3D, ResNet-10, and ResNeXt-101. We also study and evaluate our baseline network with different numbers of attention blocks. The comparison shows that the 3D residual attention network with 3 attention blocks is robust in attention learning and classifies the gestures with better accuracy, thus outperforming existing networks.
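The abstract does not spell out how an attention block combines its branches. As a rough illustration only, the sketch below assumes the residual attention formulation known from 2D residual attention networks, out = (1 + M(x)) * T(x), where T is a trunk branch and M a soft mask squashed to (0, 1). The names `attention_block`, `stack_blocks`, `trunk`, and `mask` are hypothetical, and the elementwise functions stand in for what would be 3D convolutional branches in a real network.

```python
import math


def sigmoid(v):
    """Logistic function; squashes a mask activation into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def map3d(fn, volume):
    """Apply fn to every element of a nested [frames][rows][cols] list."""
    return [[[fn(v) for v in row] for row in frame] for frame in volume]


def attention_block(volume, trunk, mask):
    """Assumed residual attention: out = (1 + sigmoid(mask(x))) * trunk(x).

    trunk and mask are elementwise placeholders (hypothetical), not the
    convolutional branches an actual network would use.
    """
    return map3d(lambda v: (1.0 + sigmoid(mask(v))) * trunk(v), volume)


def stack_blocks(volume, blocks):
    """Pass the volume through a sequence of (trunk, mask) attention blocks."""
    out = volume
    for trunk, mask in blocks:
        out = attention_block(out, trunk, mask)
    return out


def identity(v):
    return v


# Toy "clip": 2 frames of 2x2 values (illustrative data, not from the paper).
clip = [[[0.0, 1.0], [2.0, 3.0]], [[4.0, 5.0], [6.0, 7.0]]]

# Three stacked blocks, matching the block count the abstract reports as best.
blocks = [(identity, identity)] * 3
features = stack_blocks(clip, blocks)
```

Because the mask only rescales the trunk output by a factor between 1 and 2, a block under this assumption can amplify features without ever zeroing them out, which is the usual argument for why such blocks can be stacked deeply.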
