Paper Title


Recognizing Micro-Expression in Video Clip with Adaptive Key-Frame Mining

Authors

Min Peng, Chongyang Wang, Yuan Gao, Tao Bi, Tong Chen, Yu Shi, Xiang-Dong Zhou

Abstract


As a spontaneous expression of emotion on the face, a micro-expression reveals underlying emotion that cannot be controlled by humans. In a micro-expression, facial movement is transient and sparsely localized in time. However, existing representations learned from full video clips with various deep learning techniques are usually redundant. In addition, methods utilizing the single apex frame of each video clip require expert annotations and sacrifice temporal dynamics. To simultaneously localize and recognize such fleeting facial movements, we propose a novel end-to-end deep learning architecture, referred to as the adaptive key-frame mining network (AKMNet). Operating on the video clip of a micro-expression, AKMNet is able to learn a discriminative spatio-temporal representation by combining spatial features of self-learned local key frames with their global-temporal dynamics. Theoretical analysis and empirical evaluation show that the proposed approach improves recognition accuracy in comparison with state-of-the-art methods on multiple benchmark datasets.
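The abstract describes the core idea of AKMNet at a high level: score frames of a clip, adaptively select a few key frames carrying the transient facial movement, and fuse their spatial features with clip-level temporal dynamics. The sketch below is not the authors' implementation; it is a minimal NumPy illustration of that selection-and-fusion pattern, where the random feature matrix, the fixed scoring vector `w`, and the top-`k` selection are all simplifying assumptions standing in for trainable modules.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for per-frame spatial features from a CNN backbone:
# a clip of T frames, each embedded into a D-dimensional vector.
T, D = 20, 8
frame_features = rng.normal(size=(T, D))

# Hypothetical scoring vector: in a trainable network this would be a
# learned module; a fixed random vector keeps the sketch self-contained.
w = rng.normal(size=D)

# Score each frame and softmax-normalize, giving soft attention weights
# over time.
logits = frame_features @ w
weights = np.exp(logits - logits.max())
weights /= weights.sum()

# "Mine" key frames: keep the k highest-weighted frames (local cues).
k = 3
key_idx = np.sort(np.argsort(weights)[-k:])
key_frames = frame_features[key_idx]

# Fuse a local representation (mean of key frames) with a weighted
# average over the whole clip, a crude stand-in for the
# global-temporal-dynamics branch.
local_repr = key_frames.mean(axis=0)        # shape (D,)
global_repr = weights @ frame_features      # shape (D,)
clip_repr = np.concatenate([local_repr, global_repr])  # shape (2*D,)
```

In the actual AKMNet, the frame scores and the fused representation are learned end-to-end from the recognition loss, so key-frame selection needs no apex-frame annotation.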
