Paper Title

Snapture -- A Novel Neural Architecture for Combined Static and Dynamic Hand Gesture Recognition

Authors

Hassan Ali, Doreen Jirak, Stefan Wermter

Abstract

As robots are expected to get more involved in people's everyday lives, frameworks that enable intuitive user interfaces are in demand. Hand gesture recognition systems provide a natural way of communication and, thus, are an integral part of seamless Human-Robot Interaction (HRI). Recent years have witnessed an immense evolution of computational models powered by deep learning. However, state-of-the-art models fall short in expanding across different gesture domains, such as emblems and co-speech. In this paper, we propose a novel hybrid hand gesture recognition system. Our architecture enables learning both static and dynamic gestures: by capturing a so-called "snapshot" of the gesture performance at its peak, we integrate the hand pose along with the dynamic movement. Moreover, we present a method for analyzing the motion profile of a gesture to uncover its dynamic characteristics and which allows regulating a static channel based on the amount of motion. Our evaluation demonstrates the superiority of our approach on two gesture benchmarks compared to a CNNLSTM baseline. We also provide an analysis on a gesture class basis that unveils the potential of our Snapture architecture for performance improvements. Thanks to its modular implementation, our framework allows the integration of other multimodal data like facial expressions and head tracking, which are important cues in HRI scenarios, into one architecture. Thus, our work contributes both to gesture recognition research and machine learning applications for non-verbal communication with robots.
