Paper Title
Self-Supervised Face Presentation Attack Detection with Dynamic Grayscale Snippets
Authors
Abstract
Face presentation attack detection (PAD) plays an important role in defending face recognition systems against presentation attacks. The success of PAD largely relies on supervised learning, which requires large amounts of labeled data; this is especially challenging for videos and often requires expert knowledge. To avoid the costly collection of labeled data, this paper presents a novel method for self-supervised video representation learning via motion prediction. To achieve this, we exploit temporal consistency using three RGB frames acquired at three different times in a video sequence. The frames are converted to grayscale images, each assigned to one of three channels, R (red), G (green), and B (blue), to form a dynamic grayscale snippet (DGS). Labels are then automatically generated from the different temporal lengths used to construct the DGS, increasing temporal diversity, which proves very helpful for the downstream task. Benefiting from the self-supervised nature of our method, we report results that outperform existing methods on four public benchmarks, namely Replay-Attack, MSU-MFSD, CASIA-FASD, and OULU-NPU. Explainability analysis is carried out with the LIME and Grad-CAM techniques to visualize the most important features the model uses in the DGS.
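To make the DGS construction concrete, the following minimal Python sketch stacks three grayscale frames sampled at different times into the R, G, and B channels of a single image, as the abstract describes. The function name `make_dgs`, the starting index `t0`, and the uniform sampling stride `delta` are illustrative assumptions, not the authors' exact sampling scheme; OpenCV is assumed for video decoding.

```python
import numpy as np
import cv2  # assumed available for video decoding and grayscale conversion


def make_dgs(video_path: str, t0: int, delta: int) -> np.ndarray:
    """Build a dynamic grayscale snippet (DGS) from three video frames.

    Three frames at indices t0, t0 + delta, and t0 + 2 * delta are
    read, converted to grayscale, and stacked as the three channels
    of one image. Varying `delta` changes the temporal length of the
    snippet, which the paper uses to generate self-supervision labels.
    """
    cap = cv2.VideoCapture(video_path)
    grays = []
    for t in (t0, t0 + delta, t0 + 2 * delta):
        cap.set(cv2.CAP_PROP_POS_FRAMES, t)
        ok, frame = cap.read()
        if not ok:
            raise ValueError(f"could not read frame {t}")
        grays.append(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))
    cap.release()
    # Stack the three grayscale frames as channels -> shape (H, W, 3).
    return np.stack(grays, axis=-1)
```

In this sketch, a larger `delta` spreads the three frames further apart in time, so motion between them (e.g., natural facial micro-movements versus the rigid motion of a printed photo or replayed screen) is encoded as cross-channel differences in the resulting image.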