Paper Title
Video Frame Interpolation Based on Deformable Kernel Region
Paper Authors
Paper Abstract
Video frame interpolation has recently become increasingly prevalent in the computer vision field. To date, many deep-learning-based studies have achieved great success. Most of them are based on optical flow information, on interpolation kernels, or on a combination of the two. However, these methods ignore the grid restriction on the position of the kernel region when synthesizing each target pixel. This limitation prevents them from adapting well to the irregularity of object shapes and the uncertainty of motion, which may cause irrelevant reference pixels to be used for interpolation. To solve this problem, we revisit deformable convolution for video interpolation: it breaks the fixed grid restriction on the kernel region, makes the distribution of reference points better fit the shape of the object, and thus warps a more accurate interpolated frame. Experiments conducted on four datasets demonstrate the superior performance of the proposed model in comparison to state-of-the-art alternatives.
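To make the core idea concrete, the sketch below shows how a deformable kernel region can be applied to frame synthesis. This is a minimal illustration using `torchvision.ops.deform_conv2d`, not the authors' released code: the module structure, parameter choices, and the `DeformableSynthesis`/`offset_head` names are all hypothetical, assuming only that per-pixel offsets are predicted from the input frames and used to displace the kernel's sampling positions.

```python
# Minimal, illustrative sketch (not the paper's implementation) of frame
# synthesis with a deformable kernel region. All names are hypothetical.
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class DeformableSynthesis(nn.Module):
    """Synthesizes an intermediate frame by sampling an input frame with
    per-pixel deformable kernels, so reference points need not lie on a
    fixed grid around each target pixel."""
    def __init__(self, channels=3, kernel_size=5):
        super().__init__()
        self.kernel_size = kernel_size
        # Predict 2 * k * k offsets (an x and y displacement for each
        # kernel tap) per pixel from the two concatenated input frames.
        self.offset_head = nn.Conv2d(2 * channels,
                                     2 * kernel_size * kernel_size,
                                     kernel_size=3, padding=1)
        # Learnable synthesis kernel applied at the deformed sampling points.
        self.weight = nn.Parameter(
            torch.randn(channels, channels, kernel_size, kernel_size) * 0.01)

    def forward(self, frame0, frame1):
        offsets = self.offset_head(torch.cat([frame0, frame1], dim=1))
        pad = self.kernel_size // 2
        # Sample frame0 at grid positions displaced by the predicted offsets;
        # all-zero offsets reduce this to an ordinary fixed-grid convolution.
        return deform_conv2d(frame0, offsets, self.weight, padding=pad)

frame0, frame1 = torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64)
mid = DeformableSynthesis()(frame0, frame1)
print(mid.shape)  # torch.Size([1, 3, 64, 64])
```

The key design point this illustrates is that the offsets are predicted per output pixel, so the kernel region can deform to follow an object's shape and motion rather than being confined to a fixed square neighborhood.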