Paper Title
WALDO: Future Video Synthesis using Object Layer Decomposition and Parametric Flow Prediction
Paper Authors
Paper Abstract
This paper presents WALDO (WArping Layer-Decomposed Objects), a novel approach to the prediction of future video frames from past ones. Individual images are decomposed into multiple layers combining object masks and a small set of control points. The layer structure is shared across all frames in each video to build dense inter-frame connections. Complex scene motions are modeled by combining parametric geometric transformations associated with individual layers, and video synthesis is broken down into discovering the layers associated with past frames, predicting the corresponding transformations for upcoming ones and warping the associated object regions accordingly, and filling in the remaining image parts. Extensive experiments on multiple benchmarks, including urban videos (Cityscapes and KITTI) and videos featuring nonrigid motions (UCF-Sports and H3.6M), show that our method consistently outperforms the state of the art by a significant margin in every case. Code, pretrained models, and video samples synthesized by our approach can be found on the project webpage https://16lemoing.github.io/waldo.
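To make the layered warp-and-composite idea in the abstract concrete, the sketch below illustrates the general scheme in PyTorch: each object layer (an RGB image plus a soft mask) is warped by its own predicted parametric transformation and the warped layers are blended into the next frame. This is only a minimal illustration, not the authors' implementation; the per-layer affine transform, the function names (`warp_layer`, `composite_layers`), and the two-layer toy example are assumptions made here for clarity (WALDO itself derives its transformations from a small set of predicted control points and also inpaints disoccluded regions).

```python
# Minimal sketch of per-layer parametric warping and compositing (illustrative only;
# not the WALDO codebase). Assumes one simple affine transform per layer.
import torch
import torch.nn.functional as F


def warp_layer(rgb, mask, theta):
    """Warp one RGB layer and its soft mask with a 2x3 affine matrix `theta`."""
    n, _, h, w = rgb.shape
    grid = F.affine_grid(theta, size=(n, 3, h, w), align_corners=False)
    warped_rgb = F.grid_sample(rgb, grid, align_corners=False)
    warped_mask = F.grid_sample(mask, grid, align_corners=False)
    return warped_rgb, warped_mask


def composite_layers(layers):
    """Blend warped (rgb, mask) layers, ordered background to foreground."""
    frame = torch.zeros_like(layers[0][0])
    for rgb, mask in layers:
        frame = mask * rgb + (1.0 - mask) * frame
    return frame


if __name__ == "__main__":
    n, h, w = 1, 128, 256
    background = (torch.rand(n, 3, h, w), torch.ones(n, 1, h, w))
    obj_rgb, obj_mask = torch.rand(n, 3, h, w), torch.rand(n, 1, h, w)

    # Hypothetical predicted motion for the object layer: a small rightward shift.
    theta = torch.tensor([[[1.0, 0.0, -0.1],
                           [0.0, 1.0,  0.0]]])
    warped_obj = warp_layer(obj_rgb, obj_mask, theta)

    next_frame = composite_layers([background, warped_obj])
    print(next_frame.shape)  # torch.Size([1, 3, 128, 256])
```

In the full method, the transformations for future frames are predicted from the layer decomposition of past frames, and regions left empty after warping are filled in by an inpainting step.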