Paper Title

Temporal Shift GAN for Large Scale Video Generation

Paper Authors

Andres Munoz, Mohammadreza Zolfaghari, Max Argus, Thomas Brox

Abstract

Video generation models have become increasingly popular in the last few years; however, the standard 2D architectures used today lack natural spatio-temporal modelling capabilities. In this paper, we present a network architecture for video generation that models spatio-temporal consistency without resorting to costly 3D architectures. The architecture facilitates information exchange between neighboring time points, which improves the temporal consistency of both the high-level structure and the low-level details of the generated frames. The approach achieves state-of-the-art quantitative performance, as measured by the Inception Score on the UCF-101 dataset, as well as better qualitative results. We also introduce a new quantitative measure (S3) that uses downstream tasks for evaluation. Moreover, we present a new multi-label dataset, MaisToy, which enables us to evaluate the generalization of the model.
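The "information exchange between neighboring time points" that the abstract refers to follows the temporal-shift idea: a fraction of a frame's feature channels is replaced by the corresponding channels of the previous or next frame, so a plain 2D network sees mixed temporal context at essentially zero extra compute. The following is a minimal illustrative sketch of that mechanism, not the paper's implementation; the function name, the shift fraction, and the use of NumPy (rather than a deep-learning framework) are assumptions made for clarity.

```python
import numpy as np

def temporal_shift(x, shift_frac=0.125):
    """Shift a fraction of feature channels along the time axis.

    x: array of shape (T, C, H, W) -- features for a clip of T frames.
    The first `shift_frac` of channels takes features from the previous
    frame, the next `shift_frac` from the next frame, and the remaining
    channels stay in place. This mixes information between neighboring
    time points without any 3D convolution. (Illustrative sketch only;
    the shift fraction is a hypothetical choice.)
    """
    T, C, H, W = x.shape
    fold = int(C * shift_frac)
    out = np.zeros_like(x)
    # channels [0, fold): copied from the previous frame (first frame gets zeros)
    out[1:, :fold] = x[:-1, :fold]
    # channels [fold, 2*fold): copied from the next frame (last frame gets zeros)
    out[:-1, fold:2 * fold] = x[1:, fold:2 * fold]
    # remaining channels are left untouched
    out[:, 2 * fold:] = x[:, 2 * fold:]
    return out

# Usage: shift features for an 8-frame clip with 64 channels per frame.
clip = np.random.randn(8, 64, 4, 4)
shifted = temporal_shift(clip)
```

In a generator built this way, such a shift would be applied before the 2D convolutions of each block, so each frame's features carry context from its neighbors while the convolutions themselves remain purely spatial.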
