Accelerating Diffusion Models via Early Stop of the Diffusion Process

Paper Title

Accelerating Diffusion Models via Early Stop of the Diffusion Process

Authors

Zhaoyang Lyu, Xudong Xu, Ceyuan Yang, Dahua Lin, Bo Dai

Abstract

Denoising Diffusion Probabilistic Models (DDPMs) have achieved impressive performance on various generation tasks. A DDPM models the reverse of a process that gradually diffuses the data distribution into a Gaussian distribution, so generating a sample can be regarded as iteratively denoising randomly sampled Gaussian noise. In practice, however, DDPMs often need hundreds or even thousands of denoising steps to obtain a high-quality sample from Gaussian noise, leading to extremely low inference efficiency. In this work, we propose a principled acceleration strategy for DDPMs, referred to as Early-Stopped DDPM (ES-DDPM). The key idea is to stop the diffusion process early: only the first few diffusion steps are considered, and the reverse denoising process starts from a non-Gaussian distribution. By further adopting a powerful pre-trained generative model in ES-DDPM, such as a GAN or a VAE, sampling from the target non-Gaussian distribution can be achieved efficiently by diffusing samples obtained from the pre-trained generative model. In this way, the number of required denoising steps is significantly reduced. At the same time, the sample quality of ES-DDPM also improves substantially, outperforming both the vanilla DDPM and the adopted pre-trained generative model. In extensive experiments on CIFAR-10, CelebA, ImageNet, LSUN-Bedroom and LSUN-Cat, ES-DDPM achieves promising acceleration and performance improvements over representative baseline methods. Moreover, ES-DDPM demonstrates several attractive properties: it is orthogonal to existing acceleration methods, and it enables both global semantic control and local pixel-level control in image generation.
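
The sampling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard DDPM formulation: a linear beta schedule, the closed-form forward sample x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise, and ancestral sampling with variance beta_t. The names `linear_beta_schedule`, `diffuse_to_step`, `es_ddpm_sample`, `t_stop` and `dummy_eps_model` are hypothetical, the schedule values are conventional defaults rather than values from the paper, and random arrays stand in for samples from a pre-trained GAN or VAE.

```python
import numpy as np


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    # Assumed noise schedule; the abstract does not specify one.
    return np.linspace(beta_start, beta_end, num_steps)


def diffuse_to_step(x0, t_stop, alpha_bar, rng):
    # Closed-form forward diffusion: draw x_t ~ q(x_t | x_0) at t = t_stop (1-indexed).
    a = alpha_bar[t_stop - 1]
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * noise


def es_ddpm_sample(x0_from_generator, eps_model, betas, t_stop, rng):
    # Start the reverse process at step t_stop instead of the full length T.
    if not 1 <= t_stop <= len(betas):
        raise ValueError("t_stop must lie in [1, len(betas)]")
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)

    # Diffuse the generator's samples for only t_stop steps (non-Gaussian start).
    x = diffuse_to_step(x0_from_generator, t_stop, alpha_bar, rng)

    # Standard DDPM ancestral sampling from t_stop down to 1.
    for t in range(t_stop, 0, -1):
        i = t - 1
        eps = eps_model(x, t)  # predicted noise; a trained network in practice
        mean = (x - betas[i] / np.sqrt(1.0 - alpha_bar[i]) * eps) / np.sqrt(alphas[i])
        if t > 1:
            x = mean + np.sqrt(betas[i]) * rng.standard_normal(x.shape)
        else:
            x = mean  # no noise is added at the final step
    return x


# Usage with stand-ins: a dummy noise predictor and random "generator" samples.
calls = []


def dummy_eps_model(x, t):
    calls.append(t)  # record how many denoising steps were actually run
    return np.zeros_like(x)


rng = np.random.default_rng(0)
betas = linear_beta_schedule(1000)
fake_generator_samples = rng.standard_normal((4, 3, 8, 8))
out = es_ddpm_sample(fake_generator_samples, dummy_eps_model, betas, t_stop=100, rng=rng)
```

In this sketch the denoiser is called 100 times rather than 1000, which is where the acceleration comes from. Choosing the stopping step trades speed against how much the denoiser can correct the generator's samples; the value 100 here is arbitrary.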
