Paper Title

Training β-VAE by Aggregating a Learned Gaussian Posterior with a Decoupled Decoder

Authors

Jianning Li, Jana Fragemann, Seyed-Ahmad Ahmadi, Jens Kleesiek, Jan Egger

Abstract


The reconstruction loss and the Kullback-Leibler divergence (KLD) loss in a variational autoencoder (VAE) often play antagonistic roles, and tuning the weight of the KLD loss in $β$-VAE to achieve a balance between the two losses is a tricky and dataset-specific task. As a result, current practices in VAE training often result in a trade-off between the reconstruction fidelity and the continuity/disentanglement of the latent space, if the weight $β$ is not carefully tuned. In this paper, we present intuitions and a careful analysis of the antagonistic mechanism of the two losses, and propose, based on these insights, a simple yet effective two-stage method for training a VAE. Specifically, the method aggregates a learned Gaussian posterior $z \sim q_θ(z|x)$ with a decoder decoupled from the KLD loss, which is trained to learn a new conditional distribution $p_ϕ(x|z)$ of the input data $x$. Experimentally, we show that the aggregated VAE maximally satisfies the Gaussian assumption about the latent space, while still achieving a reconstruction error comparable to when the latent space is only loosely regularized by $\mathcal{N}(\mathbf{0},I)$. The proposed approach does not require hyperparameter (i.e., the KLD weight $β$) tuning for a specific dataset, as is required in common VAE training practices. We evaluate the method using a medical dataset intended for 3D skull reconstruction and shape completion, and the results indicate promising generative capabilities of the VAE trained using the proposed method. Moreover, through guided manipulation of the latent variables, we establish a connection between existing autoencoder (AE)-based approaches and generative approaches, such as VAE, for the shape completion problem. Code and pre-trained weights are available at https://github.com/Jianningli/skullVAE
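The core of the second stage described above is that the decoder is retrained on samples from the already-learned posterior, with the reconstruction loss only, so the KLD term never pulls on the decoder's parameters. The following is a minimal NumPy sketch of that idea on a toy linear model; the fixed linear encoder here is only a stand-in for a frozen stage-1 $q_θ(z|x)$, and all names and values are illustrative, not from the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 points in R^2 lying near a 1-D subspace.
X = rng.normal(size=(200, 1)) @ np.array([[2.0, -1.0]]) + 0.05 * rng.normal(size=(200, 2))

# Stand-in for the *frozen* stage-1 encoder q_theta(z|x):
# a fixed linear map for the posterior mean and a fixed log-variance.
W_enc = np.array([[0.4], [-0.2]])   # (2, 1): mu = x @ W_enc
log_var = np.array([-4.0])          # small, fixed posterior variance

def sample_posterior(x):
    """Reparameterized sample z ~ q_theta(z|x) = N(mu, exp(log_var))."""
    mu = x @ W_enc
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

# Stage 2: train a *decoupled* linear decoder x_hat = z @ W_dec using
# the reconstruction loss only -- no KLD term touches the decoder.
W_dec = rng.normal(size=(1, 2)) * 0.01
lr = 0.05
losses = []
for step in range(500):
    z = sample_posterior(X)          # (200, 1), encoder stays frozen
    x_hat = z @ W_dec                # (200, 2)
    err = x_hat - X
    losses.append(float(np.mean(err ** 2)))
    grad = 2.0 * z.T @ err / len(X)  # gradient of the MSE w.r.t. W_dec
    W_dec -= lr * grad

print(f"recon MSE: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Because the encoder is frozen and the KLD term is absent, the decoder is free to fit the input distribution conditioned on the (already Gaussian-regularized) latent codes, which is the decoupling the abstract refers to.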
