Paper Title

SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization

Authors

Takida, Yuhta, Shibuya, Takashi, Liao, WeiHsiang, Lai, Chieh-Hsin, Ohmura, Junki, Uesaka, Toshimitsu, Murata, Naoki, Takahashi, Shusuke, Kumakura, Toshiyuki, Mitsufuji, Yuki

Abstract

One noted issue of vector-quantized variational autoencoder (VQ-VAE) is that the learned discrete representation uses only a fraction of the full capacity of the codebook, also known as codebook collapse. We hypothesize that the training scheme of VQ-VAE, which involves some carefully designed heuristics, underlies this issue. In this paper, we propose a new training scheme that extends the standard VAE via novel stochastic dequantization and quantization, called stochastically quantized variational autoencoder (SQ-VAE). In SQ-VAE, we observe a trend that the quantization is stochastic at the initial stage of the training but gradually converges toward a deterministic quantization, which we call self-annealing. Our experiments show that SQ-VAE improves codebook utilization without using common heuristics. Furthermore, we empirically show that SQ-VAE is superior to VAE and VQ-VAE in vision- and speech-related tasks.
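The stochastic quantization described above can be illustrated with a minimal sketch: a code index is sampled with probability proportional to exp(-||z - e_k||² / (2s²)), so a large variance s² makes the assignment noticeably random while a small variance approaches deterministic nearest-neighbor quantization, mirroring the self-annealing trend. This is an illustrative assumption-laden example (the function name, codebook values, and variance schedule are hypothetical), not the authors' implementation.

```python
import numpy as np

def stochastic_quantize(z, codebook, s2, rng):
    """Sample a code index k with p(k) proportional to
    exp(-||z - e_k||^2 / (2 * s2)).
    As s2 -> 0 this approaches deterministic nearest-neighbor
    quantization (illustrative sketch, not the paper's code)."""
    d2 = np.sum((codebook - z) ** 2, axis=1)   # squared distance to each code
    logits = -d2 / (2.0 * s2)
    p = np.exp(logits - logits.max())          # numerically stable softmax
    p /= p.sum()
    k = rng.choice(len(codebook), p=p)
    return k, codebook[k]

rng = np.random.default_rng(0)
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]])  # toy codebook
z = np.array([0.9, 1.1])                                   # encoder output

# Small variance: effectively deterministic, picks the nearest code (index 1).
k_small, e_small = stochastic_quantize(z, codebook, 1e-4, rng)

# Large variance: assignments spread over several codes across draws,
# which is the regime that keeps more of the codebook in use early on.
k_large, e_large = stochastic_quantize(z, codebook, 10.0, rng)
```

In SQ-VAE the variance is learned rather than hand-scheduled, which is what drives the observed self-annealing from the stochastic regime toward the deterministic one.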
