Estimating Gradients for Discrete Random Variables by Sampling without Replacement

Paper title

Estimating Gradients for Discrete Random Variables by Sampling without Replacement

Authors

Wouter Kool, Herke van Hoof, Max Welling

Abstract

We derive an unbiased estimator for expectations over discrete random variables based on sampling without replacement, which reduces variance as it avoids duplicate samples. We show that our estimator can be derived as the Rao-Blackwellization of three different estimators. Combining our estimator with REINFORCE, we obtain a policy gradient estimator and we reduce its variance using a built-in control variate which is obtained without additional model evaluations. The resulting estimator is closely related to other gradient estimators. Experiments with a toy problem, a categorical Variational Auto-Encoder and a structured prediction problem show that our estimator is the only estimator that is consistently among the best estimators in both high and low entropy settings.
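
The abstract does not state the estimator's formula, so the following is only a minimal sketch of one way an unbiased without-replacement estimator can be written, under assumptions of ours: draw an unordered set S of k distinct outcomes, then weight each f(s) by p(s) times a "leave-one-out" ratio, which is the Rao-Blackwellization of the single-sample estimate f(b_1) given the set. The names `set_prob`, `set_estimate`, and `exact_mean_of_estimator` are hypothetical, and the brute-force enumeration over permutations is only feasible for tiny domains. The final check enumerates every possible set to confirm that the estimator's exact expectation matches E[f].

```python
from itertools import combinations, permutations

def set_prob(weights, S):
    """Probability of drawing the unordered set S when sampling
    sequentially without replacement, proportional to `weights`.
    Sums over all orders in which S could have been drawn."""
    total = sum(weights)
    prob = 0.0
    for order in permutations(S):
        remaining = total
        q = 1.0
        for i in order:
            q *= weights[i] / remaining
            remaining -= weights[i]
        prob += q
    return prob

def set_estimate(p, f, S):
    """Estimate of E[f] from one unordered sample S (assumed form):
    sum over s in S of p(s) * R(S, s) * f(s), where R(S, s) is the
    probability of S without s under the distribution with s removed,
    divided by the probability of S."""
    p_S = set_prob(p, S)
    estimate = 0.0
    for s in S:
        w = list(p)
        w[s] = 0.0  # remove s; set_prob renormalizes implicitly
        rest = tuple(i for i in S if i != s)
        ratio = set_prob(w, rest) / p_S
        estimate += p[s] * ratio * f[s]
    return estimate

def exact_mean_of_estimator(p, f, k):
    """Exact expectation of the estimator, by enumerating every set of size k."""
    return sum(set_prob(p, S) * set_estimate(p, f, S)
               for S in combinations(range(len(p)), k))

# Toy distribution and function values (illustrative only).
p = [0.5, 0.3, 0.15, 0.05]
f = [1.0, -2.0, 4.0, 10.0]
true_mean = sum(pi * fi for pi, fi in zip(p, f))
mean_k2 = exact_mean_of_estimator(p, f, 2)
print(true_mean, mean_k2)
```

With k = 1 the sketch reduces to the ordinary single-sample estimate f(s), and with k equal to the domain size it returns the exact expectation, since the only possible set is the whole domain.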
