Paper Title

Langevin Monte Carlo: random coordinate descent and variance reduction

Paper Authors

Zhiyan Ding, Qin Li

Paper Abstract

Langevin Monte Carlo (LMC) is a popular Bayesian sampling method. For log-concave distribution functions, the method converges exponentially fast, up to a controllable discretization error. However, the method requires the evaluation of a full gradient in each iteration, and for a problem on $\mathbb{R}^d$ this amounts to $d$ partial-derivative evaluations per iteration. The cost is high when $d\gg1$. In this paper, we investigate how to enhance computational efficiency by applying RCD (random coordinate descent) to LMC. The theory has two sides: (1) by blindly applying RCD to LMC, one replaces the full gradient with a randomly selected directional derivative in each iteration. Although the per-iteration cost is reduced, the total number of iterations needed to reach a preset error tolerance increases, so ultimately there is no computational gain. (2) We then incorporate variance reduction techniques, such as SAGA (stochastic average gradient) and SVRG (stochastic variance reduced gradient), into RCD-LMC. We prove that the cost is reduced compared with classical LMC, and that in the underdamped case convergence is achieved with the same number of iterations while each iteration requires only one directional derivative. This means we obtain the best possible computational cost in the underdamped-LMC framework.
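To make the estimators concrete, here is a minimal sketch, written for the overdamped dynamics and assuming a standard RCD/SVRG-style construction; the paper's exact estimators, scalings, and its underdamped variants may differ. The plain LMC iteration targeting $\pi \propto e^{-f}$ is the Euler-Maruyama discretization

$$x_{k+1} = x_k - \eta\,\nabla f(x_k) + \sqrt{2\eta}\,\xi_k, \qquad \xi_k \sim \mathcal{N}(0, I_d),$$

which consumes all $d$ partial derivatives per step. The RCD surrogate draws $i_k$ uniformly from $\{1,\dots,d\}$ and replaces $\nabla f(x_k)$ with the unbiased estimator $d\,\partial_{i_k} f(x_k)\,e_{i_k}$; an SVRG-style variance-reduced version additionally keeps a reference point $\tilde{x}$, whose full gradient $\nabla f(\tilde{x})$ is refreshed once per epoch, and uses

$$g_k = \nabla f(\tilde{x}) + d\,\bigl(\partial_{i_k} f(x_k) - \partial_{i_k} f(\tilde{x})\bigr)\, e_{i_k}.$$

The Python sketch below implements this overdamped variant (the function name lmc_rcd_svrg, the default parameters, and the toy Gaussian target are illustrative choices, not taken from the paper). Since $\partial_{i_k} f(\tilde{x})$ can be read off the cached $\nabla f(\tilde{x})$, each iteration costs only one new partial derivative.

```python
import numpy as np

def lmc_rcd_svrg(grad_full, grad_coord, x0, eta=1e-3, n_iters=20000, epoch_len=100, rng=None):
    """Overdamped LMC with a random-coordinate, SVRG-style gradient estimator.

    Illustrative sketch only; the paper's precise estimator, step sizes,
    and underdamped variant may differ.
      grad_full(x)     -- full gradient of f at x (length-d array)
      grad_coord(x, i) -- i-th partial derivative of f at x (scalar)
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    d = x.size
    samples = np.empty((n_iters, d))
    for k in range(n_iters):
        if k % epoch_len == 0:
            x_ref = x.copy()          # reference point for variance reduction
            g_ref = grad_full(x_ref)  # full gradient, refreshed once per epoch
        i = rng.integers(d)           # random coordinate, uniform on {0, ..., d-1}
        # Unbiased surrogate: g_ref + d * (d_i f(x) - d_i f(x_ref)) * e_i,
        # where d_i f(x_ref) is read off the cached g_ref.
        g = g_ref.copy()
        g[i] += d * (grad_coord(x, i) - g_ref[i])
        # Euler-Maruyama step of the overdamped Langevin dynamics
        x = x - eta * g + np.sqrt(2.0 * eta) * rng.standard_normal(d)
        samples[k] = x
    return samples

# Toy usage: sample a standard Gaussian on R^10, i.e. f(x) = ||x||^2 / 2.
if __name__ == "__main__":
    chain = lmc_rcd_svrg(lambda x: x, lambda x, i: x[i], x0=np.ones(10))
    print(chain[-5000:].mean(axis=0))  # entries should be close to 0
```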
