Paper Title

Asymptotically Optimal Exact Minibatch Metropolis-Hastings

Authors

Ruqi Zhang, A. Feder Cooper, Christopher De Sa

Abstract

Metropolis-Hastings (MH) is a commonly used MCMC algorithm, but it can be intractable on large datasets because each step requires computations over the whole dataset. In this paper, we study minibatch MH methods, which instead use subsamples to enable scaling. We observe that most existing minibatch MH methods are inexact (i.e., they may change the target distribution), and show that this inexactness can cause arbitrarily large errors in inference. We propose a new exact minibatch MH method, TunaMH, which exposes a tunable trade-off between its batch size and its theoretically guaranteed convergence rate. We prove a lower bound on the batch size that any minibatch MH method must use to retain exactness while guaranteeing fast convergence (the first such bound for minibatch MH), and show that TunaMH is asymptotically optimal in terms of the batch size. Empirically, we show that TunaMH outperforms other exact minibatch MH methods on robust linear regression, truncated Gaussian mixtures, and logistic regression.
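To make the full-dataset cost mentioned in the abstract concrete, here is a minimal sketch of standard random-walk MH. It is not TunaMH and is not taken from the paper. It assumes a toy model chosen only for illustration: Gaussian observations with known standard deviation `sigma`, an unknown mean `theta`, and a flat prior. The names `log_likelihood` and `full_batch_mh` are hypothetical.

```python
import math
import random


def log_likelihood(theta, data, sigma=1.0):
    """Gaussian log-likelihood up to an additive constant; sums over all N points."""
    return -sum((x - theta) ** 2 for x in data) / (2.0 * sigma ** 2)


def full_batch_mh(data, n_steps=2000, step_size=0.2, theta0=0.0, seed=0):
    """Random-walk MH with a flat prior on theta (illustrative assumption).

    Returns the list of sampled theta values and the acceptance rate.
    """
    rng = random.Random(seed)
    theta = theta0
    current_ll = log_likelihood(theta, data)
    chain = []
    accepted = 0
    for _ in range(n_steps):
        # Symmetric Gaussian proposal centred on the current state.
        proposal = theta + rng.gauss(0.0, step_size)
        # This evaluation touches every data point, so each step costs O(N).
        proposal_ll = log_likelihood(proposal, data)
        # Accept with probability min(1, exp(proposal_ll - current_ll)).
        log_u = math.log(1.0 - rng.random())  # argument lies in (0, 1]
        if log_u < proposal_ll - current_ll:
            theta, current_ll = proposal, proposal_ll
            accepted += 1
        chain.append(theta)
    return chain, accepted / n_steps
```

Because the accept/reject test uses the exact log-likelihood difference, this chain targets the true posterior. Minibatch methods replace `proposal_ll - current_ll` with an estimate from a subsample, which is where the exactness question raised in the abstract comes from.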
