Paper Title
Jump-Diffusion Langevin Dynamics for Multimodal Posterior Sampling
Paper Authors
Abstract
Bayesian methods of sampling from a posterior distribution are becoming increasingly popular due to their ability to precisely quantify the uncertainty of a model fit. Classical methods based on iterative random sampling and posterior evaluation, such as Metropolis-Hastings, are known to have desirable long-run mixing properties but are slow to converge. Gradient-based methods, such as Langevin dynamics (and its stochastic-gradient counterpart), exhibit favorable dimension dependence and fast mixing times for log-concave and "close to" log-concave distributions, but also suffer long escape times from local minimizers. Many contemporary applications, such as Bayesian neural networks, are both high-dimensional and highly multimodal. In this paper we investigate the performance of a hybrid Metropolis and Langevin sampling method, akin to jump diffusion, on a range of synthetic and real data, indicating that careful calibration of mixing sampling jumps with gradient-based chains significantly outperforms both pure gradient-based and pure sampling-based schemes.
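To make the hybrid scheme concrete, here is a minimal sketch of the general jump-diffusion idea on a toy bimodal target: the chain mostly takes unadjusted Langevin steps, but occasionally attempts a global Metropolis "jump" drawn from a broad proposal, which lets it hop between modes that the diffusion alone would rarely cross. This is an illustrative toy, not the paper's implementation; the target density, the function names (`jump_diffusion_sampler`), and all tuning parameters (`step`, `jump_prob`, `jump_scale`) are assumptions chosen for demonstration.

```python
import numpy as np

def log_density(x):
    # Toy bimodal target: equal mixture of N(-4, 1) and N(+4, 1)
    return np.logaddexp(-0.5 * (x + 4.0) ** 2, -0.5 * (x - 4.0) ** 2)

def grad_log_density(x):
    # Gradient of the log mixture via component responsibilities
    a = -0.5 * (x + 4.0) ** 2
    b = -0.5 * (x - 4.0) ** 2
    w = 1.0 / (1.0 + np.exp(b - a))  # responsibility of the first component
    return w * (-(x + 4.0)) + (1.0 - w) * (-(x - 4.0))

def jump_diffusion_sampler(n_steps=20000, step=0.05, jump_prob=0.05,
                           jump_scale=6.0, seed=0):
    """Alternate Langevin diffusion steps with occasional Metropolis jumps."""
    rng = np.random.default_rng(seed)
    x = 0.0
    samples = np.empty(n_steps)
    for t in range(n_steps):
        if rng.random() < jump_prob:
            # Jump move: independence Metropolis proposal from N(0, jump_scale^2)
            prop = jump_scale * rng.standard_normal()
            log_q_prop = -0.5 * (prop / jump_scale) ** 2
            log_q_curr = -0.5 * (x / jump_scale) ** 2
            log_alpha = (log_density(prop) - log_density(x)
                         + log_q_curr - log_q_prop)
            if np.log(rng.random()) < log_alpha:
                x = prop  # accept the jump
        else:
            # Diffusion move: one unadjusted Langevin step
            x = (x + step * grad_log_density(x)
                 + np.sqrt(2.0 * step) * rng.standard_normal())
        samples[t] = x
    return samples
```

With `jump_prob = 0`, the pure Langevin chain started at one mode tends to stay there for a long time; the occasional global jumps are what restore mixing across modes, which is the calibration trade-off the abstract highlights.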