Paper Title
Statistical Foundation of Variational Bayes Neural Networks
Paper Authors
Paper Abstract
Despite the popularity of Bayesian neural networks in recent years, their use remains somewhat limited in complex and big-data situations due to the computational cost associated with full posterior evaluations. Variational Bayes (VB) provides a useful alternative that circumvents the computational cost and time complexity associated with generating samples from the true posterior via Markov Chain Monte Carlo (MCMC) techniques. The efficacy of VB methods is well established in the machine learning literature. However, their potential broader impact is hindered by a lack of theoretical validity from a statistical perspective, and only a few results address the theoretical properties of VB, especially in non-parametric problems. In this paper, we establish the fundamental result of posterior consistency for the mean-field variational posterior (VP) for a feed-forward artificial neural network model. The paper underlines the conditions needed to guarantee that the VP concentrates in Hellinger neighborhoods of the true density function. Additionally, the role of the scale parameter and its influence on the convergence rates is discussed. The paper relies mainly on two results: (1) the rate at which the true posterior grows, and (2) the rate at which the KL distance between the posterior and the variational posterior grows. The theory provides a guideline for building prior distributions for Bayesian NN models, along with an assessment of the accuracy of the corresponding VB implementation.
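To make the abstract's claims concrete, the following is a minimal LaTeX sketch of the standard objects it refers to, under the usual mean-field setup; the notation (the family Q, the minimizer q-hat, the true density f_0, the network-induced density f_theta, and the data D_n) is ours, not necessarily the paper's.

\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}

% Mean-field variational posterior: the member of the factorized family Q
% closest in KL divergence to the true posterior pi(. | D_n).
\[
  \widehat{q} \;=\; \operatorname*{arg\,min}_{q \in \mathcal{Q}}
  \mathrm{KL}\!\bigl( q \,\Vert\, \pi(\cdot \mid D_n) \bigr),
  \qquad
  \mathcal{Q} \;=\; \Bigl\{ q : q(\theta) = \textstyle\prod_{j} q_j(\theta_j) \Bigr\}.
\]

% Hellinger distance between the true density f_0 and the density f_theta
% induced by network parameters theta (one common normalization convention).
\[
  d_H(f_0, f_\theta) \;=\;
  \Bigl( \tfrac{1}{2} \int \bigl( \sqrt{f_0} - \sqrt{f_\theta} \,\bigr)^2 \Bigr)^{1/2}.
\]

% Posterior consistency of the VP, in the sense the abstract describes:
% the variational posterior mass outside any fixed Hellinger neighborhood
% of f_0 vanishes as the sample size n grows.
\[
  \widehat{q}\bigl( \{\, \theta : d_H(f_0, f_\theta) > \varepsilon \,\} \bigr)
  \;\longrightarrow\; 0
  \quad \text{as } n \to \infty, \ \text{for every } \varepsilon > 0 .
\]

\end{document}

The abstract's two key rates then control, respectively, how fast the true posterior accumulates mass near f_0 and how large the KL gap between the true and variational posteriors can grow; together they bound the concentration behavior of the VP above.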