Paper title
Efficient Derivative-free Bayesian Inference for Large-Scale Inverse Problems
Paper authors
Paper abstract
We consider Bayesian inference for large-scale inverse problems, where computational challenges arise from the need for repeated evaluations of an expensive forward model. This renders most Markov chain Monte Carlo approaches infeasible, since they typically require $O(10^4)$ model runs, or more. Moreover, the forward model is often given as a black box or is impractical to differentiate. Therefore, derivative-free algorithms are highly desirable. We propose a framework, built on Kalman methodology, to efficiently perform Bayesian inference in such inverse problems. The basic method is based on an approximation of the filtering distribution of a novel mean-field dynamical system into which the inverse problem is embedded as an observation operator. Theoretical properties of the mean-field model are established for linear inverse problems, demonstrating that the desired Bayesian posterior is given by the steady state of the law of the filtering distribution of the mean-field dynamical system, and proving exponential convergence to it. This suggests that, for nonlinear problems that are close to Gaussian, sequentially computing this law provides the basis for efficient iterative methods to approximate the Bayesian posterior. Ensemble methods are applied to obtain interacting particle system approximations of the filtering distribution of the mean-field model, and practical strategies to further reduce the computational and memory cost of the methodology are presented, including low-rank approximation and a bi-fidelity approach. The effectiveness of the framework is demonstrated in several numerical experiments, including proof-of-concept linear/nonlinear examples and two large-scale applications: learning permeability parameters in subsurface flow, and learning subgrid-scale parameters in a global climate model from time-averaged statistics.
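The linear-case claim, that the posterior arises as the steady state of a filtering recursion, can be illustrated with a minimal sketch. The abstract does not specify the dynamical system, so everything concrete here is an assumption made for illustration: the artificial dynamics (a random walk whose noise covariance equals the current covariance), the augmented observation that stacks the data `y` with the prior mean `r0`, the factor of 2 on the observation noise, and the function names `kalman_inversion_linear` and `exact_posterior`. The sketch also works directly with the Gaussian mean and covariance rather than with an ensemble of particles.

```python
import numpy as np


def kalman_inversion_linear(G, y, Sigma_eta, r0, Sigma0, n_iter=40):
    """Iterate a Gaussian Kalman filter on an assumed artificial dynamical system.

    Assumed (illustrative) system for the linear problem y = G theta + eta:
        theta_{n+1} = theta_n + omega,   omega ~ N(0, C_n)
        x_{n+1}     = F theta_{n+1} + nu, nu ~ N(0, 2 * blockdiag(Sigma_eta, Sigma0))
    where F = [G; I] and the fixed "observation" is x = [y; r0].
    """
    k, d = y.size, r0.size
    F = np.vstack([G, np.eye(d)])          # augmented observation operator
    x = np.concatenate([y, r0])            # augmented data: measurements and prior mean
    Sigma_nu = np.zeros((k + d, k + d))
    Sigma_nu[:k, :k] = 2.0 * Sigma_eta
    Sigma_nu[k:, k:] = 2.0 * Sigma0

    m, C = r0.copy(), Sigma0.copy()        # start from the prior
    for _ in range(n_iter):
        # Prediction step: mean unchanged, covariance C + Sigma_omega = 2 C.
        m_hat, C_hat = m, 2.0 * C
        # Analysis step: standard Kalman update against the augmented data.
        S = F @ C_hat @ F.T + Sigma_nu
        K = np.linalg.solve(S, F @ C_hat).T  # K = C_hat F^T S^{-1} (S, C_hat symmetric)
        m = m_hat + K @ (x - F @ m_hat)
        C = C_hat - K @ F @ C_hat
        C = 0.5 * (C + C.T)                # re-symmetrize against round-off
    return m, C


def exact_posterior(G, y, Sigma_eta, r0, Sigma0):
    """Closed-form Gaussian posterior for a linear forward map and Gaussian prior."""
    precision = G.T @ np.linalg.solve(Sigma_eta, G) + np.linalg.inv(Sigma0)
    C = np.linalg.inv(precision)
    m = C @ (G.T @ np.linalg.solve(Sigma_eta, y) + np.linalg.solve(Sigma0, r0))
    return m, C


if __name__ == "__main__":
    G = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    y = np.array([3.0, 7.0, 10.0])
    Sigma_eta, r0, Sigma0 = 0.1 * np.eye(3), np.zeros(2), np.eye(2)
    m, C = kalman_inversion_linear(G, y, Sigma_eta, r0, Sigma0)
    m_ref, C_ref = exact_posterior(G, y, Sigma_eta, r0, Sigma0)
    print("max mean error:", np.abs(m - m_ref).max())
    print("max covariance error:", np.abs(C - C_ref).max())
```

Under these assumed noise choices, each iteration averages the current precision matrix with the posterior precision, so the gap to the posterior halves at every step, which is one concrete instance of exponential convergence. The iteration only ever applies `G` and never differentiates it; an ensemble version would replace the exact covariances with sample estimates from forward-model evaluations.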