Bayesian inference of Stochastic reaction networks using Multifidelity Sequential Tempered Markov Chain Monte Carlo

Paper title

Bayesian inference of Stochastic reaction networks using Multifidelity Sequential Tempered Markov Chain Monte Carlo

Authors

Catanach, Thomas A., Vo, Huy D., Munsky, Brian

Abstract

Stochastic reaction network models are often used to explain and predict the dynamics of gene regulation in single cells. These models usually involve several parameters, such as the kinetic rates of chemical reactions, that are not directly measurable and must be inferred from experimental data. Bayesian inference provides a rigorous probabilistic framework for identifying these parameters by finding a posterior parameter distribution that captures their uncertainty. Traditional computational methods for solving inference problems, such as Markov Chain Monte Carlo methods based on the classical Metropolis-Hastings algorithm, involve numerous serial evaluations of the likelihood function, each of which requires an expensive forward solution of the chemical master equation (CME). We propose an alternative approach based on a multifidelity extension of the Sequential Tempered Markov Chain Monte Carlo (ST-MCMC) sampler. This algorithm is built upon Sequential Monte Carlo and solves the Bayesian inference problem by decomposing it into a sequence of efficiently solved subproblems that gradually increase model fidelity and the influence of the observed data. We reformulate the finite state projection (FSP) algorithm, a well-known method for solving the CME, to produce a hierarchy of surrogate master equations to be used in this multifidelity scheme. To determine the appropriate fidelity, we introduce a novel information-theoretic criterion that seeks to extract the most information about the ultimate Bayesian posterior from each model in the hierarchy without inducing significant bias. This novel sampling scheme is tested with high-performance computing resources using biologically relevant problems.
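As a rough illustration of how truncating the state space can yield surrogate master equations of increasing fidelity, the sketch below applies a basic finite state projection to a birth-death gene expression model (constant production rate k, degradation rate g per molecule). It is not the authors' reformulation: the model, the function names `fsp_birth_death` and `fsp_error`, and the use of uniformization as the time integrator are all assumptions made for this example. The probability mass that leaks out of the truncated state space serves as the FSP error, and it shrinks as the truncation grows.

```python
import math


def fsp_birth_death(k, g, n_states, t):
    """Hypothetical helper: FSP solution on states 0..n_states-1 at time t.

    Starts from zero molecules. Probability that leaves the truncated
    state space (a birth from the last state) is discarded, so the
    returned vector sums to at most 1.
    """
    lam = k + g * (n_states - 1)      # uniformization rate >= every exit rate
    p = [0.0] * n_states
    p[0] = 1.0
    weight = math.exp(-lam * t)       # Poisson(lam * t) weight for j = 0
    result = [weight * x for x in p]
    # Enough Poisson terms that the neglected tail is negligible.
    n_terms = int(lam * t + 10.0 * math.sqrt(lam * t) + 20)
    for j in range(1, n_terms + 1):
        q = [0.0] * n_states
        for n in range(n_states):
            exit_rate = k + g * n
            q[n] += p[n] * (1.0 - exit_rate / lam)   # no reaction fires
            if n + 1 < n_states:
                q[n + 1] += p[n] * k / lam           # birth stays inside
            if n > 0:
                q[n - 1] += p[n] * g * n / lam       # degradation
        p = q
        weight *= lam * t / j
        result = [r + weight * x for r, x in zip(result, p)]
    return result


def fsp_error(k, g, n_states, t):
    """Probability mass lost by the truncation (the FSP error)."""
    return 1.0 - sum(fsp_birth_death(k, g, n_states, t))


# A small hierarchy of surrogates: larger truncations cost more but lose less mass.
for size in (10, 20, 40):
    print(size, fsp_error(k=10.0, g=1.0, n_states=size, t=1.0))
```

In a multifidelity sampler, the cheaper truncations would be used in early stages and the larger ones later; how the paper chooses when to switch levels is governed by its information-theoretic criterion, which this sketch does not attempt to reproduce.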
