Scalable Approximate Bayesian Computation for Growing Network Models via Extrapolated and Sampled Summaries

Paper title

Scalable Approximate Bayesian Computation for Growing Network Models via Extrapolated and Sampled Summaries

Authors

Louis Raynal, Sixing Chen, Antonietta Mira, Jukka-Pekka Onnela

Abstract

Approximate Bayesian computation (ABC) is a simulation-based likelihood-free method applicable to both model selection and parameter estimation. ABC parameter estimation requires the ability to forward simulate datasets from a candidate model, but because the sizes of the observed and simulated datasets usually need to match, this can be computationally expensive. Additionally, since ABC inference is based on comparisons of summary statistics computed on the observed and simulated data, using computationally expensive summary statistics can lead to further losses in efficiency. ABC has recently been applied to the family of mechanistic network models, an area that has traditionally lacked tools for inference and model choice. Mechanistic models of network growth repeatedly add nodes to a network until it reaches the size of the observed network, which may be of the order of millions of nodes. With ABC, this process can quickly become computationally prohibitive due to the resource intensive nature of network simulations and evaluation of summary statistics. We propose two methodological developments to enable the use of ABC for inference in models for large growing networks. First, to save time needed for forward simulating model realizations, we propose a procedure to extrapolate (via both least squares and Gaussian processes) summary statistics from small to large networks. Second, to reduce computation time for evaluating summary statistics, we use sample-based rather than census-based summary statistics. We show that the ABC posterior obtained through this approach, which adds two additional layers of approximation to the standard ABC, is similar to a classic ABC posterior. Although we deal with growing network models, both extrapolated summaries and sampled summaries are expected to be relevant in other ABC settings where the data are generated incrementally.
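
The two approximations named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the toy growth model, the parameter `p_pref`, the choice of summaries (maximum degree and share of degree-1 nodes), the log-log regression, and all function names are assumptions made for illustration. Only the least-squares variant of the extrapolation is shown; the Gaussian process variant and the ABC accept/reject step are omitted.

```python
import math
import random


def grow_degrees(checkpoints, p_pref, rng):
    """Grow a toy network node by node; snapshot the degree list at each checkpoint size.

    Each new node links to one existing node, chosen proportionally to degree
    with probability p_pref and uniformly otherwise (hypothetical model).
    Checkpoint sizes must be at least 3.
    """
    wanted = set(checkpoints)
    degrees = [1, 1]      # start from one edge between nodes 0 and 1
    endpoints = [0, 1]    # all edge endpoints; a uniform draw is degree-proportional
    snapshots = {}
    for new in range(2, max(wanted)):
        if rng.random() < p_pref:
            target = rng.choice(endpoints)
        else:
            target = rng.randrange(new)
        degrees[target] += 1
        degrees.append(1)
        endpoints.extend((target, new))
        if len(degrees) in wanted:
            snapshots[len(degrees)] = list(degrees)
    return snapshots


def sampled_leaf_fraction(degrees, sample_size, rng):
    """Sample-based summary: share of degree-1 nodes in a random node sample."""
    k = min(sample_size, len(degrees))
    sample = rng.sample(range(len(degrees)), k)
    return sum(degrees[i] == 1 for i in sample) / k


def least_squares_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


rng = random.Random(0)
small_sizes = [500, 1000, 2000, 4000]   # sizes that are actually simulated
target_size = 100_000                   # assumed size of the observed network
snaps = grow_degrees(small_sizes, p_pref=0.8, rng=rng)

# Extrapolated summary: regress log(max degree) on log(n), then predict at target_size.
log_n = [math.log(n) for n in small_sizes]
log_max = [math.log(max(snaps[n])) for n in small_sizes]
a, b = least_squares_line(log_n, log_max)
extrapolated_max_degree = math.exp(a + b * math.log(target_size))

# Sampled summary: inspect 300 nodes instead of every node in the network.
leaf_fraction = sampled_leaf_fraction(snaps[4000], sample_size=300, rng=rng)

print(extrapolated_max_degree, leaf_fraction)
```

In an ABC run, summaries obtained this way for each simulated parameter value would be compared with the same summaries computed on the observed network. The simulation here stops at 4,000 nodes rather than 100,000, which is where the time saving would come from.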
