Paper title
Bayesian space-time gap filling for inference on extreme hot-spots: an application to Red Sea surface temperatures
Authors
Abstract
We develop a method for probabilistic prediction of extreme value hot-spots in a spatio-temporal framework, tailored to big datasets containing substantial gaps. In this setting, direct calculation of summaries from data, such as the minimum over a space-time domain, is not possible. To obtain predictive distributions for such cluster summaries, we propose a two-step approach. We first model marginal distributions with a focus on accurate modeling of the right tail and then, after transforming the data to a standard Gaussian scale, we estimate a Gaussian space-time dependence model defined locally in the time domain for the space-time subregions where we want to predict. In the first step, we detrend the mean and standard deviation of the data and fit a spatially resolved generalized Pareto distribution to apply a correction of the upper tail. To ensure spatial smoothness of the estimated trends, we either pool data using nearest-neighbor techniques, or apply generalized additive regression modeling. To cope with the high space-time resolution of the data, the local Gaussian models use a Markov representation of the Matérn correlation function based on the stochastic partial differential equations (SPDE) approach. In the second step, these local models are fitted in a Bayesian framework through the integrated nested Laplace approximation implemented in R-INLA. Finally, posterior samples are generated to provide statistical inference through Monte Carlo estimation. Motivated by the 2019 Extreme Value Analysis data challenge, we illustrate our approach by predicting the distribution of local space-time minima in anomalies of Red Sea surface temperatures, using a gridded dataset (11315 days, 16703 pixels) with artificially generated gaps. In particular, we show the improved performance of our two-step approach over a purely Gaussian model without tail transformations.
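The marginal step described in the abstract (detrend, correct the upper tail with a generalized Pareto distribution, then move to a standard Gaussian scale) can be illustrated with a minimal single-site sketch. This is not the authors' implementation: the function names, the 0.95 threshold probability, the constant mean and standard deviation, and the method-of-moments estimator are all assumptions made for illustration, whereas the paper uses spatially smoothed trends and a spatially resolved fit.

```python
import bisect
import math
import random
from statistics import NormalDist, fmean, pstdev, pvariance


def fit_gpd_moments(excesses):
    """Method-of-moments GPD fit (assumed estimator; needs true shape < 0.5)."""
    m = fmean(excesses)
    v = pvariance(excesses)
    ratio = m * m / v              # equals 1 - 2 * shape for a GPD
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * m * (1.0 + ratio)
    return shape, scale


def gpd_survival(y, shape, scale):
    """P(Y > y) for a GPD-distributed excess y >= 0."""
    if abs(shape) < 1e-9:          # exponential limit as shape -> 0
        return math.exp(-y / scale)
    base = 1.0 + shape * y / scale
    if base <= 0.0:                # beyond the finite endpoint (shape < 0)
        return 0.0
    return base ** (-1.0 / shape)


def make_gaussian_transform(data, threshold_prob=0.95):
    """Return a function mapping raw values to a standard Gaussian scale.

    Below the threshold the empirical CDF is used; above it, a GPD tail.
    A constant mean and standard deviation stand in for the trend model.
    """
    mu, sd = fmean(data), pstdev(data)
    resid = sorted((x - mu) / sd for x in data)
    n = len(resid)
    u = resid[int(threshold_prob * n)]            # empirical threshold
    shape, scale = fit_gpd_moments([r - u for r in resid if r > u])
    # Exceedance probability chosen so the CDF is continuous at u.
    p_u = 1.0 - bisect.bisect_right(resid, u) / (n + 1)
    std_normal = NormalDist()

    def transform(x):
        r = (x - mu) / sd
        if r > u:
            cdf = 1.0 - p_u * gpd_survival(r - u, shape, scale)
        else:
            cdf = bisect.bisect_right(resid, r) / (n + 1)
        cdf = min(max(cdf, 1e-12), 1.0 - 1e-12)   # keep inv_cdf finite
        return std_normal.inv_cdf(cdf)

    return transform, shape, scale


if __name__ == "__main__":
    random.seed(1)
    # Synthetic right-skewed "anomalies", purely for demonstration.
    demo = [random.gauss(0, 1) + random.expovariate(1.0) for _ in range(2000)]
    to_gauss, xi, sigma = make_gaussian_transform(demo)
    print("shape:", xi, "scale:", sigma, "z(max):", to_gauss(max(demo)))
```

In the paper's workflow, the transformed values would then feed the local Gaussian space-time model; here the transform is only applied pointwise to show how the tail correction enters the change of scale.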