Paper Title
Rapid Detection of Hot-spot by Tensor Decomposition with Application to Weekly Gonorrhea Data
Paper Authors
Paper Abstract
In many bio-surveillance and healthcare applications, data are measured repeatedly over time (say, daily, weekly, or monthly) at many spatial locations. In these applications, we are typically interested in detecting hot-spots, defined as structured outliers that are sparse over the spatial domain but persistent over time. In this paper, we propose a tensor decomposition method to detect when and where hot-spots occur. The proposed method represents the observed raw data as a three-dimensional tensor that includes a circular time dimension for daily/weekly/monthly patterns, and then decomposes the tensor into three components: a smooth global trend, local hot-spots, and residuals. A combination of LASSO and fused LASSO is used to estimate the model parameters, and a CUSUM procedure is applied to detect when and where the hot-spots might occur. The usefulness of the proposed methodology is validated through numerical simulation and a real-world dataset of the weekly number of gonorrhea cases from $2006$ to $2018$ for $50$ states in the United States.
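To make the detection step concrete, the following is a minimal sketch of a one-sided CUSUM applied independently to each spatial location. It is not the authors' implementation: it assumes the residuals left after removing the fitted global trend are already available as a standardized array of shape (locations, time points), and the function name `cusum_first_alarm`, the allowance `k`, and the threshold `h` are hypothetical choices made for illustration only.

```python
import numpy as np


def cusum_first_alarm(residuals, k=0.5, h=5.0):
    """One-sided (upper) CUSUM run separately for each location.

    residuals: array of shape (n_locations, n_times), assumed standardized.
    k: allowance subtracted at every step (illustrative assumption).
    h: alarm threshold (illustrative assumption).
    Returns an int array with the first time index at which each location's
    statistic exceeds h, or -1 if it never does.
    """
    residuals = np.asarray(residuals, dtype=float)
    n_loc, n_time = residuals.shape
    stat = np.zeros(n_loc)
    alarm = np.full(n_loc, -1, dtype=int)
    for t in range(n_time):
        # Recursion W_t = max(0, W_{t-1} + x_t - k), vectorized over locations.
        stat = np.maximum(0.0, stat + residuals[:, t] - k)
        # Record only the first crossing for each location.
        newly = (stat > h) & (alarm < 0)
        alarm[newly] = t
    return alarm


# Synthetic example: 3 locations, 20 time points, a persistent shift of
# size 2 at location 1 starting at time index 10.
demo = np.zeros((3, 20))
demo[1, 10:] = 2.0
print(cusum_first_alarm(demo).tolist())  # [-1, 13, -1]
```

The returned index answers "when" and the position in the array answers "where". In this toy input the shift accumulates by 1.5 per step, so the statistic first exceeds 5 three steps after the change begins.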