Paper Title
Sparse Functional Principal Component Analysis in High Dimensions
Authors
Abstract
Functional principal component analysis (FPCA) is a fundamental tool that has attracted increasing attention in recent decades, but existing methods are restricted to data with a single random function or a small, fixed number of them (much smaller than the sample size $n$). In this work, we focus on high-dimensional functional processes where the number of random functions $p$ is comparable to, or even much larger than, $n$. Such data are ubiquitous in fields such as neuroimaging analysis and cannot be properly modeled by existing methods. We propose a new algorithm, called sparse FPCA, which models principal eigenfunctions effectively under sensible sparsity regimes. While sparsity assumptions are standard in multivariate statistics, they have not been investigated in this more complex setting, where not only is $p$ large but each variable is itself an intrinsically infinite-dimensional process. The sparsity structure motivates a thresholding rule that is easy to compute without nonparametric smoothing, by exploiting the relationship between univariate orthonormal basis expansions and multivariate Karhunen-Loève (K-L) representations. We investigate the theoretical properties of the resulting estimators and illustrate their performance with simulated and real data examples.
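To make the thresholding idea concrete, here is a minimal sketch in Python. It is not the authors' algorithm, since the abstract does not specify it. It only illustrates the general pattern the abstract describes: expand each random function on a univariate orthonormal basis, keep the functions whose basis coefficients carry enough variance, and run an eigendecomposition on the retained coefficients. The cosine basis, the energy-based threshold, the midpoint grid, and the names `cosine_basis` and `sparse_fpca_sketch` are all assumptions made for illustration.

```python
import numpy as np


def cosine_basis(t, n_basis):
    """Orthonormal cosine basis on [0, 1], evaluated at t. Shape: (n_basis, len(t))."""
    rows = [np.ones_like(t)]
    for k in range(1, n_basis):
        rows.append(np.sqrt(2.0) * np.cos(k * np.pi * t))
    return np.vstack(rows)


def sparse_fpca_sketch(X, t, n_basis, threshold, n_components=1):
    """Illustrative thresholded FPCA (assumed scheme, not the paper's exact rule).

    X: array of shape (n, p, m) holding n samples of p functions observed on
       a common, equally spaced grid t of length m in [0, 1].
    Returns the kept function indices, the leading eigenvalues, and the
    eigenfunctions evaluated on t, with shape (n_components, p, m).
    """
    n, p, m = X.shape
    B = cosine_basis(t, n_basis)                      # (n_basis, m)
    dt = t[1] - t[0]
    Xc = X - X.mean(axis=0, keepdims=True)            # center each function
    # Basis coefficients via a Riemann sum: scores[i, j, k] ~ integral of X_ij * phi_k
    scores = np.einsum("ijm,km->ijk", Xc, B) * dt
    # Energy of function j: total variance of its basis coefficients
    energy = scores.var(axis=0).sum(axis=1)           # (p,)
    keep = np.flatnonzero(energy > threshold)         # hard-thresholding step
    # Eigendecomposition restricted to the retained functions
    Z = scores[:, keep, :].reshape(n, -1)
    evals, evecs = np.linalg.eigh(Z.T @ Z / n)
    order = np.argsort(evals)[::-1][:n_components]
    # Map coefficient eigenvectors back to functions; dropped functions stay zero
    coef = np.zeros((n_components, p, n_basis))
    coef[:, keep, :] = evecs[:, order].T.reshape(n_components, len(keep), n_basis)
    eigfuns = np.einsum("cjk,km->cjm", coef, B)
    return keep, evals[order], eigfuns


# Toy data: p = 30 functions, only the first 3 carry signal
rng = np.random.default_rng(0)
n, p, m = 200, 30, 50
t = (np.arange(m) + 0.5) / m                          # midpoint grid on [0, 1]
phi1 = np.sqrt(2.0) * np.cos(np.pi * t)
xi = rng.normal(scale=2.0, size=n)                    # shared principal score
X = rng.normal(scale=0.1, size=(n, p, m))             # observation noise
X[:, :3, :] += xi[:, None, None] * phi1[None, None, :] / np.sqrt(3.0)

keep, evals, eigfuns = sparse_fpca_sketch(X, t, n_basis=5, threshold=0.1)
print("kept functions:", keep)
```

In this toy setting the threshold is chosen by hand, well above the noise energy and well below the signal energy. A data-driven choice of the threshold and of the basis truncation level is part of what a full method would have to specify, and the sketch leaves both open.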