Paper title
An Equivalence Principle for the Spectrum of Random Inner-Product Kernel Matrices with Polynomial Scalings
Paper authors
Paper abstract
We investigate random matrices whose entries are obtained by applying a nonlinear kernel function to pairwise inner products between $n$ independent data vectors, drawn uniformly from the unit sphere in $\mathbb{R}^d$. This study is motivated by applications in machine learning and statistics, where these kernel random matrices and their spectral properties play significant roles. We establish the weak limit of the empirical spectral distribution of these matrices in a polynomial scaling regime, where $d, n \to \infty$ such that $n / d^\ell \to \kappa$, for some fixed $\ell \in \mathbb{N}$ and $\kappa \in (0, \infty)$. Our findings generalize an earlier result by Cheng and Singer, who examined the same model in the linear scaling regime (with $\ell = 1$). Our work reveals an equivalence principle: the spectrum of the random kernel matrix is asymptotically equivalent to that of a simpler matrix model, constructed as a linear combination of a (shifted) Wishart matrix and an independent matrix sampled from the Gaussian orthogonal ensemble. The aspect ratio of the Wishart matrix and the coefficients of the linear combination are determined by $\ell$ and the expansion of the kernel function in the orthogonal Hermite polynomial basis. Consequently, the limiting spectrum of the random kernel matrix can be characterized as the free additive convolution between a Marchenko-Pastur law and a semicircle law. We also extend our results to cases with data vectors sampled from isotropic Gaussian distributions instead of spherical distributions.
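To make the model in the abstract concrete, the following is a minimal numerical sketch of a random inner-product kernel matrix in the linear regime ($\ell = 1$). It is an illustration under stated assumptions, not the paper's construction: the entry normalization $f(\sqrt{d}\,\langle x_i, x_j\rangle)/\sqrt{n}$ with a zeroed diagonal, the choice of the normalized second Hermite polynomial as the kernel, and the helper names `sphere_samples`, `kernel_matrix`, and `he2` are all hypothetical choices made for this example.

```python
import numpy as np


def sphere_samples(n, d, rng):
    """Draw n points uniformly from the unit sphere in R^d
    by normalizing standard Gaussian vectors."""
    g = rng.standard_normal((n, d))
    return g / np.linalg.norm(g, axis=1, keepdims=True)


def kernel_matrix(x, f):
    """Apply f entrywise to rescaled pairwise inner products.

    ASSUMPTION: entries are f(sqrt(d) * <x_i, x_j>) / sqrt(n) with the
    diagonal set to zero; the paper's exact normalization may differ.
    """
    n, d = x.shape
    gram = np.sqrt(d) * (x @ x.T)
    a = f(gram) / np.sqrt(n)
    np.fill_diagonal(a, 0.0)
    return a


def he2(t):
    """Normalized second (probabilists') Hermite polynomial,
    used here only as an example of a nonlinear kernel."""
    return (t**2 - 1.0) / np.sqrt(2.0)


rng = np.random.default_rng(0)
n, d = 600, 300  # ell = 1, so n / d**1 = 2 plays the role of kappa
x = sphere_samples(n, d, rng)
a = kernel_matrix(x, he2)

# The empirical spectral distribution is the histogram of these eigenvalues.
eigs = np.linalg.eigvalsh(a)
```

A histogram of `eigs` (for example via `np.histogram(eigs, bins=50, density=True)`) gives the empirical spectral distribution whose weak limit the abstract describes. Larger `n` and `d` at a fixed ratio should give a smoother picture.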