Paper title
Precise Asymptotics for Spectral Methods in Mixed Generalized Linear Models
Paper authors
Abstract
In a mixed generalized linear model, the objective is to learn multiple signals from unlabeled observations: each sample comes from exactly one signal, but it is not known which one. We consider the prototypical problem of estimating two statistically independent signals in a mixed generalized linear model with Gaussian covariates. Spectral methods are a popular class of estimators which output the top two eigenvectors of a suitable data-dependent matrix. However, despite the wide applicability, their design is still obtained via heuristic considerations, and the number of samples $n$ needed to guarantee recovery is super-linear in the signal dimension $d$. In this paper, we develop exact asymptotics on spectral methods in the challenging proportional regime in which $n, d$ grow large and their ratio converges to a finite constant. By doing so, we are able to optimize the design of the spectral method, and combine it with a simple linear estimator, in order to minimize the estimation error. Our characterization exploits a mix of tools from random matrices, free probability and the theory of approximate message passing algorithms. Numerical simulations for mixed linear regression and phase retrieval demonstrate the advantage enabled by our analysis over existing designs of spectral methods.
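As a rough illustration of the kind of estimator the abstract describes, the following sketch builds a data-dependent matrix $D = \frac{1}{n}\sum_{i} T(y_i)\, x_i x_i^\top$ for noisy mixed linear regression and returns its top two eigenvectors. Everything specific here is an assumption made for illustration and is not taken from the paper: the preprocessing function $T(y) = y^2 - 1$, the mixing proportion `alpha`, the noise level, the problem sizes, and the function names `simulate_mixed_linear_regression` and `spectral_estimate`. In particular, it is a heuristic choice of $T$, not the optimized design the paper derives, and it omits the combination with a linear estimator.

```python
import numpy as np


def simulate_mixed_linear_regression(n, d, alpha, noise_std, rng):
    """Draw n samples; each response depends on exactly one of two signals.

    Assumed setup (illustrative): unit-norm signals, standard Gaussian
    covariates, and a hidden label that picks signal 0 with probability alpha.
    """
    beta = rng.standard_normal((2, d))
    beta /= np.linalg.norm(beta, axis=1, keepdims=True)  # two independent unit-norm signals
    X = rng.standard_normal((n, d))                      # Gaussian covariates
    picks_first = rng.random(n) < alpha                  # hidden labels, never shown to the estimator
    active = np.where(picks_first[:, None], beta[0], beta[1])  # (n, d): signal used by each sample
    y = np.sum(X * active, axis=1) + noise_std * rng.standard_normal(n)
    return X, y, beta


def spectral_estimate(X, y, preprocess):
    """Return the top two eigenvectors of D = (1/n) * sum_i T(y_i) x_i x_i^T."""
    n = X.shape[0]
    t = preprocess(y)
    D = (X.T * t) @ X / n                 # scales sample i's outer product by T(y_i)
    _, eigvecs = np.linalg.eigh(D)        # eigh returns eigenvalues in ascending order
    return eigvecs[:, -2:][:, ::-1]       # columns ordered from largest eigenvalue down


rng = np.random.default_rng(0)
X, y, beta = simulate_mixed_linear_regression(
    n=20000, d=20, alpha=0.6, noise_std=0.1, rng=rng
)
V = spectral_estimate(X, y, preprocess=lambda v: v**2 - 1.0)  # heuristic T, an assumption

# Squared norm of each true signal's projection onto the estimated 2-D subspace
# (a value in [0, 1]; closer to 1 means the signal is better captured).
subspace_overlap = np.sum((beta @ V) ** 2, axis=1)
print(subspace_overlap)
```

The sketch uses many more samples than dimensions ($n/d = 1000$) so that the heuristic works comfortably. The regime the abstract targets is the harder one in which $n/d$ stays a finite constant, where the choice of $T$ matters much more.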