Paper title

ScreeNOT: Exact MSE-Optimal Singular Value Thresholding in Correlated Noise

Authors

David L. Donoho, Matan Gavish, Elad Romanov

Abstract

We derive a formula for optimal hard thresholding of the singular value decomposition in the presence of correlated additive noise; although it nominally involves unobservables, we show how to apply it even where the noise covariance structure is not known a priori or is not independently estimable. The proposed method, which we call ScreeNOT, is a mathematically solid alternative to Cattell's ever-popular but vague Scree Plot heuristic from 1966. ScreeNOT has a surprising oracle property: it typically achieves exactly, in large finite samples, the lowest possible MSE for matrix recovery on each given problem instance; that is, the specific threshold it selects gives exactly the smallest achievable MSE loss among all possible threshold choices for that noisy dataset and that unknown underlying true low-rank model. The method is computationally efficient and robust against perturbations of the underlying covariance structure. Our results depend on the assumption that the singular values of the noise have a limiting empirical distribution of compact support; this model, which is standard in random matrix theory, is satisfied by many models exhibiting either cross-row correlation structure or cross-column correlation structure, and also by many situations where there is inter-element correlation structure. Simulations demonstrate the effectiveness of the method even at moderate matrix sizes. The paper is supplemented by ready-to-use software packages implementing the proposed algorithm: package ScreeNOT in Python (via PyPI) and R (via CRAN).
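
As a rough illustration of what hard thresholding of the SVD and the oracle benchmark mean, the following NumPy sketch may help. It is not the ScreeNOT algorithm and does not use the ScreeNOT package. The function names hard_threshold_svd and oracle_rank, the matrix sizes, the signal strengths, and the column-scaled noise model are all assumptions made for this example. The oracle below reads the true matrix X, which is unobservable in practice; according to the abstract, ScreeNOT aims to match this benchmark from the noisy data alone.

```python
import numpy as np


def hard_threshold_svd(Y, theta):
    """Keep only the singular components of Y whose singular value exceeds theta."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    keep = s > theta
    # Scale the kept left singular vectors by their singular values, then recombine.
    return (U[:, keep] * s[keep]) @ Vt[keep, :]


def oracle_rank(Y, X):
    """Oracle benchmark (needs the true X, so it is not usable on real data).

    Any hard threshold keeps the top r singular components of Y for some r,
    so minimizing the squared error over thresholds is the same as minimizing
    it over r = 0, 1, ..., min(Y.shape).
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    estimate = np.zeros_like(Y, dtype=float)
    losses = [np.sum((estimate - X) ** 2)]  # r = 0: keep nothing
    for i in range(len(s)):
        estimate = estimate + s[i] * np.outer(U[:, i], Vt[i, :])
        losses.append(np.sum((estimate - X) ** 2))
    r = int(np.argmin(losses))
    return r, float(losses[r])


# Illustrative data (all values are assumptions): a rank-2 signal plus noise
# whose columns have unequal variances, one simple form of correlated noise.
rng = np.random.default_rng(0)
n, p = 200, 100
U0, _ = np.linalg.qr(rng.standard_normal((n, 2)))
V0, _ = np.linalg.qr(rng.standard_normal((p, 2)))
X = U0 @ np.diag([10.0, 6.0]) @ V0.T
col_scale = rng.uniform(0.5, 1.5, size=p)
Z = rng.standard_normal((n, p)) * col_scale / np.sqrt(n)
Y = X + Z

r, loss = oracle_rank(Y, X)
print("oracle number of kept components:", r)
print("oracle squared error:", loss)
```

Any threshold lying strictly between the r-th and (r+1)-th largest singular values of Y reproduces the oracle estimate through hard_threshold_svd, which is why the search runs over ranks rather than over a continuum of threshold values.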
