Paper title
Bi-stochastically normalized graph Laplacian: convergence to manifold Laplacian and robustness to outlier noise
Paper authors
Paper abstract
Bi-stochastic normalization provides an alternative normalization of graph Laplacians in graph-based data analysis and can be computed efficiently by Sinkhorn-Knopp (SK) iterations. This paper proves the convergence of the bi-stochastically normalized graph Laplacian to the manifold (weighted-)Laplacian with rates, when $n$ data points are i.i.d. sampled from a general $d$-dimensional manifold embedded in a possibly high-dimensional space. Under a certain joint limit of $n \to \infty$ and kernel bandwidth $\epsilon \to 0$, the point-wise convergence rate of the graph Laplacian operator (under 2-norm) is proved to be $O(n^{-1/(d/2+3)})$ at finite large $n$ up to log factors, achieved at the scaling of $\epsilon \sim n^{-1/(d/2+3)}$. When the manifold data are corrupted by outlier noise, we theoretically prove the graph Laplacian point-wise consistency, which matches the rate for clean manifold data plus an additional term proportional to the boundedness of the inner-products of the noise vectors among themselves and with data vectors. Motivated by our analysis, which suggests that an approximate rather than exact bi-stochastic normalization will achieve the same consistency rate, we propose an approximate and constrained matrix scaling problem that can be solved by SK iterations with early termination. Numerical experiments support our theoretical results and show the robustness of the bi-stochastically normalized graph Laplacian to high-dimensional outlier noise.
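To make the SK iteration with early termination concrete, the following is a minimal sketch and not the paper's exact algorithm or its constrained scaling problem. It assumes a Gaussian kernel of the form $\exp(-\|x_i - x_j\|^2/(4\epsilon))$ with the diagonal kept, points sampled uniformly on a unit circle, and a stopping rule based on the largest row-sum deviation; the function names `gaussian_kernel` and `sinkhorn_knopp` and all parameter values are illustrative choices.

```python
import numpy as np


def gaussian_kernel(X, eps):
    """Gaussian affinity matrix; the 1/(4*eps) scaling is an assumed convention."""
    sq = np.sum(X ** 2, axis=1)
    # Pairwise squared distances, clipped at zero to absorb round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    return np.exp(-d2 / (4.0 * eps))


def sinkhorn_knopp(K, tol=1e-6, max_iter=5000):
    """Alternately rescale rows and columns of K until row sums are within tol of 1.

    Returns the scaled matrix diag(u) K diag(v), the final row-sum error,
    and the number of iterations performed. A loose tol acts as early termination.
    """
    n = K.shape[0]
    u = np.ones(n)
    v = np.ones(n)
    err = np.inf
    n_iter = 0
    for n_iter in range(1, max_iter + 1):
        u = 1.0 / (K @ v)      # row sums of diag(u) K diag(v) become 1
        v = 1.0 / (K.T @ u)    # column sums become 1, rows drift slightly
        err = np.max(np.abs(u * (K @ v) - 1.0))
        if err < tol:
            break
    W = u[:, None] * K * v[None, :]
    return W, err, n_iter


# Illustrative data: 200 points on a unit circle (a 1-dimensional manifold).
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=200)
X = np.column_stack([np.cos(theta), np.sin(theta)])
K = gaussian_kernel(X, eps=0.1)

# Early termination (loose tolerance) versus a near-exact scaling.
W_loose, err_loose, iters_loose = sinkhorn_knopp(K, tol=1e-3)
W_tight, err_tight, iters_tight = sinkhorn_knopp(K, tol=1e-8)
print(iters_loose, err_loose)
print(iters_tight, err_tight)
```

The loose tolerance stops after fewer sweeps and leaves the matrix only approximately bi-stochastic, which is the regime the abstract says is sufficient for the same consistency rate. The tolerance values used here are arbitrary and are not taken from the paper.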