论文标题
复杂网络分析的拉普拉斯伪变量的对角线的近似
Approximation of the Diagonal of a Laplacian's Pseudoinverse for Complex Network Analysis
论文作者
论文摘要
在许多应用程序中,大量图数据集的无处不在需要快速算法来从这些数据中提取知识。在这里,我们以三种电气措施进行了激励,以分析大型小世界图$ g =(v,e)$ - 即,在$ o(\ log | v |)$中直径的图形,这些图在复杂的网络分析中很丰富。从计算的角度来看,这三个措施的共同点是它们的关键组成部分是laplacian的伪源的对角线,即$ l^\匕首$。但是,通过伪插入,计算diag $(l^\匕首)$与密度矩阵乘法一样昂贵 - 实践中的标准工具甚至需要立方时间。此外,伪为需要二次空间 - 对于大图几乎不可行。诉诸于近似值,例如,使用约翰逊 - 林登斯特劳斯变换,需要$ o(\ log | v | /ε^2)$ laplacian线性系统来保证相对误差,这对于大输入仍然非常昂贵。 在本文中,我们提出了一种新型的近似算法,该算法仅需要一个拉普拉斯线性系统的解决方案。其余部分纯粹是组合的 - 主要是对跨树木进行采样,我们通过有效的电阻与diag $(l^\匕首)$相关。对于小世界网络,我们的算法以高概率获得了$ \ pmε$ - approximation,在$ | e | $的时间几乎是线性的,又有二次的$ 1 /ε$。我们算法的另一个积极方面是由于独立采样而其平行性质。因此,我们提供了我们算法的两个并行实现:一个使用OpenMP,一个MPI + OpenMP。在我们针对最新技术的实验中,我们的算法(i)产生更准确的结果,(ii)更快,更高的记忆效率,(iii)获得了良好的并行加速,尤其是在分布式设置中。
The ubiquity of massive graph data sets in numerous applications requires fast algorithms for extracting knowledge from these data. We are motivated here by three electrical measures for the analysis of large small-world graphs $G = (V, E)$ -- i.e., graphs with diameter in $O(\log |V|)$, which are abundant in complex network analysis. From a computational point of view, the three measures have in common that their crucial component is the diagonal of the graph Laplacian's pseudoinverse, $L^\dagger$. Computing diag$(L^\dagger)$ exactly by pseudoinversion, however, is as expensive as dense matrix multiplication -- and the standard tools in practice even require cubic time. Moreover, the pseudoinverse requires quadratic space -- hardly feasible for large graphs. Resorting to approximation by, e.g., using the Johnson-Lindenstrauss transform, requires the solution of $O(\log |V| / ε^2)$ Laplacian linear systems to guarantee a relative error, which is still very expensive for large inputs. In this paper, we present a novel approximation algorithm that requires the solution of only one Laplacian linear system. The remaining parts are purely combinatorial -- mainly sampling uniform spanning trees, which we relate to diag$(L^\dagger)$ via effective resistances. For small-world networks, our algorithm obtains a $\pm ε$-approximation with high probability, in a time that is nearly-linear in $|E|$ and quadratic in $1 / ε$. Another positive aspect of our algorithm is its parallel nature due to independent sampling. We thus provide two parallel implementations of our algorithm: one using OpenMP, one MPI + OpenMP. In our experiments against the state of the art, our algorithm (i) yields more accurate results, (ii) is much faster and more memory-efficient, and (iii) obtains good parallel speedups, in particular in the distributed setting.