论文标题
随机块模型中的非凸vex精确社区恢复
Non-Convex Exact Community Recovery in Stochastic Block Model
论文作者
论文摘要
根据随机块模型(SBM)生成的图表中的社区检测最近受到了很多关注。在本文中,我们专注于二进制对称的SBM - 其中$ n $顶点的图是通过将顶点首先分配到两个相等大小的社区中随机生成的,然后将每对顶点与概率连接起来,取决于其社区成员资格 - 并研究相关的精确社区恢复问题。尽管该问题的最大样子表述是非凸和离散的,但我们建议使用一种称为投影功率迭代的流行迭代方法对其进行处理。为了确保该方法的快速收敛性,我们使用另一个称为正交迭代的迭代方法生成的点初始化它,这是计算对称矩阵的不变子空间的经典方法。我们表明,在该问题的对数稀疏状态中,提出的两阶段方法的可能性很高,可以将两个社区恰好恢复到$ \ Mathcal {o}中的信息理论极限(N \ log^2n/\ log \ log \ log \ log \ log \ log n)$时间,这是与现有的现有状态方法相同的竞争,具有相同的现有方法的竞争。我们还对合成和真实数据集进行了数值实验,以证明我们提出的方法的功效并补充我们的理论发展。
Community detection in graphs that are generated according to stochastic block models (SBMs) has received much attention lately. In this paper, we focus on the binary symmetric SBM -- in which a graph of $n$ vertices is randomly generated by first partitioning the vertices into two equal-sized communities and then connecting each pair of vertices with probability that depends on their community memberships -- and study the associated exact community recovery problem. Although the maximum-likelihood formulation of the problem is non-convex and discrete, we propose to tackle it using a popular iterative method called projected power iterations. To ensure fast convergence of the method, we initialize it using a point that is generated by another iterative method called orthogonal iterations, which is a classic method for computing invariant subspaces of a symmetric matrix. We show that in the logarithmic sparsity regime of the problem, with high probability the proposed two-stage method can exactly recover the two communities down to the information-theoretic limit in $\mathcal{O}(n\log^2n/\log\log n)$ time, which is competitive with a host of existing state-of-the-art methods that have the same recovery performance. We also conduct numerical experiments on both synthetic and real data sets to demonstrate the efficacy of our proposed method and complement our theoretical development.