论文标题

通过基于回溯的框架分布式子图枚举

Distributed Subgraph Enumeration via Backtracking-based Framework

论文作者

Wang, Zhaokang, Hu, Weiwei, Yuan, Chunfeng, Gu, Rong, Huang, Yihua

论文摘要

在数据图中与给定模式图同构的查找或监视子图实例是许多图分析应用程序中的基本查询操作,例如网络图案挖掘和欺诈检测。最先进的分布式方法的通信效率低下。他们必须在分布式多线连接期间将部分匹配的结果洗牌。部分匹配结果可能比数据图本身大得多。为了克服缺点,我们开发了用于分布式子图枚举的批处理框架(B-benu)。 B-Benu并行执行一组本地搜索任务。每个任务都列举了数据图中顶点周围的子图,并在基于回溯的执行计划的指导下。 B-benu不会将任何部分匹配的结果改组。相反,它将数据图存储在分布式数据库中。按需数据图的每个任务查询邻接集。为了支持动态数据图,我们提出了增量模式图的概念,并将连续的子图表列为每个时间步骤的枚举增量图形图。我们开发了流媒体框架(S-BENU),以有效地列举其匹配。我们使用本地数据库缓存以及任务分割技术实现B-Benu和S-Benu。广泛的实验表明,B-Benu和S-Benu可以扩展到大数据图和复杂的图案图。他们的表现分别以最多一个和两个数量级的范围优于最先进的方法。

Finding or monitoring subgraph instances that are isomorphic to a given pattern graph in a data graph is a fundamental query operation in many graph analytic applications, such as network motif mining and fraud detection. The state-of-the-art distributed methods are inefficient in communication. They have to shuffle partial matching results during the distributed multiway join. The partial matching results may be much larger than the data graph itself. To overcome the drawback, we develop the Batch-BENU framework (B-BENU) for distributed subgraph enumeration. B-BENU executes a group of local search tasks in parallel. Each task enumerates subgraphs around a vertex in the data graph, guided by a backtracking-based execution plan. B-BENU does not shuffle any partial matching result. Instead, it stores the data graph in a distributed database. Each task queries adjacency sets of the data graph on demand. To support dynamic data graphs, we propose the concept of incremental pattern graphs and turn continuous subgraph enumeration into enumerating incremental pattern graphs at each time step. We develop the Streaming-BENU framework (S-BENU) to enumerate their matches efficiently. We implement B-BENU and S-BENU with the local database cache and the task splitting techniques. The extensive experiments show that B-BENU and S-BENU can scale to big data graphs and complex pattern graphs. They outperform the state-of-the-art methods by up to one and two orders of magnitude, respectively.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源