论文标题
展览:图案感知图挖掘系统
Peregrine: A Pattern-Aware Graph Mining System
论文作者
论文摘要
图挖掘工作负载旨在通过探索其子图结构来提取图的结构特性。通用图挖掘系统提供了一个通用的运行时,借助指导整体探索过程的用户定义功能来探索感兴趣的子图结构。但是,最先进的图形挖掘系统在很大程度上却忽略了它们所挖掘的子图的形状(或模式)。这导致他们:(a)探索不必要的子图; (b)在探索子图上执行昂贵的计算; (c)将中间部分子图保持在内存中;所有这些都会影响他们的整体表现。此外,他们的编程模型通常与他们的基本探索策略有关,这使得用户很难表达复杂的采矿任务。 在本文中,我们开发了Peregrine,这是一种模式感知的图形挖掘系统,可直接探索感兴趣的子图,同时避免探索不必要的子图,并在整个采矿过程中同时绕过昂贵的计算。我们设计了一个基于模式的编程模型,该模型将“图形模式”视为一流的构造,并使Peregrine可以提取模式的语义,以指导其探索。我们的评估表明,游eg的表现优于最先进的分布式分布式和单个机器图挖掘系统,并在较大的图表上缩放到复杂的挖掘任务,同时使用其“模式优先”编程方法保留简单性和表现力。
Graph mining workloads aim to extract structural properties of a graph by exploring its subgraph structures. General purpose graph mining systems provide a generic runtime to explore subgraph structures of interest with the help of user-defined functions that guide the overall exploration process. However, the state-of-the-art graph mining systems remain largely oblivious to the shape (or pattern) of the subgraphs that they mine. This causes them to: (a) explore unnecessary subgraphs; (b) perform expensive computations on the explored subgraphs; and, (c) hold intermediate partial subgraphs in memory; all of which affect their overall performance. Furthermore, their programming models are often tied to their underlying exploration strategies, which makes it difficult for domain users to express complex mining tasks. In this paper, we develop Peregrine, a pattern-aware graph mining system that directly explores the subgraphs of interest while avoiding exploration of unnecessary subgraphs, and simultaneously bypassing expensive computations throughout the mining process. We design a pattern-based programming model that treats "graph patterns" as first class constructs and enables Peregrine to extract the semantics of patterns, which it uses to guide its exploration. Our evaluation shows that Peregrine outperforms state-of-the-art distributed and single machine graph mining systems, and scales to complex mining tasks on larger graphs, while retaining simplicity and expressivity with its "pattern-first" programming approach.