论文标题

Risgraph:一种用于不断发展的图表的实时流媒体系统,以支持数百万OPS/S的每次上升分析的子毫秒分析

RisGraph: A Real-Time Streaming System for Evolving Graphs to Support Sub-millisecond Per-update Analysis at Millions Ops/s

论文作者

Feng, Guanyu, Ma, Zixuan, Li, Daixuan, Chen, Shengqi, Zhu, Xiaowei, Han, Wentao, Chen, Wenguang

论文摘要

现实世界中不断发展的图表是大规模的,并且不断变化,因为每秒可能会有数十万个更新。单调算法(如可达性和最短路径)在实时分析中广泛使用,以获得静态和时间见解,并且可以通过增量计算加速。现有的流媒体系统采用增量计算模型,并实现低潜伏期或高吞吐量,但不能同时实现。但是,在金融欺诈检测等实际情况下,需要高吞吐量和低潜伏期。本文介绍了Risgraph,这是一个实时流媒体系统,可为每个更新提供高吞吐量的低延迟分析。 Risgraph通过局部数据访问和更高的并行性解决了挑战。我们提出了一个名为索引的邻接列表的数据结构,并使用稀疏阵列和混合并行模式来启用局部数据访问。为了实现跨越的并行性,我们根据安全和不安全的更新的分类提出了一种特定于域的并发控制机制。实验表明,Risgraph可以为具有数亿个顶点和数十亿个边缘的图表摄入数百万个更新,并且P999处理时间延迟在20毫秒内。当对每次更新进行分析时,Risgraph实现了吞吐量的量顺序改善吞吐量,而无需批次的分析,并且比现有系统更好的批量进行了更新的批次。

Evolving graphs in the real world are large-scale and constantly changing, as hundreds of thousands of updates may come every second. Monotonic algorithms such as Reachability and Shortest Path are widely used in real-time analytics to gain both static and temporal insights and can be accelerated by incremental computing. Existing streaming systems adopt the incremental computing model and achieve either low latency or high throughput, but not both. However, both high throughput and low latency are required in real scenarios such as financial fraud detection. This paper presents RisGraph, a real-time streaming system that provides low-latency analysis for each update with high throughput. RisGraph addresses the challenge with localized data access and inter-update parallelism. We propose a data structure named Indexed Adjacency Lists and use sparse arrays and Hybrid Parallel Mode to enable localized data access. To achieve inter-update parallelism, we propose a domain-specific concurrency control mechanism based on the classification of safe and unsafe updates. Experiments show that RisGraph can ingest millions of updates per second for graphs with several hundred million vertices and billions of edges, and the P999 processing time latency is within 20 milliseconds. RisGraph achieves orders-of-magnitude improvement on throughput when analyses are executed for each update without batching and performs better than existing systems with batches of up to 20 million updates.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源