Paper Title

Mining Persistent Activity in Continually Evolving Networks

Authors

Caleb Belth, Xinyi Zheng, Danai Koutra

Abstract

Frequent pattern mining is a key area of study that gives insights into the structure and dynamics of evolving networks, such as social or road networks. However, not only does a network evolve, but often the way that it evolves, itself evolves. Thus, knowing, in addition to patterns' frequencies, for how long and how regularly they have occurred---i.e., their persistence---can add to our understanding of evolving networks. In this work, we propose the problem of mining activity that persists through time in continually evolving networks---i.e., activity that repeatedly and consistently occurs. We extend the notion of temporal motifs to capture activity among specific nodes, in what we call activity snippets, which are small sequences of edge-updates that reoccur. We propose axioms and properties that a measure of persistence should satisfy, and develop such a persistence measure. We also propose PENminer, an efficient framework for mining activity snippets' Persistence in Evolving Networks, and design both offline and streaming algorithms. We apply PENminer to numerous real, large-scale evolving networks and edge streams, and find activity that is surprisingly regular over a long period of time, but too infrequent to be discovered by aggregate count alone, and bursts of activity exposed by their lack of persistence. Our findings with PENminer include neighborhoods in NYC where taxi traffic persisted through Hurricane Sandy, the opening of new bike-stations, characteristics of social network users, and more. Moreover, we use PENminer towards identifying anomalies in multiple networks, outperforming baselines at identifying subtle anomalies by 9.8-48% in AUC.
