Paper Title

Learning Based Distributed Tracking

Authors

Hao Wu, Junhao Gan, Rui Zhang

Abstract

Inspired by the great success of machine learning in the past decade, people have been thinking about the possibility of improving theoretical results by exploring the data distribution. In this paper, we revisit a fundamental problem called Distributed Tracking (DT) under the assumption that the data follows a certain (known or unknown) distribution, and propose a number of data-dependent algorithms with improved theoretical bounds. Informally, in the DT problem, there is a coordinator and k players, where the coordinator holds a threshold N and each player has a counter. At each time stamp, at most one counter can be increased by one. The job of the coordinator is to capture the exact moment when the sum of all these k counters reaches N. The goal is to minimise the communication cost. While our first type of algorithm assumes that the concrete data distribution is known in advance, our second type of algorithm can learn the distribution on the fly. Both types of algorithms achieve a communication cost bounded by O(k log log N) with high probability, improving the state-of-the-art data-independent bound of O(k log(N/k)). We further propose a number of implementation optimisation heuristics to improve both the efficiency and the robustness of the algorithms. Finally, we conduct extensive experiments on three real datasets and four synthetic datasets. The experimental results show that the communication cost of our algorithms is at most 20% of that of the state-of-the-art algorithms.
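
To make the setup concrete, below is a minimal, self-contained simulation of the DT problem together with the classic data-independent, round-based protocol behind the O(k log(N/k)) baseline mentioned in the abstract. It is only an illustrative sketch: the class names (Coordinator, Player), the driver loop, and the slack rule are assumptions made for illustration; this is not the paper's learning-based algorithms.

```python
import random

# Minimal simulation sketch of the DT setup: k players each hold a counter,
# at most one counter is incremented per time stamp, and the coordinator
# must detect the exact moment the total reaches the threshold N.
# The protocol below is the classic data-independent, round-based scheme
# (O(k log(N/k)) messages), NOT the paper's learning-based algorithms;
# all class and variable names here are hypothetical.

class Coordinator:
    def __init__(self, k, N):
        self.k = k
        self.N = N
        self.confirmed = 0     # increments accounted for in finished rounds
        self.messages = 0      # total communication cost, in messages
        self.triggered = False
        self._new_round()

    def _new_round(self):
        remaining = self.N - self.confirmed
        # Each player reports once per `slack` new increments in this round;
        # once remaining < 2k the slack drops to 1 and every increment is reported.
        self.slack = max(1, remaining // (2 * self.k))
        self.reports = 0

    def receive_report(self, players):
        self.messages += 1
        self.reports += 1
        if self.slack == 1:
            # Final phase: each report is exactly one increment, so the sum is exact.
            self.confirmed += 1
            self.triggered = self.confirmed >= self.N
        elif self.reports == self.k:
            # End of round: poll all players for exact counts (k more messages).
            self.messages += self.k
            self.confirmed = sum(p.total for p in players)
            for p in players:
                p.unreported = 0
            # With slack = remaining // (2k), the threshold cannot be crossed
            # mid-round, so confirmed is still below N here; start the next round.
            self._new_round()


class Player:
    def __init__(self):
        self.total = 0        # local counter
        self.unreported = 0   # increments since the last report

    def increment(self, coordinator, players):
        self.total += 1
        self.unreported += 1
        if self.unreported >= coordinator.slack:
            self.unreported = 0
            coordinator.receive_report(players)


if __name__ == "__main__":
    k, N = 10, 100_000
    coordinator = Coordinator(k, N)
    players = [Player() for _ in range(k)]
    steps = 0
    while not coordinator.triggered:
        steps += 1
        random.choice(players).increment(coordinator, players)
    print(f"triggered at step {steps} (threshold N = {N}), "
          f"messages sent = {coordinator.messages}")
```

In this baseline the remaining count roughly halves after each round, which is where the O(k log(N/k)) bound comes from; the paper's contribution is to beat this bound by exploiting the data distribution, which the sketch above does not attempt.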
