Paper Title
Distributed Work Stealing in a Task-Based Dataflow Runtime
Paper Authors
Paper Abstract
The task-based dataflow programming model has emerged as an alternative to the process-centric programming model for extreme-scale applications. However, load balancing is still a challenge in task-based dataflow runtimes. In this paper, we present extensions to the PaRSEC runtime to demonstrate that distributed work stealing is an effective load-balancing method for task-based dataflow runtimes. In contrast to shared-memory work stealing, we find that each process should consider future tasks and the expected waiting time for execution when determining whether to steal. We demonstrate the effectiveness of the proposed work-stealing policies for a sparse Cholesky factorization, which shows a speedup of up to 35% compared to a static division of work.
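The steal decision described in the abstract can be illustrated with a small sketch. It is not the PaRSEC API and not necessarily the exact policy from the paper: the `ProcessState` fields, the function names, and the thresholds are all hypothetical, chosen only to show how future tasks and expected waiting time might enter the decision.

```python
from dataclasses import dataclass


@dataclass
class ProcessState:
    """Hypothetical per-process snapshot; all fields are illustrative assumptions."""
    ready_tasks: int       # tasks that can run right now
    future_tasks: int      # known successor tasks expected to become ready soon
    workers: int           # worker threads in this process
    avg_task_time: float   # average task execution time, in seconds


def expected_wait(state: ProcessState) -> float:
    """Rough estimate of how long a newly queued task waits for a free worker."""
    return state.ready_tasks * state.avg_task_time / state.workers


def should_steal(thief: ProcessState) -> bool:
    """Thief side: steal only if ready plus soon-to-be-ready tasks cannot occupy all workers."""
    return thief.ready_tasks + thief.future_tasks < thief.workers


def worth_migrating(victim: ProcessState, migration_time: float) -> bool:
    """Victim side: give a task away only if it would wait locally longer than moving it takes."""
    return expected_wait(victim) > migration_time


if __name__ == "__main__":
    # An idle process whose successors are about to become ready should not steal.
    print(should_steal(ProcessState(0, 8, 4, 0.01)))
    # A heavily loaded process benefits from giving work away if migration is cheap.
    print(worth_migrating(ProcessState(40, 0, 4, 0.01), migration_time=0.05))
```

Under these assumptions, a process with an empty ready queue still declines to steal when enough successor tasks are about to become ready, which is the contrast with shared-memory work stealing that the abstract points to.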