Paper Title

Bidirectional compression in heterogeneous settings for distributed or federated learning with partial participation: tight convergence guarantees

Paper Authors

Constantin Philippenko, Aymeric Dieuleveut

Paper Abstract

We introduce a framework, Artemis, to tackle the problem of learning in a distributed or federated setting with communication constraints and partial device participation. Several workers (randomly sampled) perform the optimization process using a central server to aggregate their computations. To alleviate the communication cost, Artemis compresses the information sent in both directions (from the workers to the server and conversely), combined with a memory mechanism. It improves on existing algorithms that consider only unidirectional compression (to the server), rely on very strong assumptions about the compression operator, and often do not account for partial device participation. We provide fast rates of convergence (linear up to a threshold) under weak assumptions on the stochastic gradients (the noise variance is bounded only at the optimal point) in the non-i.i.d. setting, highlight the impact of memory for both unidirectional and bidirectional compression, and analyze Polyak-Ruppert averaging. We use convergence in distribution to obtain a lower bound on the asymptotic variance that highlights the practical limits of compression. We propose two approaches to tackle the challenging case of partial device participation and provide experimental results to demonstrate the validity of our analysis.
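
To make the memory mechanism concrete, the sketch below implements one communication round of bidirectional compressed aggregation in Python. It is a minimal illustration under stated assumptions, not the authors' reference implementation: the rand-k sparsifier, the function names (rand_k, bidir_compressed_round), and the step sizes lr and alpha are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_k(v, k):
    """Unbiased rand-k sparsification: keep k random coordinates of v,
    rescaled by d/k so that E[rand_k(v)] = v."""
    d = v.size
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(v)
    out[idx] = v[idx] * (d / k)
    return out

def bidir_compressed_round(w, worker_grads, memories, lr=0.1, alpha=0.4, k=2):
    """One round with uplink/downlink compression and memory (illustrative).

    Each worker i sends the compressed difference between its local gradient
    and its memory term h_i; the server reconstructs h_i + delta_i,
    aggregates, and broadcasts a compressed global update. For rand-k,
    alpha <= k/d is a standard stable choice for the memory step size.
    """
    estimates = []
    for i, g in enumerate(worker_grads):
        delta = rand_k(g - memories[i], k)        # uplink: compress g_i - h_i
        estimates.append(memories[i] + delta)     # server-side gradient estimate
        memories[i] = memories[i] + alpha * delta  # memory, kept in sync on both sides
    g_hat = np.mean(estimates, axis=0)            # server aggregation
    return w - lr * rand_k(g_hat, k), memories    # downlink: compressed update

# Toy usage: two workers with heterogeneous (non-i.i.d.) quadratic objectives
# f_i(w) = 0.5 * ||w - t_i||^2, whose local gradients stay nonzero at the
# global optimum -- exactly the regime where the memory mechanism matters.
d = 5
w = 4.0 * np.ones(d)
memories = [np.zeros(d) for _ in range(2)]
targets = [np.zeros(d), 2.0 * np.ones(d)]
for _ in range(300):
    grads = [w - t for t in targets]
    w, memories = bidir_compressed_round(w, grads, memories)
print(w)  # close to the global optimum (1, 1, 1, 1, 1)
```

On this toy problem the local gradients are nonzero at the optimum, so without the memory term the compression noise would not vanish and the iterates would only reach a neighborhood of the solution; with it, the compressed differences g_i - h_i shrink to zero, matching the linear-up-to-a-threshold behavior described in the abstract.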
