Paper Title

Communication-Efficient Federated Learning With Data and Client Heterogeneity

Paper Authors

Hossein Zakerinia, Shayan Talaei, Giorgi Nadiradze, Dan Alistarh

Paper Abstract

Federated Learning (FL) enables large-scale distributed training of machine learning models, while still allowing individual nodes to maintain data locally. However, executing FL at scale comes with inherent practical challenges: 1) heterogeneity of the local node data distributions, 2) heterogeneity of node computational speeds (asynchrony), but also 3) constraints in the amount of communication between the clients and the server. In this work, we present the first variant of the classic federated averaging (FedAvg) algorithm which, at the same time, supports data heterogeneity, partial client asynchrony, and communication compression. Our algorithm comes with a novel, rigorous analysis showing that, in spite of these system relaxations, it can provide similar convergence to FedAvg in interesting parameter regimes. Experimental results in the rigorous LEAF benchmark on setups of up to 300 nodes show that our algorithm ensures fast convergence for standard federated tasks, improving upon prior quantized and asynchronous approaches.
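
For intuition, below is a minimal, runnable sketch of the classic FedAvg loop that the abstract builds on, with a generic unbiased quantizer applied to the client-to-server updates. This is not the paper's algorithm (which additionally handles partial client asynchrony and comes with its own analysis); the quantizer, the toy least-squares objective, and all names here are illustrative assumptions.

```python
# Minimal FedAvg-with-compression sketch, for intuition only.
# NOT the paper's algorithm: the compressor (unbiased stochastic
# quantization), the toy objective, and all names are assumptions.
import numpy as np

def stochastic_quantize(v, levels=4):
    """Unbiased stochastic quantization: round each coordinate of v to one
    of `levels` uniform levels (scaled by max |v|), rounding up or down at
    random so that the expectation equals v."""
    scale = np.max(np.abs(v))
    if scale == 0.0:
        return v
    normalized = np.abs(v) / scale * levels      # in [0, levels]
    lower = np.floor(normalized)
    rounded = lower + (np.random.rand(*v.shape) < normalized - lower)
    return np.sign(v) * rounded * scale / levels

def local_sgd(model, data, lr=0.05, steps=5):
    """A few local gradient steps on a client's least-squares objective;
    returns the local update (delta), not the new model."""
    X, y = data
    w = model.copy()
    for _ in range(steps):
        w -= lr * X.T @ (X @ w - y) / len(y)     # least-squares gradient
    return w - model

def fedavg_round(model, client_datasets):
    """One synchronous FedAvg round: every client trains locally, sends a
    quantized update, and the server averages the compressed updates."""
    updates = [stochastic_quantize(local_sgd(model, d)) for d in client_datasets]
    return model + np.mean(updates, axis=0)

# Toy usage: 4 clients with heterogeneous (shifted) regression data.
rng = np.random.default_rng(0)
w_true = rng.normal(size=3)
clients = []
for i in range(4):
    X = rng.normal(size=(32, 3)) + 0.5 * i       # per-client data shift
    clients.append((X, X @ w_true + 0.01 * rng.normal(size=32)))

model = np.zeros(3)
for _ in range(30):
    model = fedavg_round(model, clients)
print("distance to w_true:", np.linalg.norm(model - w_true))
```

The unbiased compressor is the key design point in this sketch: because each quantized update equals the true update in expectation, server-side averaging over clients cancels much of the quantization noise, which is why such schemes can retain FedAvg-like convergence while cutting communication.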
