Paper Title
Communication-Efficient Federated Learning via Optimal Client Sampling
Paper Authors
Paper Abstract
Federated learning (FL) ameliorates privacy concerns in settings where a central server coordinates learning from data distributed across many clients. The clients train locally and communicate the models they learn to the server; aggregation of local models requires frequent communication of large amounts of information between the clients and the central server. We propose a novel, simple, and efficient way of updating the central model in communication-constrained settings, based on collecting models from clients with informative updates and estimating local updates that were not communicated. In particular, modeling the progression of the model's weights by an Ornstein-Uhlenbeck process allows us to derive an optimal sampling strategy for selecting a subset of clients with significant weight updates. The central server collects updated local models from only the selected clients and combines them with estimated model updates of the clients that were not selected for communication. We test this policy on a synthetic dataset for logistic regression and two FL benchmarks, namely, a classification task on EMNIST and a realistic language modeling task using the Shakespeare dataset. The results demonstrate that the proposed framework provides a significant reduction in communication while maintaining competitive or achieving superior performance compared to a baseline. Our method represents a new line of strategies for communication-efficient FL that is orthogonal to existing user-local methods such as quantization or sparsification, and thus complements rather than aims to replace those existing methods.
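The core idea in the abstract can be illustrated with a toy simulation. The sketch below is a hypothetical illustration only, not the paper's actual algorithm: each client's weight vector is simulated as an Ornstein-Uhlenbeck (OU) process, the server collects updates only from the `budget` clients with the largest update norms (a stand-in for "significant weight updates"), and it fills in the remaining clients with the OU drift prediction. All names, parameter values (`theta`, `mu`, `sigma`, `budget`), and the norm-based selection rule are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def ou_step(w, theta, mu, sigma, dt, rng):
    """One Euler-Maruyama step of dW = theta*(mu - W)*dt + sigma*dB."""
    return w + theta * (mu - w) * dt + sigma * np.sqrt(dt) * rng.normal(size=w.shape)

# Hypothetical setup: 10 clients, 4-dimensional model, OU parameters chosen arbitrarily.
n_clients, dim = 10, 4
theta, mu, sigma, dt = 0.5, 0.0, 0.1, 1.0
budget = 3  # communication budget: clients allowed to upload per round

# All clients start from the same global model.
local = np.tile(rng.normal(size=dim), (n_clients, 1))
global_model = local[0].copy()

for rnd in range(5):
    # Simulate each client's local training as one OU step.
    new_local = np.array([ou_step(w, theta, mu, sigma, dt, rng) for w in local])
    updates = new_local - local

    # Select the clients with the largest update norms (proxy for informative updates).
    norms = np.linalg.norm(updates, axis=1)
    selected = np.argsort(norms)[-budget:]

    # For non-communicating clients, estimate the update via the OU drift term.
    combined = theta * (mu - local) * dt
    combined[selected] = updates[selected]

    # Aggregate true and estimated updates into the global model.
    global_model = global_model + combined.mean(axis=0)
    local = new_local
```

The selection step is where an actual implementation of the paper's policy would plug in the sampling strategy derived from the OU model, rather than the simple top-norm heuristic used here.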