Paper Title

FedShuffle: Recipes for Better Use of Local Work in Federated Learning

Authors

Samuel Horváth, Maziar Sanjabi, Lin Xiao, Peter Richtárik, Michael Rabbat

Abstract

The practice of applying several local updates before aggregation across clients has been empirically shown to be a successful approach to overcoming the communication bottleneck in Federated Learning (FL). Such methods are usually implemented by having clients perform one or more epochs of local training per round while randomly reshuffling their finite dataset in each epoch. Data imbalance, where clients have different numbers of local training samples, is ubiquitous in FL applications, resulting in different clients performing different numbers of local updates in each round. In this work, we propose a general recipe, FedShuffle, that better utilizes the local updates in FL, especially in this regime encompassing random reshuffling and heterogeneity. FedShuffle is the first local update method with theoretical convergence guarantees that incorporates random reshuffling, data imbalance, and client sampling - features that are essential in large-scale cross-device FL. We present a comprehensive theoretical analysis of FedShuffle and show, both theoretically and empirically, that it does not suffer from the objective function mismatch that is present in FL methods that assume homogeneous updates in heterogeneous FL setups, such as FedAvg (McMahan et al., 2017). In addition, by combining the ingredients above, FedShuffle improves upon FedNova (Wang et al., 2020), which was previously proposed to solve this mismatch. Similar to Mime (Karimireddy et al., 2020), we show that FedShuffle with momentum variance reduction (Cutkosky & Orabona, 2019) improves upon non-local methods under a Hessian similarity assumption.
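To make the recipe concrete, below is a minimal NumPy sketch of one FedShuffle-style round on a toy least-squares problem. It is an illustration under simplifying assumptions, not the authors' exact algorithm: the `1/n_i` local step-size scaling and the importance-weighted server average stand in for FedShuffle's aggregation rule, the momentum variance reduction variant is omitted, and the names `local_update` and `fedshuffle_round` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_update(w0, X, y, lr, epochs):
    """One client's local work: `epochs` passes of random reshuffling
    over its finite dataset, one SGD step per sample (least-squares loss)."""
    w = w0.copy()
    for _ in range(epochs):
        for i in rng.permutation(len(X)):      # reshuffle every epoch
            grad = (X[i] @ w - y[i]) * X[i]    # per-sample gradient
            w -= lr * grad
    return w

def fedshuffle_round(w, clients, base_lr, epochs, sample_size):
    """One communication round in the spirit of FedShuffle (simplified):
    each sampled client i scales its step size by 1/n_i, so clients with
    more data take more but proportionally smaller local steps, and the
    server averages the results with importance weights."""
    sampled = rng.choice(len(clients), size=sample_size, replace=False)
    total_n = sum(len(clients[i][0]) for i in sampled)
    w_next = np.zeros_like(w)
    for i in sampled:
        X, y = clients[i]
        w_i = local_update(w, X, y, base_lr / len(X), epochs)
        w_next += (len(X) / total_n) * w_i     # weights sum to 1
    return w_next

# Toy run with deliberately imbalanced client dataset sizes.
d = 5
w_true = rng.normal(size=d)
clients = []
for n_i in (20, 50, 200):
    X = rng.normal(size=(n_i, d))
    clients.append((X, X @ w_true + 0.01 * rng.normal(size=n_i)))

w = np.zeros(d)
for _ in range(100):
    w = fedshuffle_round(w, clients, base_lr=0.5, epochs=2, sample_size=2)
print("distance to w_true:", np.linalg.norm(w - w_true))
```

The point the sketch tries to capture is that a client with more data performs more, but proportionally smaller, local steps, so no single client's local optimum dominates the aggregate; this is one way to avoid the objective function mismatch the abstract refers to.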
