Paper Title

FedBalancer: Data and Pace Control for Efficient Federated Learning on Heterogeneous Clients

Authors

Jaemin Shin, Yuanchun Li, Yunxin Liu, Sung-Ju Lee

Abstract

Federated Learning (FL) trains a machine learning model on distributed clients without exposing individual data. Unlike centralized training, which is usually based on carefully organized data, FL deals with on-device data that are often unfiltered and imbalanced. As a result, the conventional FL training protocol, which treats all data equally, wastes local computational resources and slows down the global learning process. To this end, we propose FedBalancer, a systematic FL framework that actively selects clients' training samples. Our sample selection strategy prioritizes more "informative" data while respecting the privacy and computational capabilities of clients. To better utilize sample selection to speed up global training, we further introduce an adaptive deadline control scheme that predicts the optimal deadline for each round as client training data vary. Compared with existing FL algorithms combined with deadline configuration methods, our evaluation on five datasets from three different domains shows that FedBalancer improves time-to-accuracy performance by 1.20~4.48x while improving model accuracy by 1.1~5.0%. We also show that FedBalancer is readily applicable to other FL approaches by demonstrating that it improves the convergence speed and accuracy when operating jointly with three different FL algorithms.
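The sample selection idea above can be made concrete. A common proxy for a sample's informativeness is its current training loss, which is the signal FedBalancer builds on. The sketch below is a minimal, hypothetical illustration under that assumption; `select_informative_samples`, `select_ratio`, and `noise_scale` are illustrative names and parameters rather than the paper's actual interface, and the optional noise only gestures at the privacy constraint the abstract mentions.

```python
import numpy as np

def select_informative_samples(losses, select_ratio=0.5, noise_scale=0.0):
    """Return indices of the highest-loss (most "informative") samples.

    Hypothetical sketch: select_ratio and noise_scale are illustrative
    knobs, not parameters taken from the FedBalancer paper.
    """
    losses = np.asarray(losses, dtype=float)
    if noise_scale > 0.0:
        # Optional Gaussian noise so exact per-sample losses are not
        # exposed verbatim (a nod to the privacy constraint).
        losses = losses + np.random.normal(0.0, noise_scale, size=losses.shape)
    k = max(1, int(len(losses) * select_ratio))
    # np.argsort is ascending, so the last k indices have the largest losses.
    return np.argsort(losses)[-k:]

# Hypothetical usage with per-sample losses computed on-device:
losses = [0.12, 2.31, 0.05, 1.78, 0.44]
print(select_informative_samples(losses, select_ratio=0.4))  # -> [3 1]
```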
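The adaptive deadline control can be sketched in the same spirit. Assuming the server predicts each client's completion time for its selected samples, one simple rule is to pick the candidate deadline that maximizes trained samples per second of round time. This greedy utility is an assumption for illustration, not the paper's exact optimization, and all names below are hypothetical.

```python
def choose_round_deadline(pred_times, sample_counts, candidate_deadlines):
    """Pick the deadline maximizing trained samples per second of round time.

    Hypothetical sketch: the paper's scheme models client completion-time
    behavior in more detail; this greedy utility is a simplified stand-in.
    """
    best_deadline, best_utility = None, float("-inf")
    for d in candidate_deadlines:
        # Samples contributed by clients predicted to finish within deadline d.
        done = sum(n for t, n in zip(pred_times, sample_counts) if t <= d)
        utility = done / d
        if utility > best_utility:
            best_deadline, best_utility = d, utility
    return best_deadline

# Hypothetical usage: three clients, times in seconds.
print(choose_round_deadline(
    pred_times=[12.0, 30.0, 120.0],
    sample_counts=[200, 500, 800],
    candidate_deadlines=[15.0, 35.0, 125.0],
))  # -> 35.0: 700/35 = 20 samples/s beats 200/15 ~ 13.3 and 1500/125 = 12
```

A shorter deadline drops slow clients but wastes the round on few samples; a longer one stalls the round waiting for stragglers, which is why a per-round utility trade-off of this kind is natural here.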
