Paper Title
Communication-Efficient Distributionally Robust Decentralized Learning
Paper Authors
Paper Abstract
Decentralized learning algorithms empower interconnected devices to share data and computational resources to collaboratively train a machine learning model without the aid of a central coordinator. In the case of heterogeneous data distributions at the network nodes, collaboration can yield predictors with unsatisfactory performance for a subset of the devices. For this reason, in this work, we consider the formulation of a distributionally robust decentralized learning task and we propose a decentralized single-loop gradient descent/ascent algorithm (AD-GDA) to directly solve the underlying minimax optimization problem. We render our algorithm communication-efficient by employing a compressed consensus scheme and we provide convergence guarantees for smooth convex and non-convex loss functions. Finally, we corroborate the theoretical findings with empirical results that highlight AD-GDA's ability to provide unbiased predictors and to greatly improve communication efficiency compared to existing distributionally robust algorithms.
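
For reference, the minimax problem mentioned in the abstract typically takes the following form in distributionally robust learning. This is only a minimal sketch assuming local risks f_k at the K network nodes and mixture weights λ constrained to the probability simplex; the exact objective used by AD-GDA (e.g., any regularization on λ) may differ from this generic formulation:

\[
\min_{\theta \in \mathbb{R}^d} \; \max_{\lambda \in \Delta_K} \; \sum_{k=1}^{K} \lambda_k \, f_k(\theta),
\qquad
f_k(\theta) = \mathbb{E}_{\xi \sim \mathcal{D}_k}\!\big[\ell(\theta; \xi)\big],
\]

where \(\mathcal{D}_k\) is the local data distribution at node \(k\) and \(\Delta_K = \{\lambda \in \mathbb{R}_{\ge 0}^{K} : \sum_{k} \lambda_k = 1\}\) is the probability simplex. A gradient descent/ascent scheme of the kind described in the abstract interleaves a descent step on the model parameters \(\theta\) with an ascent step on the mixture weights \(\lambda\), so that the weights concentrate on the nodes currently suffering the largest loss.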