Paper Title
Faster Adaptive Momentum-Based Federated Methods for Distributed Composition Optimization
Paper Authors
Paper Abstract
Federated learning is a popular distributed learning paradigm in machine learning. Meanwhile, composition optimization is an effective hierarchical learning model that appears in many machine learning applications, such as meta learning and robust learning. Recently, a few federated composition optimization algorithms have been proposed, but they still suffer from high sample and communication complexities. In this paper, we therefore propose a class of faster federated composition optimization algorithms (i.e., MFCGD and AdaMFCGD) for nonconvex distributed composition problems, which build on momentum-based variance reduction and local-SGD techniques. In particular, our adaptive algorithm (i.e., AdaMFCGD) uses a unified adaptive matrix to flexibly incorporate various adaptive learning rates. Moreover, we provide a solid theoretical analysis of our algorithms under the non-i.i.d. setting and prove that they simultaneously achieve lower sample and communication complexities than existing federated composition algorithms. Specifically, our algorithms achieve a sample complexity of $\tilde{O}(\epsilon^{-3})$ and a communication complexity of $\tilde{O}(\epsilon^{-2})$ in finding an $\epsilon$-stationary solution. We conduct numerical experiments on robust federated learning and distributed meta learning tasks to demonstrate the efficiency of our algorithms.
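For context, a common way to write the distributed composition problem that such algorithms target is the following; the notation ($N$ clients, inner maps $g^{(n)}$, outer losses $f^{(n)}$) is a standard formulation assumed here, not quoted from the abstract:

$$\min_{x \in \mathbb{R}^d} \; F(x) = \frac{1}{N} \sum_{n=1}^{N} f^{(n)}\!\big(g^{(n)}(x)\big), \qquad g^{(n)}(x) = \mathbb{E}_{\zeta^{(n)}}\big[g^{(n)}(x;\zeta^{(n)})\big], \quad f^{(n)}(y) = \mathbb{E}_{\xi^{(n)}}\big[f^{(n)}(y;\xi^{(n)})\big].$$

The hierarchy comes from composing a stochastic outer loss with a stochastic inner map, so an unbiased estimate of $\nabla F$ is not directly available from single samples; this is the difficulty that momentum-based variance-reduced estimators address.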
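To make the mechanism concrete, below is a minimal single-client sketch of one momentum-based variance-reduced (STORM-style) compositional update combined with a diagonal (Adam-style) adaptive matrix, in the spirit of MFCGD/AdaMFCGD. The toy problem, function names, and hyperparameters are illustrative assumptions, not the paper's code, and the periodic federated averaging of local models across clients is omitted.

```python
import numpy as np

# Minimal single-client sketch of a momentum-based variance-reduced
# (STORM-style) compositional gradient step with a diagonal adaptive
# matrix. Toy problem, names, and hyperparameters are illustrative
# assumptions, not the paper's code.

rng = np.random.default_rng(0)
d, p = 5, 3                       # dimension of x and of the inner map g(x)
W = rng.normal(size=(p, d))       # toy inner map: g(x; zeta) = W x + noise

def g_stoch(x, zeta):
    """Stochastic inner map evaluated with a given noise sample zeta."""
    return W @ x + 0.01 * zeta

def g_jac(x):
    """Jacobian of the inner map (constant here)."""
    return W

def f_grad(y):
    """Gradient of the outer loss f(y) = 0.5 * ||y||^2."""
    return y

eta, beta, rho = 0.05, 0.1, 1e-3  # step size, momentum, adaptivity regularizer
x = rng.normal(size=d)
zeta = rng.normal(size=p)
u = g_stoch(x, zeta)              # running estimate of the inner value g(x)
v = g_jac(x).T @ f_grad(u)        # running estimate of the full gradient
a = v ** 2                        # Adam-style second-moment accumulator

for t in range(200):
    # Adaptive step: elementwise A_t^{-1} v_t with A_t = diag(sqrt(a) + rho).
    x_new = x - eta * v / (np.sqrt(a) + rho)

    # STORM-style corrections: the SAME fresh sample is evaluated at both
    # x_new and x, so the correction term cancels most of the sampling noise.
    zeta = rng.normal(size=p)
    u_new = g_stoch(x_new, zeta) + (1.0 - beta) * (u - g_stoch(x, zeta))

    grad_new = g_jac(x_new).T @ f_grad(u_new)
    grad_old = g_jac(x).T @ f_grad(u)
    v = grad_new + (1.0 - beta) * (v - grad_old)

    a = (1.0 - beta) * a + beta * v ** 2       # update adaptive matrix entries
    x, u = x_new, u_new

print("final gradient-estimate norm:", np.linalg.norm(v))
```

The key design choice illustrated here is that both the inner-value estimate `u` and the gradient estimate `v` are corrected with the same fresh sample evaluated at consecutive iterates; this shared-sample correction is the variance-reduction mechanism that underlies improved $\tilde{O}(\epsilon^{-3})$-type sample complexities for estimators of this family.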