Paper Title

Decentralized Learning with Multi-Headed Distillation

Paper Authors

Andrey Zhmoginov, Mark Sandler, Nolan Miller, Gus Kristiansen, Max Vladymyrov

Paper Abstract

Decentralized learning with private data is a central problem in machine learning. We propose a novel distillation-based decentralized learning technique that allows multiple agents with private non-iid data to learn from each other, without having to share their data, weights or weight updates. Our approach is communication efficient, utilizes an unlabeled public dataset and uses multiple auxiliary heads for each client, greatly improving training efficiency in the case of heterogeneous data. This approach allows individual models to preserve and enhance performance on their private tasks while also dramatically improving their performance on the global aggregated data distribution. We study the effects of data and model architecture heterogeneity and the impact of the underlying communication graph topology on learning efficiency and show that our agents can significantly improve their performance compared to learning in isolation.
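
The abstract describes the mechanism only at a high level. Below is a minimal PyTorch-style sketch of what one such client could look like, assuming one auxiliary head per peer and a temperature-scaled KL distillation loss on a shared unlabeled public batch; names such as `MultiHeadClient`, `local_step`, and the specific loss weighting are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadClient(nn.Module):
    """One agent: a shared backbone, a primary head for its private task,
    and auxiliary heads used to distill knowledge from peer agents.
    (Illustrative sketch; the paper's head layout may differ.)"""
    def __init__(self, backbone: nn.Module, feat_dim: int, num_classes: int, num_aux_heads: int):
        super().__init__()
        self.backbone = backbone
        self.primary_head = nn.Linear(feat_dim, num_classes)
        self.aux_heads = nn.ModuleList(
            [nn.Linear(feat_dim, num_classes) for _ in range(num_aux_heads)]
        )

    def forward(self, x):
        z = self.backbone(x)
        return self.primary_head(z), [h(z) for h in self.aux_heads]

def local_step(client, opt, private_x, private_y, public_x, peer_logits,
               temp=2.0, distill_weight=1.0):
    """One hypothetical training step: supervised loss on the agent's private
    (non-iid) data plus distillation of each auxiliary head toward one peer's
    predictions on the shared unlabeled public batch. Only predictions on
    public data are exchanged, never raw data, weights, or weight updates."""
    opt.zero_grad()
    logits, _ = client(private_x)
    loss = F.cross_entropy(logits, private_y)          # private-task loss
    _, aux_logits = client(public_x)                    # heads on public data
    for head_out, target in zip(aux_logits, peer_logits):
        loss = loss + distill_weight * temp ** 2 * F.kl_div(
            F.log_softmax(head_out / temp, dim=-1),
            F.softmax(target.detach() / temp, dim=-1),
            reduction="batchmean",
        )
    loss.backward()
    opt.step()
    return loss.item()
```

In this sketch, `peer_logits` would be the soft predictions received from neighboring agents in the communication graph for the current public batch, so communication cost scales with the size of that batch rather than with model size.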
