Paper Title
Dual-Free Stochastic Decentralized Optimization with Variance Reduction
Paper Authors
Paper Abstract
We consider the problem of training machine learning models on distributed data in a decentralized way. For finite-sum problems, fast single-machine algorithms for large datasets rely on stochastic updates combined with variance reduction. Yet, existing decentralized stochastic algorithms either do not obtain the full speedup allowed by stochastic updates, or require oracles that are more expensive than regular gradients. In this work, we introduce a Decentralized stochastic algorithm with Variance Reduction called DVR. DVR only requires computing stochastic gradients of the local functions, and is computationally as fast as a standard stochastic variance-reduced algorithm run on a $1/n$ fraction of the dataset, where $n$ is the number of nodes. To derive DVR, we apply Bregman coordinate descent to a well-chosen dual problem, and obtain a dual-free algorithm using a specific Bregman divergence. We give an accelerated version of DVR based on the Catalyst framework, and illustrate its effectiveness with simulations on real data.
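To make the "stochastic updates combined with variance reduction" ingredient concrete, below is a minimal single-machine sketch of a generic SVRG-style epoch. It is only an illustration of the variance-reduction idea the abstract refers to, not the DVR algorithm itself; the function names (svrg_epoch, grad_i) and the quadratic example are hypothetical and chosen for brevity.

```python
import numpy as np

def svrg_epoch(grad_i, x, data_size, step_size, inner_steps, rng):
    """One SVRG-style epoch: a generic variance-reduced stochastic update.

    grad_i(x, i) returns the gradient of the i-th component function at x.
    This is a minimal sketch of variance reduction, not the paper's DVR method.
    """
    # Full gradient at a reference point, computed once per epoch.
    x_ref = x.copy()
    full_grad = np.mean([grad_i(x_ref, i) for i in range(data_size)], axis=0)

    for _ in range(inner_steps):
        i = rng.integers(data_size)
        # Variance-reduced estimator: unbiased, and its variance shrinks
        # as x approaches x_ref, unlike a plain stochastic gradient.
        g = grad_i(x, i) - grad_i(x_ref, i) + full_grad
        x = x - step_size * g
    return x

# Toy usage: least squares with component gradients a_i (a_i^T x - b_i).
rng = np.random.default_rng(0)
A, b = rng.normal(size=(100, 5)), rng.normal(size=100)
grad_i = lambda x, i: A[i] * (A[i] @ x - b[i])
x = np.zeros(5)
for _ in range(20):
    x = svrg_epoch(grad_i, x, data_size=100, step_size=0.05,
                   inner_steps=200, rng=rng)
```

In the decentralized setting studied in the paper, each of the $n$ nodes holds its own local finite sum, so a comparable variance-reduced method would only need stochastic gradients of the local functions, which is the property DVR is stated to achieve.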