Paper Title

Stochastic Optimization with Laggard Data Pipelines

Paper Authors

Naman Agarwal, Rohan Anil, Tomer Koren, Kunal Talwar, Cyril Zhang

Paper Abstract

State-of-the-art optimization is steadily shifting towards massively parallel pipelines with extremely large batch sizes. As a consequence, CPU-bound preprocessing and disk/memory/network operations have emerged as new performance bottlenecks, as opposed to hardware-accelerated gradient computations. In this regime, a recently proposed approach is data echoing (Choi et al., 2019), which takes repeated gradient steps on the same batch while waiting for fresh data to arrive from upstream. We provide the first convergence analyses of "data-echoed" extensions of common optimization methods, showing that they exhibit provable improvements over their synchronous counterparts. Specifically, we show that in convex optimization with stochastic minibatches, data echoing affords speedups on the curvature-dominated part of the convergence rate, while maintaining the optimal statistical rate.
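The core mechanism is easy to sketch in code. Below is a minimal, hypothetical illustration of data-echoed minibatch SGD on a toy least-squares problem, not the paper's exact algorithm or analysis setting: the names `data_echoed_sgd`, `sample_batch`, `grad_fn`, and the parameter `echo_factor` are illustrative assumptions. Each batch fetched from the (slow) upstream pipeline is reused for several gradient steps before the next fetch, which is what hides the pipeline latency.

```python
import numpy as np

def data_echoed_sgd(grad_fn, sample_batch, w0, lr=0.05, echo_factor=4, num_batches=100):
    """Sketch of data-echoed minibatch SGD (hypothetical, not the paper's code).

    A synchronous pipeline takes one gradient step per fetched batch;
    data echoing reuses each batch for `echo_factor` steps while
    waiting for fresh data from the upstream pipeline.
    """
    w = w0.copy()
    for _ in range(num_batches):
        batch = sample_batch()           # slow: CPU/disk/network-bound fetch
        for _ in range(echo_factor):     # fast: repeated steps on the same batch
            w -= lr * grad_fn(w, batch)
    return w

# Toy example: minimize E[(x^T w - y)^2] over stochastic minibatches.
rng = np.random.default_rng(0)
w_star = rng.normal(size=5)

def sample_batch(batch_size=32):
    X = rng.normal(size=(batch_size, 5))
    y = X @ w_star + 0.1 * rng.normal(size=batch_size)
    return X, y

def grad_fn(w, batch):
    X, y = batch
    # Gradient of the mean squared error on the batch.
    return 2.0 * X.T @ (X @ w - y) / len(y)

w_hat = data_echoed_sgd(grad_fn, sample_batch, w0=np.zeros(5))
print(np.linalg.norm(w_hat - w_star))  # small residual: echoed steps still converge
```

In this sketch, increasing `echo_factor` buys extra optimization progress per (expensive) batch fetch, which mirrors the paper's claim that echoing accelerates the curvature-dominated part of the convergence rate while the statistical rate is governed by the amount of fresh data.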
