Paper Title

Convergence of the mini-batch SIHT algorithm

Paper Authors

Saeed Damadi, Jinglai Shen

Paper Abstract

The Iterative Hard Thresholding (IHT) algorithm has been studied extensively as an effective deterministic algorithm for solving sparse optimization problems. The IHT algorithm benefits from the information of the batch (full) gradient at each point, and this information is crucial for the convergence analysis of the generated sequence. However, this strength becomes a weakness in machine learning and high-dimensional statistical applications, because computing the batch gradient at every iteration is computationally expensive or impractical. Fortunately, in these applications the objective function has a summation structure that can be exploited to approximate the batch gradient with a stochastic mini-batch gradient. In this paper, we study the mini-batch Stochastic IHT (SIHT) algorithm for solving sparse optimization problems. In contrast to previous works, where an increasing and variable mini-batch size is required for the derivation, we fix the mini-batch size for all steps according to a lower bound that we derive. To prove stochastic convergence of the objective function value, we first establish a critical sparse stochastic gradient descent property. Using this stochastic gradient descent property, we show that the sequence generated by the mini-batch SIHT algorithm is a supermartingale and converges with probability one. Unlike previous work, we do not assume the objective function to be restricted strongly convex. To the best of our knowledge, in the regime of sparse optimization, this is the first result in the literature showing that the sequence of stochastic function values converges with probability one while the mini-batch size is fixed for all steps.
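To make the iteration described above concrete, the following is a minimal NumPy sketch of one way a mini-batch SIHT step can be organized: a stochastic gradient computed on a fixed-size mini-batch, followed by hard thresholding onto the s-sparse set. It assumes a least-squares objective f(x) = (1/2n) Σᵢ (aᵢᵀx − yᵢ)² and a fixed step size; the function names, the step size, and the choice of objective are illustrative assumptions, not the paper's exact formulation or its derived lower bound on the mini-batch size.

```python
import numpy as np

def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x and zero out the rest."""
    z = np.zeros_like(x)
    idx = np.argpartition(np.abs(x), -s)[-s:]
    z[idx] = x[idx]
    return z

def mini_batch_siht(A, y, s, batch_size, step_size, num_iters, rng=None):
    """Mini-batch SIHT sketch for f(x) = (1/2n) * sum_i (a_i^T x - y_i)^2,
    using a mini-batch size that stays fixed across all iterations."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(num_iters):
        # Sample a fixed-size mini-batch and form the stochastic gradient.
        batch = rng.choice(n, size=batch_size, replace=False)
        Ab, yb = A[batch], y[batch]
        grad = Ab.T @ (Ab @ x - yb) / batch_size
        # Gradient step followed by hard thresholding onto the s-sparse set.
        x = hard_threshold(x - step_size * grad, s)
    return x
```

In this sketch the per-iteration cost depends only on the fixed mini-batch size rather than on the full sample size n, which is the computational advantage the abstract attributes to the mini-batch setting.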
