Paper Title
Restarted Nonconvex Accelerated Gradient Descent: No More Polylogarithmic Factor in the $O(ε^{-7/4})$ Complexity
Paper Authors
Paper Abstract
This paper studies accelerated gradient methods for nonconvex optimization with Lipschitz continuous gradient and Hessian. We propose two simple accelerated gradient methods, restarted accelerated gradient descent (AGD) and the restarted heavy ball (HB) method, and establish via elementary proofs that our methods find an $ε$-approximate first-order stationary point within $O(ε^{-7/4})$ gradient evaluations. Theoretically, our complexity does not hide any polylogarithmic factors, and thus it improves over the best known complexity by a factor of $O(\log\frac{1}{ε})$. Our algorithms are simple in the sense that they consist only of Nesterov's classical AGD or Polyak's HB iterations, together with a restart mechanism. They do not invoke negative curvature exploitation or minimization of regularized surrogate functions as subroutines. In contrast with existing analyses, our elementary proofs use less advanced techniques and do not rely on the analysis of strongly convex AGD or HB. Code is available at https://github.com/lihuanML/RestartAGD.
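To illustrate the structure described above (classical AGD iterations plus a restart mechanism), here is a minimal sketch of a restarted AGD loop. It is not the paper's exact algorithm: the restart trigger shown (accumulated iterate movement exceeding a radius `B`) and the names `grad_f`, `eta`, `theta`, `B`, and `eps` are illustrative assumptions; the authors' actual trigger and parameter choices are in the linked repository.

```python
import numpy as np

def restarted_agd(grad_f, x0, eta=1e-3, theta=0.1, B=1.0, eps=1e-4, max_iters=100000):
    """Sketch of a restarted accelerated gradient descent loop.

    grad_f : callable returning the gradient of the objective at a point.
    eta    : step size (illustrative value).
    theta  : momentum parameter (illustrative value).
    B      : hypothetical restart radius; momentum is reset once the
             accumulated squared movement within an epoch exceeds B**2.
    eps    : target stationarity tolerance.
    """
    x_prev = x0.copy()
    x = x0.copy()
    moved = 0.0
    for _ in range(max_iters):
        g = grad_f(x)
        if np.linalg.norm(g) <= eps:            # epsilon-approximate first-order stationary point
            return x
        y = x + (1.0 - theta) * (x - x_prev)    # Nesterov-style extrapolation step
        x_prev, x = x, y - eta * grad_f(y)      # gradient step at the extrapolated point
        moved += np.linalg.norm(x - x_prev) ** 2
        if moved > B ** 2:                      # restart: kill momentum, reset the movement counter
            x_prev = x.copy()
            moved = 0.0
    return x
```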