Paper Title
A Single-Loop Smoothed Gradient Descent-Ascent Algorithm for Nonconvex-Concave Min-Max Problems
Paper Authors
Paper Abstract
Nonconvex-concave min-max problems arise in many machine learning applications, including minimizing the pointwise maximum of a set of nonconvex functions and robust adversarial training of neural networks. A popular approach to solving such problems is the gradient descent-ascent (GDA) algorithm, which unfortunately can exhibit oscillation in the nonconvex setting. In this paper, we introduce a "smoothing" scheme that can be combined with GDA to stabilize the oscillation and ensure convergence to a stationary solution. We prove that the stabilized GDA algorithm achieves an $O(1/ε^2)$ iteration complexity for minimizing the pointwise maximum of a finite collection of nonconvex functions. Moreover, the smoothed GDA algorithm achieves an $O(1/ε^4)$ iteration complexity for general nonconvex-concave problems. Extensions of this stabilized GDA algorithm to the multi-block case are also presented. To the best of our knowledge, this is the first algorithm to achieve an $O(1/ε^2)$ iteration complexity for a class of nonconvex-concave problems. We illustrate the practical efficiency of the stabilized GDA algorithm on robust training.
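To make the smoothing idea concrete, below is a minimal Python/NumPy sketch of one smoothed GDA iteration under common assumptions. The gradient oracles `grad_x`/`grad_y`, the projection `project_y`, and the parameter values `c`, `alpha`, `beta`, `p` are illustrative placeholders, not the paper's exact update rules or constants; the essential ingredient is an auxiliary anchor `z` that adds a proximal term to the primal update and then slowly tracks the primal iterate.

```python
import numpy as np

def smoothed_gda(grad_x, grad_y, project_y, x0, y0,
                 c=0.01, alpha=0.01, beta=0.1, p=10.0, iters=1000):
    """Sketch of a smoothed GDA loop (assumed form, not the paper's exact scheme).

    Rather than running GDA on f(x, y) directly, run it on the regularized
    surrogate K(x, z; y) = f(x, y) + (p/2) * ||x - z||^2, where z is a
    slowly moving anchor. grad_x(x, y) and grad_y(x, y) are gradient
    oracles for f; project_y projects onto the (convex) feasible set of y.
    """
    x, y = x0.copy(), y0.copy()
    z = x0.copy()  # smoothing anchor
    for _ in range(iters):
        # Primal descent on the smoothed objective: the p * (x - z) term
        # pulls x toward the anchor and damps oscillation.
        x = x - c * (grad_x(x, y) + p * (x - z))
        # Dual ascent on f, followed by projection onto the feasible set.
        y = project_y(y + alpha * grad_y(x, y))
        # Move the anchor a small step toward the current primal iterate.
        z = z + beta * (x - z)
    return x, y
```

Note that with `beta = 1` the anchor coincides with the current iterate, so the proximal term vanishes at the next step and the scheme degenerates to plain GDA; a smaller `beta` keeps the anchor slowly varying, which is what provides the stabilization described in the abstract.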