Paper Title
A Single-Loop Smoothed Gradient Descent-Ascent Algorithm for Nonconvex-Concave Min-Max Problems
Paper Authors
Paper Abstract
Nonconvex-concave min-max problems arise in many machine learning applications, including minimizing the pointwise maximum of a set of nonconvex functions and robust adversarial training of neural networks. A popular approach to solving such problems is the gradient descent-ascent (GDA) algorithm, which unfortunately can exhibit oscillation in the nonconvex setting. In this paper, we introduce a "smoothing" scheme that can be combined with GDA to stabilize the oscillation and ensure convergence to a stationary solution. We prove that the stabilized GDA algorithm achieves an $O(1/ε^2)$ iteration complexity for minimizing the pointwise maximum of a finite collection of nonconvex functions. Moreover, the smoothed GDA algorithm achieves an $O(1/ε^4)$ iteration complexity for general nonconvex-concave problems. Extensions of this stabilized GDA algorithm to the multi-block case are also presented. To the best of our knowledge, this is the first algorithm to achieve an $O(1/ε^2)$ iteration complexity for a class of nonconvex-concave problems. We illustrate the practical efficiency of the stabilized GDA algorithm on robust training.
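To make the smoothing idea concrete, below is a minimal Python/NumPy sketch of one smoothed GDA iteration under common assumptions. The gradient oracles `grad_x`/`grad_y`, the projection `project_y`, and the parameter values `c`, `alpha`, `beta`, `p` are illustrative placeholders, not the paper's exact update rules or constants; the essential ingredient is an auxiliary anchor `z` that adds a proximal term to the primal update and then slowly tracks the primal iterate.

```python
import numpy as np

def smoothed_gda(grad_x, grad_y, project_y, x0, y0,
                 c=0.01, alpha=0.01, beta=0.1, p=10.0, iters=1000):
    """Sketch of a smoothed GDA loop (assumed form, not the paper's exact scheme).

    Rather than running GDA on f(x, y) directly, run it on the regularized
    surrogate K(x, z; y) = f(x, y) + (p/2) * ||x - z||^2, where z is a
    slowly moving anchor. grad_x(x, y) and grad_y(x, y) are gradient
    oracles for f; project_y projects onto the (convex) feasible set of y.
    """
    x, y = x0.copy(), y0.copy()
    z = x0.copy()  # smoothing anchor
    for _ in range(iters):
        # Primal descent on the smoothed objective: the p * (x - z) term
        # pulls x toward the anchor and damps oscillation.
        x = x - c * (grad_x(x, y) + p * (x - z))
        # Dual ascent on f, followed by projection onto the feasible set.
        y = project_y(y + alpha * grad_y(x, y))
        # Move the anchor a small step toward the current primal iterate.
        z = z + beta * (x - z)
    return x, y
```

Note that with `beta = 1` the anchor coincides with the current iterate, so the proximal term vanishes at the next step and the scheme degenerates to plain GDA; a smaller `beta` keeps the anchor slowly varying, which is what provides the stabilization described in the abstract.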