交替的镜子下降用于受约束的最小游戏

论文标题

交替的镜子下降用于受约束的最小游戏

Alternating Mirror Descent for Constrained Min-Max Games

论文作者

Wibisono, Andre, Tao, Molei, Piliouras, Georgios

论文摘要

在本文中，我们研究了具有约束策略空间的两人双线零和游戏。这种约束的自然发生的一个实例是使用混合策略时，这与概率单纯限制相对应。我们提出和分析交替的镜像下降算法，其中每个玩家都会轮流采取镜子下降算法采取行动，以进行约束优化。我们将交替的镜下下降解释为双重空间中偏斜梯度流的交替离散化，并使用凸优化和修改的能量功能中的工具来建立$ O（k^{ - 2/3}）$在$ k $迭代后的平均遗憾。与同时版本的镜子下降算法相比，这可以定量验证算法的更好行为，该算法的同时版本可以发散并产生$ O（k^{ - 1/2}）$平均遗憾。在不受约束的特殊情况下，我们的结果恢复了零和游戏的交替梯度下降算法的行为，该算法在（Bailey等，Colt 2020）中进行了研究。

In this paper we study two-player bilinear zero-sum games with constrained strategy spaces. An instance of natural occurrences of such constraints is when mixed strategies are used, which correspond to a probability simplex constraint. We propose and analyze the alternating mirror descent algorithm, in which each player takes turns to take action following the mirror descent algorithm for constrained optimization. We interpret alternating mirror descent as an alternating discretization of a skew-gradient flow in the dual space, and use tools from convex optimization and modified energy function to establish an $O(K^{-2/3})$ bound on its average regret after $K$ iterations. This quantitatively verifies the algorithm's better behavior than the simultaneous version of mirror descent algorithm, which is known to diverge and yields an $O(K^{-1/2})$ average regret bound. In the special case of an unconstrained setting, our results recover the behavior of alternating gradient descent algorithm for zero-sum games which was studied in (Bailey et al., COLT 2020).

下载PDF全文

下载文献需遵守相关版权规定

论文标题