论文标题
朝着纳什平衡的进化
Evolution toward a Nash equilibrium
论文作者
论文摘要
在本文中,我们研究了树篱的动态行为,这是一种众所周知的理论机器学习和算法游戏理论的算法。已知迭代性篱笆生成的经验平均值(算术平均值)在零和游戏中会融合到最小值。我们将其推广到对称bimatrix游戏中的经验平均值与NASH平衡的融合(即Bimatrix游戏,每个玩家的回报矩阵是对方的转置),即平均序列的每个限制点是$ $ im $ im-ip-ip-ip-approximate symetric sembore intemmetric sequilibium intocy not for not and-approximate symetric sequilibium inot for dessymetric sequilibium complobe $ $ $ $ $ $ $ $ $。我们的分析产生了对称平衡完全多项式时间近似方案,这意味着P = PPAD。
In this paper, we study the dynamic behavior of Hedge, a well-known algorithm in theoretical machine learning and algorithmic game theory. The empirical average (arithmetic mean) of the iterates Hedge generates is known to converge to a minimax equilibrium in zero-sum games. We generalize that result to show convergence of the empirical average to Nash equilibrium in symmetric bimatrix games (that is bimatrix games where the payoff matrix of each player is the transpose of that of the other) in the sense that every limit point of the sequence of averages is an $ε$-approximate symmetric equilibrium strategy for any desirable $ε$. Our analysis gives rise to a symmetric equilibrium fully polynomial-time approximation scheme, implying P = PPAD.