Paper Title

Efficient Stackelberg Strategies for Finitely Repeated Games

Paper Authors

Natalie Collina, Eshwar Ram Arunachaleswaran, Michael Kearns

Paper Abstract

We study Stackelberg equilibria in finitely repeated games, where the leader commits to a strategy that picks actions in each round and can be adaptive to the history of play (i.e., they commit to an algorithm). In particular, we study static repeated games with no discounting. We give efficient algorithms for finding approximate Stackelberg equilibria in this setting, along with rates of convergence depending on the time horizon $T$. In many cases, these algorithms allow the leader to do much better on average than they can in the single-round Stackelberg game. We give two algorithms: one computing strategies with an optimal $\frac{1}{T}$ rate at the expense of an exponential dependence on the number of actions, and another (randomized) approach computing strategies with no dependence on the number of actions but a worse dependence on $T$ of $\frac{1}{T^{0.25}}$. Both algorithms build upon a linear program to produce simple automata leader strategies and induce corresponding automata best responses for the follower. We complement these results by showing that approximating the Stackelberg value in three-player finite-horizon repeated games is computationally hard, via a reduction from balanced vertex cover.
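
The abstract benchmarks the repeated-game guarantees against the single-round Stackelberg value. For reference, below is a minimal sketch (our illustration, not code from the paper) of the standard LP-based computation of a one-shot Stackelberg equilibrium in a bimatrix game, which solves one linear program per candidate follower best response; the payoff matrices `A` and `B` in the example are assumed for illustration.

```python
import numpy as np
from scipy.optimize import linprog

def single_round_stackelberg(A, B):
    """Leader's optimal commitment in a one-shot bimatrix game.

    A[i, j]: leader payoff, B[i, j]: follower payoff, when the leader
    plays row i and the follower plays column j.
    Returns (leader value, leader mixed strategy, follower action).
    """
    m, n = A.shape
    best_val, best_x, best_j = -np.inf, None, None
    for j in range(n):  # guess column j as the follower's best response
        # Maximize x @ A[:, j] over leader mixed strategies x
        # (i.e., minimize -x @ A[:, j]), subject to j being a best
        # response: x @ B[:, j] >= x @ B[:, k] for every other column k.
        rows = [B[:, k] - B[:, j] for k in range(n) if k != j]
        res = linprog(
            c=-A[:, j],
            A_ub=np.vstack(rows) if rows else None,
            b_ub=np.zeros(len(rows)) if rows else None,
            A_eq=np.ones((1, m)), b_eq=[1.0],  # x is a distribution
            bounds=[(0, 1)] * m,
        )
        if res.success and -res.fun > best_val:
            best_val, best_x, best_j = -res.fun, res.x, j
    return best_val, best_x, best_j

# Toy game where commitment helps: the leader's Stackelberg value
# (11/3, mixing 2/3-1/3 to induce column 1) exceeds the payoff of 2
# the leader gets in the game's Nash equilibrium.
A = np.array([[2.0, 4.0], [1.0, 3.0]])  # leader payoffs (assumed)
B = np.array([[1.0, 0.0], [0.0, 2.0]])  # follower payoffs (assumed)
print(single_round_stackelberg(A, B))
```

The gap between this one-shot value and what is achievable over $T$ rounds is exactly what the paper's repeated-game algorithms exploit, with the convergence rates in $T$ stated above.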
