论文标题

精致的可接触式算法和应用全球成本最小化最小化的应用

Refined approachability algorithms and application to regret minimization with global costs

论文作者

Kwon, Joon

论文摘要

布莱克韦尔的可实现性是一个框架,两个球员(决策者和环境)与矢量值得的回报一起玩重复的游戏。决策者的目标是使平均收益融合到给定的名为“目标”集合。如果确实可以,则可以确定收敛性的简单算法。该抽象工具成功地用于在各种重复游戏中构建最佳策略,但在在线学习中也发现了一些应用。通过扩展(Abernethy等人,2011年)提出的一种方法,我们构建和分析了一类关注正规的领导者算法(FTRL),以实现Blackwell的可接近性,该算法能够最小化与目标集的欧几里得距离(在Blackwell的可靠性中通常是距离距离范围较高的距离范围)。这种灵活性使我们能够应用这些算法,以密切最大程度地减少各种在线学习问题的兴趣数量。特别是,对于以$ \ ell_p $全球成本的最小化后悔,我们获得了$ p $和尺寸$ d $的第一界界限。

Blackwell's approachability is a framework where two players, the Decision Maker and the Environment, play a repeated game with vector-valued payoffs. The goal of the Decision Maker is to make the average payoff converge to a given set called the target. When this is indeed possible, simple algorithms which guarantee the convergence are known. This abstract tool was successfully used for the construction of optimal strategies in various repeated games, but also found several applications in online learning. By extending an approach proposed by (Abernethy et al., 2011), we construct and analyze a class of Follow the Regularized Leader algorithms (FTRL) for Blackwell's approachability which are able to minimize not only the Euclidean distance to the target set (as it is often the case in the context of Blackwell's approachability) but a wide range of distance-like quantities. This flexibility enables us to apply these algorithms to closely minimize the quantity of interest in various online learning problems. In particular, for regret minimization with $\ell_p$ global costs, we obtain the first bounds with explicit dependence in $p$ and the dimension $d$.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源