论文标题
受到时变限制的惩罚ftrl
Penalised FTRL With Time-Varying Constraints
论文作者
论文摘要
在本文中,我们通过自适应惩罚扩展了经典的遵循规范化领导者(FTRL)算法,以包含时间变化的约束。我们建立了足够的条件,使提议的受惩罚Ftrl算法实现$ o(\ sqrt {t})$后悔和违反强的基准$ \ hat {x}^{max} _t $。缺乏对约束的先验知识,这可能是我们可以合理希望的最大基准集。我们的足够条件是必要的,从某种意义上说,当它们违反时,存在$ o(\ sqrt {t})$遗憾和违规行为的例子。与现有的最佳原始二算法相比,受惩罚的FTRL实质上扩大了$ o(\ sqrt {t})$遗憾和违规绩效的问题类别。
In this paper we extend the classical Follow-The-Regularized-Leader (FTRL) algorithm to encompass time-varying constraints, through adaptive penalization. We establish sufficient conditions for the proposed Penalized FTRL algorithm to achieve $O(\sqrt{t})$ regret and violation with respect to strong benchmark $\hat{X}^{max}_t$. Lacking prior knowledge of the constraints, this is probably the largest benchmark set that we can reasonably hope for. Our sufficient conditions are necessary in the sense that when they are violated there exist examples where $O(\sqrt{t})$ regret and violation is not achieved. Compared to the best existing primal-dual algorithms, Penalized FTRL substantially extends the class of problems for which $O(\sqrt{t})$ regret and violation performance is achievable.