论文标题

准Newton的Quasi-Newton步骤,用于有效的在线Exp-Concave优化

Quasi-Newton Steps for Efficient Online Exp-Concave Optimization

论文作者

Mhammedi, Zakaria, Gatmiry, Khashayar

论文摘要

本文的目的是为在线和随机Exp-Concave优化设置设计计算效率和最佳算法。这些设置的典型算法,例如在线牛顿步骤(ONS),可以保证$ O(d \ ln t)$在$ t $ rounds之后遗憾的是,其中$ d $是可行设置的尺寸。但是,这种算法只要在可行的集合之外进行迭代时执行所谓的广义预测。这种广义预测需要$ω(d^3)$算术操作,即使对于像欧几里得球这样的简单套装,在最差的案例中,$ d^3 t $ of订单$ d^3 t $的总运行时间。在本文中,我们通过使用自我符合障碍作为正规机来计算牛顿步骤来辅助普遍预测。这样可以确保迭代始终在可行的集合中,而无需预测。这种方法仍需要在每个步骤中计算屏障的逆逆倒数。但是,使用牛顿步骤的稳定性属性,我们表明,在大多数回合中,可以通过Taylor扩展有效地近似Hessians的倒数,从而导致$ O(d^2 t +d^ω\ sqrt {t} {t} {t})$总计算复杂性,其中$ω$是$ω$的矩阵乘积的成熟度。在随机环境中,我们表明这转化为$ O(d^3/ε)$用于查找$ε$ -suboptimal点的计算复杂性,回答了Koren 2013的一个开放问题。我们首先在可行的情况下显示了这些新的结果,其中可行的套装是欧几里得球。然后,为了转到一般凸组,我们使用降低到欧几里得球对在线凸优化的优化。我们的最终算法可以看作是ONS的更有效版本。

The aim of this paper is to design computationally-efficient and optimal algorithms for the online and stochastic exp-concave optimization settings. Typical algorithms for these settings, such as the Online Newton Step (ONS), can guarantee a $O(d\ln T)$ bound on their regret after $T$ rounds, where $d$ is the dimension of the feasible set. However, such algorithms perform so-called generalized projections whenever their iterates step outside the feasible set. Such generalized projections require $Ω(d^3)$ arithmetic operations even for simple sets such a Euclidean ball, making the total runtime of ONS of order $d^3 T$ after $T$ rounds, in the worst-case. In this paper, we side-step generalized projections by using a self-concordant barrier as a regularizer to compute the Newton steps. This ensures that the iterates are always within the feasible set without requiring projections. This approach still requires the computation of the inverse of the Hessian of the barrier at every step. However, using the stability properties of the Newton steps, we show that the inverse of the Hessians can be efficiently approximated via Taylor expansions for most rounds, resulting in a $O(d^2 T +d^ω\sqrt{T})$ total computational complexity, where $ω$ is the exponent of matrix multiplication. In the stochastic setting, we show that this translates into a $O(d^3/ε)$ computational complexity for finding an $ε$-suboptimal point, answering an open question by Koren 2013. We first show these new results for the simple case where the feasible set is a Euclidean ball. Then, to move to general convex set, we use a reduction to Online Convex Optimization over the Euclidean ball. Our final algorithm can be viewed as a more efficient version of ONS.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源