汤普森抽样，用于高维稀疏线性上下文匪徒

论文标题

汤普森抽样，用于高维稀疏线性上下文匪徒

Thompson Sampling for High-Dimensional Sparse Linear Contextual Bandits

论文作者

Chakraborty, Sunrit, Roy, Saptarshi, Tewari, Ambuj

论文摘要

我们考虑具有高维特征的随机线性上下文匪徒问题。我们使用特殊类别的稀疏性先验（例如，尖峰和slab）对汤普森采样算法进行分析，以模拟未知参数，并在预期的累积遗憾中提供了几乎最佳的上限。据我们所知，这是第一批提供汤普森在高维且稀疏的上下文土匪中采样的理论保证的作品。为了更快的计算，我们使用变异推理而不是马尔可夫链蒙特卡洛（MCMC）来近似后验分布。广泛的模拟表明，我们提出的算法对现有算法的性能提高了。

We consider the stochastic linear contextual bandit problem with high-dimensional features. We analyze the Thompson sampling algorithm using special classes of sparsity-inducing priors (e.g., spike-and-slab) to model the unknown parameter and provide a nearly optimal upper bound on the expected cumulative regret. To the best of our knowledge, this is the first work that provides theoretical guarantees of Thompson sampling in high-dimensional and sparse contextual bandits. For faster computation, we use variational inference instead of Markov Chain Monte Carlo (MCMC) to approximate the posterior distribution. Extensive simulations demonstrate the improved performance of our proposed algorithm over existing ones.

下载PDF全文

下载文献需遵守相关版权规定

论文标题