论文标题

贝叶斯固定预算最佳武器标识

Bayesian Fixed-Budget Best-Arm Identification

论文作者

Atsidakou, Alexia, Katariya, Sumeet, Sanghavi, Sujay, Kveton, Branislav

论文摘要

固定预算最佳武器识别(BAI)是一个强盗问题,其中代理在固定的观测预算中最大程度地识别最佳臂的可能性。在这项工作中,我们在贝叶斯环境中研究了这个问题。我们提出了一种贝叶斯消除算法,并以误识最佳臂的可能性得出了上限。边界反映了先前的质量,并且是这种情况下的第一个分布依赖性界限。我们使用类似于频繁的参数证明了它,我们在其中携带了先前的内容,然后在最后整合了强盗实例。我们还提供了$ 2 $武装的贝叶斯强盗的错误识别可能性的下限,并表明我们的上限(几乎)与任何预算相匹配。我们的实验表明,贝叶斯消除优于常见的方法,并且与在我们的环境中无法保证的最先进的贝叶斯算法竞争。

Fixed-budget best-arm identification (BAI) is a bandit problem where the agent maximizes the probability of identifying the optimal arm within a fixed budget of observations. In this work, we study this problem in the Bayesian setting. We propose a Bayesian elimination algorithm and derive an upper bound on its probability of misidentifying the optimal arm. The bound reflects the quality of the prior and is the first distribution-dependent bound in this setting. We prove it using a frequentist-like argument, where we carry the prior through, and then integrate out the bandit instance at the end. We also provide a lower bound on the probability of misidentification in a $2$-armed Bayesian bandit and show that our upper bound (almost) matches it for any budget. Our experiments show that Bayesian elimination is superior to frequentist methods and competitive with the state-of-the-art Bayesian algorithms that have no guarantees in our setting.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源