Paper Title
Bayesian Meta-Prior Learning Using Empirical Bayes
Paper Authors
Paper Abstract
Adding domain knowledge to a learning system is known to improve results. In multi-parameter Bayesian frameworks, such knowledge is incorporated as a prior. On the other hand, various model parameters can have different learning rates in real-world problems, especially with skewed data. Two often-faced challenges in Operations Management and Management Science applications are the absence of informative priors, and the inability to control parameter learning rates. In this study, we propose a hierarchical Empirical Bayes approach that addresses both challenges, and that can generalize to any Bayesian framework. Our method learns empirical meta-priors from the data itself and uses them to decouple the learning rates of first-order and second-order features (or any other given feature grouping) in a Generalized Linear Model. As the first-order features are likely to have a more pronounced effect on the outcome, focusing on learning first-order weights first is likely to improve performance and convergence time. Our Empirical Bayes method clamps features in each group together and uses the deployed model's observed data to empirically compute a hierarchical prior in hindsight. We report theoretical results for the unbiasedness, strong consistency, and optimal frequentist cumulative regret properties of our meta-prior variance estimator. We apply our method to a standard supervised learning optimization problem, as well as an online combinatorial optimization problem in a contextual bandit setting implemented in an Amazon production system. Both during simulations and live experiments, our method shows marked improvements, especially in cases of small traffic. Our findings are promising, as optimizing over sparse data is often a challenge.
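
As a rough illustration of the idea described in the abstract (group the GLM weights, then compute a group-level prior in hindsight from the deployed model's estimates), the sketch below fits one Gaussian meta-prior per feature group using a standard empirical-Bayes method-of-moments estimator. It is a minimal sketch under stated assumptions, not the paper's implementation: the diagonal-Gaussian posterior, the particular variance estimator, and the names empirical_meta_prior, first_order, and second_order are illustrative choices introduced here.

```python
import numpy as np

def empirical_meta_prior(post_means, post_vars):
    """Method-of-moments empirical-Bayes estimate of a group-level
    (meta) prior N(mu, tau^2) from per-weight posterior estimates.

    post_means, post_vars: the deployed model's posterior means and
    variances for the weights of ONE feature group.
    """
    mu_hat = post_means.mean()
    # Spread of the point estimates across the group, minus the average
    # within-weight uncertainty; floored so the variance stays positive.
    tau2_hat = max(post_means.var(ddof=1) - post_vars.mean(), 1e-6)
    return mu_hat, tau2_hat

# Hypothetical posterior summaries from an already-deployed GLM, with an
# illustrative split into first-order and second-order feature groups.
rng = np.random.default_rng(0)
n_first, n_second = 10, 45
post_means = rng.normal(0.0, 1.0, n_first + n_second)
post_vars = rng.uniform(0.05, 0.2, n_first + n_second)

groups = {
    "first_order": np.arange(0, n_first),
    "second_order": np.arange(n_first, n_first + n_second),
}

# One meta-prior per group: each group gets its own (mu, tau^2), which is
# what lets the groups adapt at different effective rates when the GLM is
# retrained or updated with these priors.
meta_priors = {
    name: empirical_meta_prior(post_means[idx], post_vars[idx])
    for name, idx in groups.items()
}
print(meta_priors)
```

Because each group receives its own meta-prior variance, a group whose weights the deployed model has already pinned down tightly is shrunk more strongly toward its group mean than a group that is still uncertain, which is one simple way to realize the decoupled per-group learning rates referred to in the abstract.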