Learning Robust Decision Policies from Observational Data

Paper Title

Learning Robust Decision Policies from Observational Data

Authors

Muhammad Osama, Dave Zachariah, Peter Stoica

Abstract

We address the problem of learning a decision policy from observational data of past decisions in contexts with features and associated outcomes. The past policy may be unknown, and in safety-critical applications, such as medical decision support, it is of interest to learn robust policies that reduce the risk of outcomes with high costs. In this paper, we develop a method for learning policies that reduce the tails of the cost distribution at a specified level and, moreover, provide a statistically valid bound on the cost of each decision. Building on recent results in conformal prediction, these properties hold in finite samples, even in scenarios with uneven or no overlap between features for different decisions in the observed data. The performance and statistical properties of the proposed method are illustrated using both real and synthetic data.
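
The abstract does not spell out the procedure, so the following is only a minimal sketch of generic split conformal prediction, the family of results the abstract cites, and not the authors' algorithm. It shows how a finite-sample upper bound on a cost can be obtained from held-out calibration data. The function name `conformal_upper_bound`, the synthetic cost model, and the perfect-mean predictor are all assumptions made for illustration. The sketch also assumes calibration and new data are exchangeable, so it does not handle the shift between a past policy and a newly learned one.

```python
import numpy as np

def conformal_upper_bound(cal_pred, cal_cost, new_pred, alpha=0.1):
    """One-sided split-conformal upper bound on cost (illustrative helper).

    Under exchangeability, a new cost falls at or below the returned bound
    with probability at least 1 - alpha.
    """
    # Signed residuals on the calibration set act as conformity scores.
    scores = np.asarray(cal_cost, dtype=float) - np.asarray(cal_pred, dtype=float)
    n = scores.size
    # Rank of the conformal quantile: ceil((n + 1) * (1 - alpha)).
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration points for this alpha: only a trivial bound is valid.
        return np.full(np.shape(new_pred), np.inf)
    q = np.sort(scores)[k - 1]
    return np.asarray(new_pred, dtype=float) + q

# Synthetic example (assumed data, not from the paper): heavy right tail in cost.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=2000)
cost = 2.0 * x + rng.exponential(1.0, size=2000)
pred = 2.0 * x  # hypothetical cost predictor

bound = conformal_upper_bound(pred[:1000], cost[:1000], pred[1000:], alpha=0.1)
coverage = np.mean(cost[1000:] <= bound)
print(f"empirical coverage: {coverage:.3f}")
```

With `alpha=0.1`, the empirical coverage on the held-out half should land near 0.9. The bound is one-sided because only high costs matter for the kind of tail risk the abstract describes.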
