Generalization Bounds and Representation Learning for Estimation of Potential Outcomes and Causal Effects

Paper Title

Generalization Bounds and Representation Learning for Estimation of Potential Outcomes and Causal Effects

Authors

Johansson, Fredrik D., Shalit, Uri, Kallus, Nathan, Sontag, David

Abstract

Practitioners in diverse fields such as healthcare, economics and education are eager to apply machine learning to improve decision making. The cost and impracticality of performing experiments and a recent monumental increase in electronic record keeping have brought attention to the problem of evaluating decisions based on non-experimental observational data. This is the setting of this work. In particular, we study estimation of individual-level causal effects, such as a single patient's response to alternative medication, from recorded contexts, decisions and outcomes. We give generalization bounds on the error in estimated effects based on distance measures between groups receiving different treatments, allowing for sample re-weighting. We provide conditions under which our bound is tight and show how it relates to results for unsupervised domain adaptation. Led by our theoretical results, we devise representation learning algorithms that minimize our bound, by regularizing the representation's induced treatment group distance, and encourage sharing of information between treatment groups. We extend these algorithms to simultaneously learn a weighted representation to further reduce treatment group distances. Finally, an experimental evaluation on real and synthetic data shows the value of our proposed representation architecture and regularization scheme.
