Invariant Representation Learning for Treatment Effect Estimation

Paper Title

Invariant Representation Learning for Treatment Effect Estimation

Authors

Claudia Shi, Victor Veitch, David Blei

Abstract

The defining challenge for causal inference from observational data is the presence of 'confounders', covariates that affect both treatment assignment and the outcome. To address this challenge, practitioners collect and adjust for the covariates, hoping that they adequately correct for confounding. However, including every observed covariate in the adjustment runs the risk of including 'bad controls', variables that induce bias when they are conditioned on. The problem is that we do not always know which variables in the covariate set are safe to adjust for and which are not. To address this problem, we develop Nearly Invariant Causal Estimation (NICE). NICE uses invariant risk minimization (IRM) [Arj19] to learn a representation of the covariates that, under some assumptions, strips out bad controls but preserves sufficient information to adjust for confounding. Adjusting for the learned representation, rather than the covariates themselves, avoids the induced bias and provides valid causal inferences. We evaluate NICE on both synthetic and semi-synthetic data. When the covariates contain unknown collider variables and other bad controls, NICE performs better than adjusting for all the covariates.
