Paper Title

Causal Inference using Gaussian Processes with Structured Latent Confounders

Authors

Sam Witty, Kenta Takatsu, David Jensen, Vikash Mansinghka

Abstract

Latent confounders (unobserved variables that influence both treatment and outcome) can bias estimates of causal effects. In some cases, these confounders are shared across observations; for example, all students taking a course are influenced by the course's difficulty in addition to any educational interventions they receive individually. This paper shows how to semiparametrically model latent confounders that have this structure and thereby improve estimates of causal effects. The key innovations are a hierarchical Bayesian model, called Gaussian processes with structured latent confounders (GP-SLC), and a Monte Carlo inference algorithm for this model based on elliptical slice sampling. GP-SLC provides principled Bayesian uncertainty estimates of individual treatment effects with minimal assumptions about the functional forms relating confounders, covariates, treatment, and outcome. Finally, this paper shows that GP-SLC is competitive with or more accurate than widely used causal inference techniques on three benchmark datasets, including the Infant Health and Development Program and a dataset showing the effect of changing temperatures on state-wide energy consumption across New England.
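
For readers unfamiliar with elliptical slice sampling, the following is a minimal sketch of one generic update step for a latent vector with a zero-mean Gaussian prior. It is not the GP-SLC inference algorithm itself: the function name `elliptical_slice_step`, its arguments, and the use of a Cholesky factor `chol_sigma` of the prior covariance are assumptions made for illustration only.

```python
import numpy as np


def elliptical_slice_step(f, log_lik, chol_sigma, rng):
    """One elliptical slice sampling update (illustrative sketch).

    f          : current latent vector, assumed to have prior N(0, Sigma)
    log_lik    : callable returning the log-likelihood of a latent vector
    chol_sigma : lower Cholesky factor L with L @ L.T == Sigma
    rng        : numpy.random.Generator
    """
    # Auxiliary draw from the prior; together with f it defines an ellipse.
    nu = chol_sigma @ rng.standard_normal(f.shape[0])

    # Log-likelihood threshold for the slice; 1 - U lies in (0, 1].
    log_y = log_lik(f) + np.log(1.0 - rng.uniform())

    # Initial proposal angle and the bracket that will be shrunk.
    theta = rng.uniform(0.0, 2.0 * np.pi)
    theta_min, theta_max = theta - 2.0 * np.pi, theta

    while True:
        # Propose a point on the ellipse through f and nu.
        f_new = f * np.cos(theta) + nu * np.sin(theta)
        if log_lik(f_new) > log_y:
            return f_new
        # Rejected: shrink the bracket towards theta = 0 (the current state).
        if theta < 0.0:
            theta_min = theta
        else:
            theta_max = theta
        theta = rng.uniform(theta_min, theta_max)
```

Calling this step repeatedly, feeding each returned vector back in as `f`, yields a Markov chain over the latent vector. The update has no step-size parameter to tune, which is one common reason to pair it with Gaussian process priors.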
