Paper Title
Active Learning for Optimal Intervention Design in Causal Models
Authors
Abstract
Sequential experimental design to discover interventions that achieve a desired outcome is a key problem in various domains, including science, engineering and public policy. When the space of possible interventions is large, making an exhaustive search infeasible, experimental design strategies are needed. In this context, encoding the causal relationships between the variables, and thus the effect of interventions on the system, is critical for identifying desirable interventions more efficiently. Here, we develop a causal active learning strategy to identify interventions that are optimal, as measured by the discrepancy between the post-interventional mean of the distribution and a desired target mean. The approach employs a Bayesian update for the causal model and prioritizes interventions using a carefully designed, causally informed acquisition function. This acquisition function is evaluated in closed form, allowing for fast optimization. The resulting algorithms are theoretically grounded with information-theoretic bounds and provable consistency results for linear causal models with a known causal graph. We apply our approach to both synthetic data and single-cell transcriptomic data from Perturb-CITE-seq experiments to identify optimal perturbations that induce a specific cell state transition. The causally informed acquisition function generally outperforms existing criteria, allowing for optimal intervention design with fewer but carefully selected samples.
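To make the optimality criterion concrete, the following is a minimal sketch of the discrepancy between the post-interventional mean and a target mean in a linear causal model with a known graph. It is not the paper's algorithm: it assumes shift interventions (a vector `a` added to the structural equations), a weight matrix `B` that is already known rather than learned through Bayesian updates, and a squared Euclidean distance as the discrepancy. The function names and the toy three-variable chain are hypothetical and chosen only for illustration.

```python
import numpy as np


def post_interventional_mean(B, a):
    """Mean of x under x = B x + a + eps, with zero-mean noise eps.

    Assumption: B[i, j] is the weight of edge j -> i in a DAG, so
    (I - B) is invertible and the mean equals (I - B)^{-1} a.
    """
    d = B.shape[0]
    return np.linalg.solve(np.eye(d) - B, a)


def discrepancy(B, a, target_mean):
    """Squared Euclidean distance between post-interventional and target mean."""
    diff = post_interventional_mean(B, a) - target_mean
    return float(diff @ diff)


def optimal_shift(B, target_mean):
    """Shift that matches the target mean exactly when B is known."""
    d = B.shape[0]
    return (np.eye(d) - B) @ target_mean


# Toy chain x0 -> x1 -> x2 with illustrative (made-up) edge weights.
B = np.array([
    [0.0, 0.0, 0.0],
    [0.8, 0.0, 0.0],
    [0.0, -0.5, 0.0],
])
target_mean = np.array([1.0, 2.0, 0.0])

a_star = optimal_shift(B, target_mean)
print("shift:", a_star)
print("discrepancy:", discrepancy(B, a_star, target_mean))
```

When `B` is unknown, the shift cannot be computed directly in this way, which is where a sequential strategy that updates a posterior over the model and selects the next intervention with an acquisition function becomes relevant.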