Causal Regularization: On the trade-off between in-sample risk and out-of-sample risk guarantees

Paper title

Causal Regularization: On the trade-off between in-sample risk and out-of-sample risk guarantees

Authors

Lucas Kania, Ernst Wit

Abstract

In recent decades, a number of ways of dealing with causality in practice, such as propensity score matching, the PC algorithm and invariant causal prediction, have been introduced. Besides its interpretational appeal, the causal model provides the best out-of-sample prediction guarantees. In this paper, we study the identification of causal-like models from in-sample data that provide out-of-sample risk guarantees when predicting a target variable from a set of covariates. Whereas ordinary least squares provides the best in-sample risk with limited out-of-sample guarantees, causal models have the best out-of-sample guarantees but achieve an inferior in-sample risk. By defining a trade-off of these properties, we introduce $\textit{causal regularization}$. As the regularization is increased, it provides estimators whose risk is more stable across sub-samples at the cost of increasing their overall in-sample risk. The increased risk stability is shown to lead to out-of-sample risk guarantees. We provide finite sample risk bounds for all models and prove the adequacy of cross-validation for attaining these bounds.
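The trade-off described in the abstract can be illustrated with a small toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the data-generating process (a hidden confounder `h`, two sub-samples that differ only in the covariate noise scale), the penalty form (pooled risk plus `gamma` times the absolute risk gap between the two sub-samples), the grid search, and all function names are hypothetical. The paper's own estimator and bounds may be defined differently.

```python
import random


def make_env(n, shift, rng, beta=1.0):
    """Toy sub-sample: hidden confounder h drives both x and y (assumed setup)."""
    data = []
    for _ in range(n):
        h = rng.gauss(0, 1)                    # unobserved confounder
        x = h + shift * rng.gauss(0, 1)        # covariate; noise scale differs per sub-sample
        y = beta * x + h + rng.gauss(0, 1)     # target; beta plays the role of the causal coefficient
        data.append((x, y))
    return data


def risk(b, data):
    """Mean squared error of the linear predictor y ~ b * x on one sub-sample."""
    return sum((y - b * x) ** 2 for x, y in data) / len(data)


def fit(gamma, env_a, env_b, grid):
    """Grid-search minimizer of pooled risk + gamma * |risk gap| (illustrative penalty)."""
    def objective(b):
        ra, rb = risk(b, env_a), risk(b, env_b)
        return 0.5 * (ra + rb) + gamma * abs(ra - rb)
    return min(grid, key=objective)


rng = random.Random(0)
env_a = make_env(500, 1.0, rng)
env_b = make_env(500, 3.0, rng)
grid = [i / 200 for i in range(401)]           # candidate coefficients in [0, 2]

results = []
for gamma in (0.0, 0.5, 2.0, 10.0):
    b = fit(gamma, env_a, env_b, grid)
    ra, rb = risk(b, env_a), risk(b, env_b)
    results.append((gamma, b, 0.5 * (ra + rb), abs(ra - rb)))

for gamma, b, pooled, gap in results:
    print(f"gamma={gamma:5.1f}  b={b:.3f}  pooled risk={pooled:.3f}  risk gap={gap:.3f}")
```

With `gamma = 0` the fit is ordinary least squares on the pooled data. As `gamma` grows, the selected coefficient gives up some pooled in-sample risk in exchange for a smaller risk gap between the two sub-samples, which is the qualitative behaviour the abstract attributes to increasing regularization. In this toy setup the population risk gap vanishes at the causal coefficient, so large `gamma` pushes the estimate toward it.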
