The leave-one-covariate-out conditional randomization test

Paper title

The leave-one-covariate-out conditional randomization test

Authors

Eugene Katsevich, Aaditya Ramdas

Abstract

Conditional independence testing is an important problem, yet provably hard without assumptions. One of the assumptions that has become popular of late is called "model-X", where we assume we know the joint distribution of the covariates, but assume nothing about the conditional distribution of the outcome given the covariates. Knockoffs is a popular methodology associated with this framework, but it suffers from two main drawbacks: only one-bit $p$-values are available for inference on each variable, and the method is randomized with significant variability across runs in practice. The conditional randomization test (CRT) is thought to be the "right" solution under model-X, but usually viewed as computationally inefficient. This paper proposes a computationally efficient leave-one-covariate-out (LOCO) CRT that addresses both drawbacks of knockoffs. LOCO CRT produces valid $p$-values that can be used to control the familywise error rate, and has nearly zero algorithmic variability. For L1 regularized M-estimators, we develop an even faster variant called L1ME CRT, which reuses computation by leveraging a novel observation about the stability of the cross-validated lasso to removing inactive variables. Last, for multivariate Gaussian covariates, we present a closed form expression for the LOCO CRT $p$-value, thus completely eliminating resampling in this important special case.
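To make the resampling idea in the abstract concrete, here is a minimal sketch of a generic conditional randomization test for mean-zero multivariate Gaussian covariates with known covariance. It is not the paper's LOCO CRT or L1ME CRT, and it does not use the closed-form p-value the abstract mentions. The function names `crt_pvalue` and `abs_residual_cov`, the choice of test statistic, and the default number of resamples `B` are all assumptions made for illustration.

```python
import numpy as np


def crt_pvalue(X, y, j, Sigma, stat, B=200, rng=None):
    """Generic CRT p-value for H0: X_j is independent of y given X_{-j}.

    Assumes the rows of X are i.i.d. N(0, Sigma) with Sigma known,
    which is one instance of the model-X assumption.
    """
    rng = np.random.default_rng(rng)
    n, p = X.shape
    rest = np.delete(np.arange(p), j)

    # Conditional law of X_j given X_{-j} under N(0, Sigma):
    # mean = X_{-j} @ beta, variance = Sigma_jj - Sigma_{j,-j} @ beta.
    beta = np.linalg.solve(Sigma[np.ix_(rest, rest)], Sigma[rest, j])
    cond_mean = X[:, rest] @ beta
    cond_sd = np.sqrt(Sigma[j, j] - Sigma[j, rest] @ beta)

    t_obs = stat(X, y, j)
    exceed = 0
    X_tilde = X.copy()
    for _ in range(B):
        # Redraw column j from its conditional distribution; y is untouched.
        X_tilde[:, j] = cond_mean + cond_sd * rng.standard_normal(n)
        exceed += int(stat(X_tilde, y, j) >= t_obs)

    # The +1 terms count the observed statistic among the resamples,
    # which keeps the p-value valid for finite B.
    return (1 + exceed) / (B + 1)


def abs_residual_cov(X, y, j):
    """Illustrative statistic: |x_j . r|, where r is the residual of y
    after a least-squares fit on the other columns."""
    rest = np.delete(np.arange(X.shape[1]), j)
    coef, *_ = np.linalg.lstsq(X[:, rest], y, rcond=None)
    return abs(X[:, j] @ (y - X[:, rest] @ coef))
```

Calling `crt_pvalue(X, y, j, Sigma, abs_residual_cov)` returns a value in `[1/(B+1), 1]`. In this sketch the least-squares fit inside the statistic does not depend on the resampled column, yet it is recomputed on every draw; repeated work of this kind is part of why a naive CRT is usually regarded as computationally expensive.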
