Paper title
Optimal nonparametric testing of Missing Completely At Random, and its connections to compatibility
Paper authors
Paper abstract
Given a set of incomplete observations, we study the nonparametric problem of testing whether data are Missing Completely At Random (MCAR). Our first contribution is to characterise precisely the set of alternatives that can be distinguished from the MCAR null hypothesis. This reveals interesting and novel links to the theory of Fréchet classes (in particular, compatible distributions) and linear programming, which allow us to propose MCAR tests that are consistent against all detectable alternatives. We define an incompatibility index as a natural measure of ease of detectability, establish its key properties, and show how it can be computed exactly in some cases and bounded in others. Moreover, we prove that our tests can attain the minimax separation rate according to this measure, up to logarithmic factors. Our methodology does not require any complete cases to be effective, and is available in the R package MCARtest.