Paper Title
Bias Variance Tradeoff in Analysis of Online Controlled Experiments
Paper Authors
Abstract
Many organizations utilize large-scale online controlled experiments (OCEs) to accelerate innovation. High statistical power is critical for accurately detecting small differences between control and treatment, as even small changes in key metrics can be worth millions of dollars or indicate dissatisfaction among a very large number of users. For large-scale OCEs, the duration is typically short (e.g. two weeks) to expedite changes and improvements to the product. In this paper, we examine two common approaches for analyzing usage data collected from users within the time window of an experiment, which can differ in accuracy and power. The open approach includes all relevant usage data from all active users for the entire duration of the experiment. The bounded approach includes data from a fixed observation period for each user (e.g. seven days after exposure), starting from the first time the user became active in the experiment window.
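To make the distinction between the two approaches concrete, the following is a minimal sketch of how each one might select events for analysis. It is not taken from the paper: the `(user_id, timestamp)` event representation, the function names `open_window` and `bounded_window`, and the sample data are all hypothetical. It also assumes that a user who first becomes active less than seven days before the experiment ends is still included, with a truncated observation period; the paper may handle such users differently.

```python
from datetime import datetime, timedelta

def open_window(events, start, end):
    """Open approach (sketch): keep every event in the experiment window [start, end)."""
    return [(user, ts) for user, ts in events if start <= ts < end]

def bounded_window(events, start, end, observation=timedelta(days=7)):
    """Bounded approach (sketch): keep each user's events from a fixed
    observation period that starts at the user's first activity in the window."""
    in_window = open_window(events, start, end)
    # First time each user is seen inside the experiment window.
    first_seen = {}
    for user, ts in in_window:
        if user not in first_seen or ts < first_seen[user]:
            first_seen[user] = ts
    # Assumption: late-arriving users are kept with a truncated period.
    return [(user, ts) for user, ts in in_window
            if ts < first_seen[user] + observation]

# Hypothetical two-week experiment and usage events.
start, end = datetime(2024, 1, 1), datetime(2024, 1, 15)
events = [
    ("a", datetime(2024, 1, 1)),
    ("a", datetime(2024, 1, 5)),
    ("a", datetime(2024, 1, 10)),  # more than 7 days after user a's first activity
    ("b", datetime(2024, 1, 10)),
    ("b", datetime(2024, 1, 14)),
]

open_events = open_window(events, start, end)
bounded_events = bounded_window(events, start, end)
```

In this toy data the open approach keeps all five events, while the bounded approach drops user `a`'s event on January 10 because it falls outside that user's seven-day observation period.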