Paper Title
Rapid Regression Detection in Software Deployments through Sequential Testing
Paper Authors
Paper Abstract
The practice of continuous deployment has enabled companies to reduce time-to-market by increasing the rate at which software can be deployed. However, deploying more frequently bears the risk that occasionally defective changes are released. For Internet companies, this has the potential to degrade the user experience and increase user abandonment. Therefore, quality control gates are an important component of the software delivery process. These are used to build confidence in the reliability of a release or change. Towards this end, a common approach is to perform a canary test to evaluate new software under production workloads. Detecting defects as early as possible is necessary to reduce exposure and to provide immediate feedback to the developer. We present a statistical framework for rapidly detecting regressions in software deployments. Our approach is based on sequential tests of stochastic order and of equality in distribution. This enables canary tests to be continuously monitored, permitting regressions to be rapidly detected while strictly controlling the false detection probability throughout. The utility of this approach is demonstrated based on two case studies at Netflix.
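To make the idea of continuously monitoring a canary test concrete, the following is a minimal sketch of a sequential test of equality in distribution. It is not the method of the paper, whose details the abstract does not give. As an assumption for illustration, it compares the empirical distributions of a baseline and a canary metric with a two-sample Kolmogorov-Smirnov statistic, uses the Dvoretzky-Kiefer-Wolfowitz inequality to set a rejection threshold, and spends the error budget across looks as alpha_k = 6 * alpha / (pi^2 * k^2), so that by a union bound the probability of any false detection over all looks stays at or below alpha. The function names, the batch size, and the default alpha are hypothetical.

```python
import math
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Largest absolute gap between the empirical CDFs of samples a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    # Both empirical CDFs are step functions, so the largest gap
    # is attained at one of the observed data points.
    for x in a_sorted + b_sorted:
        fa = bisect_right(a_sorted, x) / len(a_sorted)
        fb = bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(fa - fb))
    return gap


def dkw_radius(n, delta):
    """DKW bound: P(sup |F_n - F| > radius) <= delta for n i.i.d. samples."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def sequential_canary_check(baseline, canary, alpha=0.05, batch=50):
    """Check for a distribution difference after every `batch` samples per arm.

    Returns the per-arm sample count at which a difference is flagged,
    or None if no look rejects. This is a conservative illustration
    (assumed design), not the test proposed in the paper.
    """
    usable = min(len(baseline), len(canary))
    look = 0
    for end in range(batch, usable + 1, batch):
        look += 1
        # Error budget for this look; the budgets sum to alpha over all looks.
        alpha_k = alpha * 6.0 / (math.pi ** 2 * look ** 2)
        # Each arm gets half of the look's budget; under equal distributions
        # the KS gap is at most the sum of the two DKW radii.
        threshold = 2.0 * dkw_radius(end, alpha_k / 2.0)
        if ks_statistic(baseline[:end], canary[:end]) > threshold:
            return end
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical latency-like metric: the canary is shifted upward.
    base = [rng.gauss(100.0, 10.0) for _ in range(2000)]
    can = [rng.gauss(110.0, 10.0) for _ in range(2000)]
    print("flagged after", sequential_canary_check(base, can), "samples per arm")
```

Because the threshold holds simultaneously at every look, the check can be rerun as data arrives and stopped at the first rejection. The price of this simple construction is conservatism: it needs more samples to flag a regression than a purpose-built sequential test would.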