Learning to Learn to Predict Performance Regressions in Production at Meta

Paper title

Learning to Learn to Predict Performance Regressions in Production at Meta

Authors

Beller, Moritz; Li, Hongyu; Nair, Vivek; Murali, Vijayaraghavan; Ahmad, Imad; Cito, Jürgen; Carlson, Drew; Aye, Ari; Dyer, Wes

Abstract

Catching and attributing code change-induced performance regressions in production is hard; predicting them beforehand, even harder. A primer on automatically learning to predict performance regressions in software, this article gives an account of the experiences we gained when researching and deploying an ML-based regression prediction pipeline at Meta. In this paper, we report on a comparative study with four ML models of increasing complexity, from (1) code-opaque, over (2) Bag of Words, (3) off-the-shelf Transformer-based, to (4) a bespoke Transformer-based model, coined SuperPerforator. Our investigation shows the inherent difficulty of the performance prediction problem, which is characterized by a large imbalance of benign to regressing changes. Our results also call into question the general applicability of Transformer-based architectures for performance prediction: an off-the-shelf CodeBERT-based approach had surprisingly poor performance; our highly customized SuperPerforator architecture initially achieved prediction performance that was just on par with simpler Bag of Words models, and only outperformed them for downstream use cases. This ability of SuperPerforator to transfer to an application with few learning examples afforded an opportunity to deploy it in practice at Meta: it can act as a pre-filter to sort out changes that are unlikely to introduce a regression, truncating the space of changes in which to search for a regression by up to 43%, a 45x improvement over a random baseline. To gain further insight into SuperPerforator, we explored it via a series of experiments computing counterfactual explanations. These highlight which parts of a code change the model deems important, thereby validating the learned black-box model.
