An industrial case study of automatically identifying performance regression-causes

2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR) Pub Date : 2014-05-31 DOI:10.1145/2597073.2597092

Thanh H. D. Nguyen, M. Nagappan, A. Hassan, Mohamed N. Nasser, P. Flora

Pages: 232-241

Citations: 56

Abstract

Even the addition of a single extra field or control statement in the source code of a large-scale software system can lead to performance regressions. Such regressions can considerably degrade the user experience. Working closely with the members of a performance engineering team, we observe that they face a major challenge in identifying the cause of a performance regression given the large number of performance counters (e.g., memory and CPU usage) that must be analyzed. We propose the mining of a regression-causes repository (where the results of performance tests and causes of past regressions are stored) to assist the performance team in identifying the regression-cause of a newly-identified regression. We evaluate our approach on an open-source system, and a commercial system for which the team is responsible. The results show that our approach can accurately (up to 80% accuracy) identify performance regression-causes using a reasonably small number of historical test runs (sometimes as few as four test runs per regression-cause).
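The abstract does not say how the regression-causes repository is mined, so the following is only a minimal sketch of one plausible reading, assuming a nearest-neighbour lookup over counter deviations. The repository contents, the counter names (`cpu`, `memory`), the cause labels, and the functions `distance` and `suggest_cause` are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import sqrt

# Hypothetical regression-causes repository: each past test run is stored as
# per-counter deviations from a baseline, labelled with its known cause.
# All values are made up for illustration; four runs per cause.
REPOSITORY = [
    ({"cpu": 0.30, "memory": 0.02}, "extra control statement"),
    ({"cpu": 0.25, "memory": 0.01}, "extra control statement"),
    ({"cpu": 0.35, "memory": 0.03}, "extra control statement"),
    ({"cpu": 0.28, "memory": 0.00}, "extra control statement"),
    ({"cpu": 0.02, "memory": 0.40}, "extra field"),
    ({"cpu": 0.01, "memory": 0.35}, "extra field"),
    ({"cpu": 0.03, "memory": 0.45}, "extra field"),
    ({"cpu": 0.00, "memory": 0.38}, "extra field"),
]


def distance(a, b):
    """Euclidean distance between two counter-deviation dictionaries.

    Counters missing from one run are treated as zero deviation.
    """
    keys = set(a) | set(b)
    return sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))


def suggest_cause(new_run, repository, k=3):
    """Return the most common cause among the k most similar past runs."""
    nearest = sorted(repository, key=lambda entry: distance(new_run, entry[0]))[:k]
    votes = Counter(cause for _, cause in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # A newly identified regression whose CPU counter deviates most.
    print(suggest_cause({"cpu": 0.29, "memory": 0.02}, REPOSITORY))
```

A lookup of this kind only needs a handful of labelled runs per cause to return a suggestion, which is the setting the abstract describes. The accuracy figures reported in the abstract apply to the authors' approach, not to this sketch.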


