Characterizing the Prevalence, Distribution, and Duration of Stale Reviewer Recommendations

IF 6.5 1区计算机科学 Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING IEEE Transactions on Software Engineering Pub Date : 2024-07-03 DOI:10.1109/TSE.2024.3422369

Farshad Kazemi;Maxime Lamothe;Shane McIntosh

{"title":"Characterizing the Prevalence, Distribution, and Duration of Stale Reviewer Recommendations","authors":"Farshad Kazemi;Maxime Lamothe;Shane McIntosh","doi":"10.1109/TSE.2024.3422369","DOIUrl":null,"url":null,"abstract":"The appropriate assignment of reviewers is a key factor in determining the value that organizations can derive from code review. While inappropriate reviewer recommendations can hinder the benefits of the code review process, identifying these assignments is challenging. Stale reviewers, i.e., those who no longer contribute to the project, are one type of reviewer recommendation that is certainly inappropriate. Understanding and minimizing this type of recommendation can thus enhance the benefits of the code review process. While recent work demonstrates the existence of stale reviewers, to the best of our knowledge, attempts have yet to be made to characterize and mitigate them. In this paper, we study the prevalence and potential effects. We then propose and assess a strategy to mitigate stale recommendations in existing code reviewer recommendation tools. By applying five code reviewer recommendation approaches (LearnRec, RetentionRec, cHRev, Sofia, and WLRRec) to three thriving open-source systems with 5,806 contributors, we observe that, on average, 12.59% of incorrect recommendations are stale due to developer turnover; however, fewer stale recommendations are made when the recency of contributions is considered by the recommendation objective function. We also investigate which reviewers appear in stale recommendations and observe that the top reviewers account for a considerable proportion of stale recommendations. For instance, in 15.31% of cases, the top-3 reviewers account for at least half of the stale recommendations. Finally, we study how long stale reviewers linger after the candidate leaves the project, observing that contributors who left the project 7.7 years ago are still suggested to review change sets. Based on our findings, we propose separating the reviewer contribution recency from the other factors that are used by the CRR objective function to filter out developers who have not contributed during a specified duration. By evaluating this strategy with different intervals, we assess the potential impact of this choice on the recommended reviewers. The proposed filter reduces the staleness of recommendations, i.e., the Staleness Reduction Ratio (SRR) improves between 21.44%–92.39%. Yet since the strategy may increase active reviewer workload, careful project-specific exploration of the impact of the cut-off setting is crucial.","PeriodicalId":13324,"journal":{"name":"IEEE Transactions on Software Engineering","volume":"50 8","pages":"2096-2109"},"PeriodicalIF":6.5000,"publicationDate":"2024-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Software Engineering","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10584343/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}

引用次数: 0

Abstract

The appropriate assignment of reviewers is a key factor in determining the value that organizations can derive from code review. While inappropriate reviewer recommendations can hinder the benefits of the code review process, identifying these assignments is challenging. Stale reviewers, i.e., those who no longer contribute to the project, are one type of reviewer recommendation that is certainly inappropriate. Understanding and minimizing this type of recommendation can thus enhance the benefits of the code review process. While recent work demonstrates the existence of stale reviewers, to the best of our knowledge, attempts have yet to be made to characterize and mitigate them. In this paper, we study the prevalence and potential effects. We then propose and assess a strategy to mitigate stale recommendations in existing code reviewer recommendation tools. By applying five code reviewer recommendation approaches (LearnRec, RetentionRec, cHRev, Sofia, and WLRRec) to three thriving open-source systems with 5,806 contributors, we observe that, on average, 12.59% of incorrect recommendations are stale due to developer turnover; however, fewer stale recommendations are made when the recency of contributions is considered by the recommendation objective function. We also investigate which reviewers appear in stale recommendations and observe that the top reviewers account for a considerable proportion of stale recommendations. For instance, in 15.31% of cases, the top-3 reviewers account for at least half of the stale recommendations. Finally, we study how long stale reviewers linger after the candidate leaves the project, observing that contributors who left the project 7.7 years ago are still suggested to review change sets. Based on our findings, we propose separating the reviewer contribution recency from the other factors that are used by the CRR objective function to filter out developers who have not contributed during a specified duration. By evaluating this strategy with different intervals, we assess the potential impact of this choice on the recommended reviewers. The proposed filter reduces the staleness of recommendations, i.e., the Staleness Reduction Ratio (SRR) improves between 21.44%–92.39%. Yet since the strategy may increase active reviewer workload, careful project-specific exploration of the impact of the cut-off setting is crucial.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

描述陈旧审稿人建议的普遍分布和持续时间

审核员的适当分配是决定组织能否从代码审查中获得价值的关键因素。虽然不恰当的审阅人推荐会阻碍代码审查流程的效益，但识别这些分配是具有挑战性的。陈旧的审阅者，即那些不再对项目有贡献的人，就是一种肯定不合适的审阅者推荐。因此，了解并尽量减少这类推荐可以提高代码审查流程的效益。虽然最近的工作证明了陈旧审稿人的存在，但就我们所知，还没有人尝试过对其进行描述和缓解。在本文中，我们研究了陈腐的普遍性和潜在影响。然后，我们提出并评估了在现有代码审阅者推荐工具中减少陈旧推荐的策略。通过将五种代码审查员推荐方法（LearnRec、RetentionRec、cHRev、Sofia 和 WLRRec）应用于拥有 5806 名贡献者的三个蓬勃发展的开源系统，我们观察到，由于开发人员的更替，平均有 12.59% 的错误推荐是陈旧的；然而，当推荐目标函数考虑贡献的周期时，陈旧的推荐就会减少。我们还调查了哪些审稿人出现在陈旧推荐中，发现顶级审稿人在陈旧推荐中占了相当大的比例。例如，在 15.31% 的案例中，排名前三的审稿人至少占了陈旧推荐的一半。最后，我们研究了在候选人离开项目后，陈旧的审阅者会在项目中停留多久，发现 7.7 年前离开项目的贡献者仍被建议审阅变更集。基于我们的研究结果，我们建议将审阅人贡献的持续时间与 CRR 目标函数使用的其他因素分开，以过滤掉在指定时间内没有贡献的开发人员。通过评估这一策略的不同时间间隔，我们评估了这一选择对推荐审稿人的潜在影响。所提出的过滤策略降低了推荐的陈旧度，即陈旧度降低率（SRR）提高了 21.44%-92.39% 之间。然而，由于该策略可能会增加主动审稿人的工作量，因此针对具体项目仔细探讨截止设置的影响至关重要。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

IEEE Transactions on Software Engineering 工程技术-工程：电子与电气

CiteScore

9.70

自引率

10.80%

发文量

724

审稿时长

6 months

期刊介绍： IEEE Transactions on Software Engineering seeks contributions comprising well-defined theoretical results and empirical studies with potential impacts on software construction, analysis, or management. The scope of this Transactions extends from fundamental mechanisms to the development of principles and their application in specific environments. Specific topic areas include: a) Development and maintenance methods and models: Techniques and principles for specifying, designing, and implementing software systems, encompassing notations and process models. b) Assessment methods: Software tests, validation, reliability models, test and diagnosis procedures, software redundancy, design for error control, and measurements and evaluation of process and product aspects. c) Software project management: Productivity factors, cost models, schedule and organizational issues, and standards. d) Tools and environments: Specific tools, integrated tool environments, associated architectures, databases, and parallel and distributed processing issues. e) System issues: Hardware-software trade-offs. f) State-of-the-art surveys: Syntheses and comprehensive reviews of the historical development within specific areas of interest.

期刊最新文献

GenProgJS: a Baseline System for Test-based Automated Repair of JavaScript Programs On Inter-dataset Code Duplication and Data Leakage in Large Language Models Line-Level Defect Prediction by Capturing Code Contexts with Graph Convolutional Networks Does Treatment Adherence Impact Experiment Results in TDD? Scoping Software Engineering for AI: The TSE Perspective