An Experience Report on Technical Debt in Pull Requests: Challenges and Lessons Learned

Shubhashis Karmakar, Zadia Codabux, M. Vidoni
{"title":"An Experience Report on Technical Debt in Pull Requests: Challenges and Lessons Learned","authors":"Shubhashis Karmakar, Zadia Codabux, M. Vidoni","doi":"10.1145/3544902.3546637","DOIUrl":null,"url":null,"abstract":"Background: GitHub is a collaborative platform for global software development, where Pull Requests (PRs) are essential to bridge code changes with version control. However, developers often trade software quality for faster implementation, incurring Technical Debt (TD). When developers undertake reviewers’ roles and evaluate PRs, they can often detect TD instances, leading to either PR rejection or discussions. Aims: We investigated whether Pull Request Comments (PRCs) indicate TD by assessing three large-scale repositories: Spark, Kafka, and React. Method: We combined manual classification with automated detection using machine learning and deep learning models. Results: We classified two datasets and found that 37.7 and 38.7% of PRCs indicate TD, respectively. Our best model achieved F1 = 0.85 when classifying TD during the validation phase. Conclusions: We faced several challenges during this process, which may hint that TD in PRCs is discussed differently from other software artifacts (e.g., code comments, commits, issues, or discussion forums). Thus, we present challenges and lessons learned to assist researchers in pursuing this area of research.","PeriodicalId":220679,"journal":{"name":"Proceedings of the 16th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement","volume":"200 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 16th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3544902.3546637","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Background: GitHub is a collaborative platform for global software development, where Pull Requests (PRs) are essential to bridge code changes with version control. However, developers often trade software quality for faster implementation, incurring Technical Debt (TD). When developers undertake reviewers’ roles and evaluate PRs, they can often detect TD instances, leading to either PR rejection or discussions. Aims: We investigated whether Pull Request Comments (PRCs) indicate TD by assessing three large-scale repositories: Spark, Kafka, and React. Method: We combined manual classification with automated detection using machine learning and deep learning models. Results: We classified two datasets and found that 37.7 and 38.7% of PRCs indicate TD, respectively. Our best model achieved F1 = 0.85 when classifying TD during the validation phase. Conclusions: We faced several challenges during this process, which may hint that TD in PRCs is discussed differently from other software artifacts (e.g., code comments, commits, issues, or discussion forums). Thus, we present challenges and lessons learned to assist researchers in pursuing this area of research.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
关于拉式请求中的技术债务的经验报告:挑战和教训
背景:GitHub是一个用于全球软件开发的协作平台,其中Pull Requests (pr)对于将代码更改与版本控制连接起来至关重要。然而,开发人员经常为了更快的实现而牺牲软件质量,从而产生技术债务(TD)。当开发人员承担审查者的角色并评估PR时,他们通常可以发现TD实例,从而导致PR拒绝或讨论。目的:我们通过评估三个大型存储库:Spark、Kafka和React,调查了Pull Request Comments (prc)是否表明TD。方法:采用机器学习和深度学习模型,将人工分类与自动检测相结合。结果:我们对两个数据集进行分类,发现分别有37.7%和38.7%的prc提示TD。在验证阶段对TD进行分类时,我们的最佳模型达到了F1 = 0.85。结论:我们在这个过程中面临了几个挑战,这可能暗示prc中的TD与其他软件工件(例如,代码注释、提交、问题或讨论论坛)的讨论方式不同。因此,我们提出了挑战和经验教训,以帮助研究人员进行这一领域的研究。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Analyzing the Relationship between Community and Design Smells in Open-Source Software Projects: An Empirical Study A Preliminary Investigation of MLOps Practices in GitHub PG-VulNet: Detect Supply Chain Vulnerabilities in IoT Devices using Pseudo-code and Graphs On the Relationship Between Story Points and Development Effort in Agile Open-Source Software DevOps Practitioners’ Perceptions of the Low-code Trend
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1