{"title":"An Experience Report on Technical Debt in Pull Requests: Challenges and Lessons Learned","authors":"Shubhashis Karmakar, Zadia Codabux, M. Vidoni","doi":"10.1145/3544902.3546637","DOIUrl":null,"url":null,"abstract":"Background: GitHub is a collaborative platform for global software development, where Pull Requests (PRs) are essential to bridge code changes with version control. However, developers often trade software quality for faster implementation, incurring Technical Debt (TD). When developers undertake reviewers’ roles and evaluate PRs, they can often detect TD instances, leading to either PR rejection or discussions. Aims: We investigated whether Pull Request Comments (PRCs) indicate TD by assessing three large-scale repositories: Spark, Kafka, and React. Method: We combined manual classification with automated detection using machine learning and deep learning models. Results: We classified two datasets and found that 37.7 and 38.7% of PRCs indicate TD, respectively. Our best model achieved F1 = 0.85 when classifying TD during the validation phase. Conclusions: We faced several challenges during this process, which may hint that TD in PRCs is discussed differently from other software artifacts (e.g., code comments, commits, issues, or discussion forums). Thus, we present challenges and lessons learned to assist researchers in pursuing this area of research.","PeriodicalId":220679,"journal":{"name":"Proceedings of the 16th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement","volume":"200 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 16th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3544902.3546637","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
Background: GitHub is a collaborative platform for global software development, where Pull Requests (PRs) are essential to bridge code changes with version control. However, developers often trade software quality for faster implementation, incurring Technical Debt (TD). When developers undertake reviewers’ roles and evaluate PRs, they can often detect TD instances, leading to either PR rejection or discussions. Aims: We investigated whether Pull Request Comments (PRCs) indicate TD by assessing three large-scale repositories: Spark, Kafka, and React. Method: We combined manual classification with automated detection using machine learning and deep learning models. Results: We classified two datasets and found that 37.7 and 38.7% of PRCs indicate TD, respectively. Our best model achieved F1 = 0.85 when classifying TD during the validation phase. Conclusions: We faced several challenges during this process, which may hint that TD in PRCs is discussed differently from other software artifacts (e.g., code comments, commits, issues, or discussion forums). Thus, we present challenges and lessons learned to assist researchers in pursuing this area of research.