REARRANGE: Effort estimation approach for software clustering-based remodularisation
Alvin Jian Jia Tan, Chun Yong Chong, Aldeida Aleti
Information and Software Technology, Volume 176, Article 107567. Published 2024-08-30.
DOI: 10.1016/j.infsof.2024.107567
URL: https://www.sciencedirect.com/science/article/pii/S0950584924001721
Impact factor: 3.8 (JCR Q2, Computer Science, Information Systems)
Citations: 0
Abstract
Context:
Most research in software clustering and remodularisation typically concludes by recommending the refactoring operations without further insight into the practicality of the proposed technique. Developers might be hesitant to follow through with the refactoring suggestions due to the uncertainty in the effort needed.
Objective:
This work addresses this gap by introducing an effoRt Estimation AppRoach foR softwAre clusteriNG-based rEmodularisation (REARRANGE). REARRANGE closes the loop in extant software clustering and remodularisation research by estimating the time required to carry out the suggested refactoring operations, based on the evolution history of the software. By providing tangible estimates of refactoring effort in person-hours, we can alert developers to complex and time-consuming refactoring operations, which helps them prioritise refactoring efforts and allows practitioners to weave these activities into sprint planning.
Method:
REARRANGE builds a machine learning model that predicts refactoring effort from past commit activity. From that activity it extracts Software Features (lines of code, number of methods), Refactoring Features (refactoring type, source and destination) and Dependency Features (dependencies between classes). REARRANGE is then compared against sanity checks, baseline effort estimation models, and state-of-the-art software estimation models. We also attempt to cross-validate REARRANGE’s effort estimates with software developers.
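As a rough illustration of the feature-based setup described above, the following Python sketch encodes one past refactoring as a numeric vector and predicts effort by averaging the nearest historical records. The abstract does not name the learning algorithm, so the k-nearest-neighbour regressor here is only a stand-in. The record fields, the refactoring-type vocabulary, and all numbers are hypothetical, and the source and destination features are omitted for brevity.

```python
from dataclasses import dataclass
from math import sqrt


# Hypothetical record of one past refactoring commit; field names are
# illustrative and are not taken from the paper.
@dataclass
class RefactoringRecord:
    lines_of_code: int       # Software Feature
    num_methods: int         # Software Feature
    refactoring_type: str    # Refactoring Feature
    num_dependencies: int    # Dependency Feature
    effort_hours: float      # observed effort in person-hours (label)


# Assumed vocabulary of refactoring types, used for one-hot encoding.
REFACTORING_TYPES = ["move_class", "move_method", "extract_class"]


def to_vector(record):
    """Return numeric features followed by a one-hot refactoring type."""
    one_hot = [1.0 if record.refactoring_type == t else 0.0
               for t in REFACTORING_TYPES]
    numeric = [float(record.lines_of_code),
               float(record.num_methods),
               float(record.num_dependencies)]
    return numeric + one_hot


def predict_effort(history, query, k=2):
    """Average the effort of the k records closest to the query.

    Features are left unscaled to keep the sketch short; a real model
    would normalise them first.
    """
    q = to_vector(query)

    def distance(record):
        return sqrt(sum((a - b) ** 2 for a, b in zip(to_vector(record), q)))

    nearest = sorted(history, key=distance)[:k]
    return sum(r.effort_hours for r in nearest) / len(nearest)


# Toy history with made-up values, for illustration only.
history = [
    RefactoringRecord(120, 8, "move_class", 3, 2.0),
    RefactoringRecord(150, 10, "move_class", 4, 3.0),
    RefactoringRecord(900, 40, "extract_class", 15, 12.0),
]
query = RefactoringRecord(130, 9, "move_class", 3, 0.0)
estimate = predict_effort(history, query)
print(f"Estimated effort: {estimate:.2f} person-hours")
```

With this toy history, the two small move_class records are the closest neighbours, so the estimate is the mean of their recorded efforts.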
Results:
In experiments on 25 open-source Java-based projects, the proposed approach estimated the refactoring effort of the test subjects with a Mean Absolute Error (MAE) of 5.47 person-hours, against an MAE of 453.31 person-hours for the next-best approach. Based on a survey conducted among software developers, REARRANGE delivers accurate estimates in 93.6% of cases.
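For readers unfamiliar with the metric, MAE is the average of the absolute differences between actual and predicted effort. The short Python sketch below computes it; the function name and the sample values are illustrative only and are not results from the paper.

```python
def mean_absolute_error(actual, predicted):
    """Return the mean of |actual - predicted| over paired observations."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up effort values in person-hours, for illustration only.
actual_hours = [2.0, 5.0, 10.0]
predicted_hours = [3.0, 4.0, 13.0]
mae = mean_absolute_error(actual_hours, predicted_hours)
print(f"MAE: {mae:.2f} person-hours")
```

Because MAE is expressed in the same unit as the target, a value of 5.47 can be read directly as an average miss of about five and a half person-hours per estimate.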
Conclusion:
The lack of a direct comparison for REARRANGE highlights the need for a refactoring effort-focused estimation model that provides tangible effort estimates in person-hours for refactoring operations. Only then can developers selectively choose relevant refactoring operations while considering the available time and budget constraints, bridging the gap between software clustering research and real-world application.
About the journal:
Information and Software Technology is the international archival journal focusing on research and experience that contributes to the improvement of software development practices. The journal's scope includes methods and techniques to better engineer software and manage its development. Articles submitted for review should have a clear component of software engineering or address ways to improve the engineering and management of software development. Areas covered by the journal include:
• Software management, quality and metrics
• Software processes
• Software architecture, modelling, specification, design and programming
• Functional and non-functional software requirements
• Software testing and verification & validation
• Empirical studies of all aspects of engineering and managing software development
Short Communications is a new section dedicated to short papers addressing new ideas, controversial opinions, "Negative" results and much more. Read the Guide for authors for more information.
The journal encourages and welcomes submissions of systematic literature studies (reviews and maps) within the scope of the journal. Information and Software Technology is the premier outlet for systematic literature studies in software engineering.