Title: Feature envy detection based on cross-graph local semantics matching
Authors: Quanxin Yang, Dongjin Yu, Xin Chen, Yihang Xu, Wangliang Yan, Bin Hu
Journal: Information and Software Technology, volume 174, Article 107515 (impact factor 3.8; JCR Q2, Computer Science, Information Systems)
Published: 2024-06-12
DOI: 10.1016/j.infsof.2024.107515
URL: https://www.sciencedirect.com/science/article/pii/S0950584924001204
Citations: 0
Abstract
Context:
Feature envy is a typical code smell: it occurs when a method relies excessively on the functionality of another class, which can harm the maintainability and extensibility of the code. Detecting and avoiding feature envy is therefore critical for software development. Previous research has shown that deep learning-based approaches have significant advantages over static code analysis tools in detecting feature envy. However, current deep learning-based approaches still suffer from two limitations: (1) they focus on the functional or overall semantics of the code and ignore opportunities for local code semantics matching, which makes more complex cases hard to identify; (2) existing feature envy datasets are collected or synthesized using static code analysis tools, which restricts feature envy cases to fixed rules and makes it hard to cover the other complex cases found in real projects.
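As an illustration of the smell itself, the following is a minimal sketch rather than an example taken from the paper. The classes `Customer` and `Invoice` and their methods are hypothetical names introduced only for this illustration. The method `envious_label` uses only data that belongs to another class, and `refactored_label` shows the result of a Move Method refactoring.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    """Hypothetical data-owning class (not from the paper)."""
    street: str
    city: str
    postcode: str

    def mailing_label(self) -> str:
        # After Move Method, the logic lives next to the data it reads.
        return f"{self.street}, {self.city} {self.postcode}"


class Invoice:
    """Hypothetical class whose method envies Customer."""

    def __init__(self, customer: Customer, total: float) -> None:
        self.customer = customer
        self.total = total

    def envious_label(self) -> str:
        # Feature envy: reads three Customer fields and no Invoice state.
        c = self.customer
        return f"{c.street}, {c.city} {c.postcode}"

    def refactored_label(self) -> str:
        # Refactored version: delegate to the class that owns the data.
        return self.customer.mailing_label()
```

Both methods return the same string; the difference is only where the logic lives, which is why fixed metric rules can miss less clear-cut cases.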
Objective:
We propose a Siamese graph neural network based on local code semantics matching, and we collect feature envy refactoring cases from real projects for experimental evaluation.
Method:
To address the first limitation, we propose a cross-graph local semantics matching network, which aims to simulate the human intuition or experience used to detect feature envy by analyzing the local semantics matching between code graphs. To address the second, we manually review and collect GitHub commits that refactor feature envy cases. Then, drawing on image data augmentation techniques, we construct two datasets, one for identifying feature envy and one for recommending Move Method refactorings.
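The idea of matching local semantics across two code graphs can be sketched as follows. This is not the authors' network: it assumes node embeddings are already available as plain vectors and replaces the learned Siamese graph neural network with a simple best-match cosine similarity. The functions `cosine`, `cross_graph_match`, and `recommend_target` are hypothetical names introduced for illustration.

```python
import math
from typing import Dict, List

Vector = List[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cross_graph_match(method_nodes: List[Vector], class_nodes: List[Vector]) -> float:
    """Mean, over method-graph nodes, of the best similarity to any class-graph node."""
    if not method_nodes or not class_nodes:
        return 0.0
    best = [max(cosine(m, c) for c in class_nodes) for m in method_nodes]
    return sum(best) / len(best)


def recommend_target(method_nodes: List[Vector], candidates: Dict[str, List[Vector]]) -> str:
    """Return the candidate class whose graph matches the method graph best."""
    return max(candidates, key=lambda name: cross_graph_match(method_nodes, candidates[name]))


# Toy embeddings (made up): the method's nodes resemble "Other" more than "Own".
method = [[1.0, 0.0], [0.9, 0.1]]
classes = {"Own": [[0.0, 1.0]], "Other": [[1.0, 0.0], [0.8, 0.2]]}
print(recommend_target(method, classes))
```

Under these assumptions, a method whose local fragments match another class better than its enclosing class is a feature envy candidate, and the best-matching class is a candidate Move Method target.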
Results:
Extensive experiments show that our approach outperforms state-of-the-art baselines on both tasks in terms of the comprehensive metrics F1-score and AUC.
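For readers unfamiliar with the two metrics, the sketch below computes them from their standard definitions. The labels and scores are made-up toy values, not results from the paper, and the 0.5 decision threshold is an assumption.

```python
from typing import List


def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true: List[int], scores: List[float]) -> float:
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up): 1 = feature envy, 0 = no smell.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5
print(f1_score(labels, preds), auc(labels, scores))  # prints 0.5 0.75
```

F1-score depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is presumably why both are reported together.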
Conclusion:
The experimental results indicate that the proposed Siamese graph neural network based on local code semantics matching is effective. In addition, the provided data augmentation algorithms can significantly improve model performance.
About the journal:
Information and Software Technology is the international archival journal focusing on research and experience that contribute to the improvement of software development practices. The journal's scope includes methods and techniques to better engineer software and manage its development. Articles submitted for review should have a clear software engineering component or address ways to improve the engineering and management of software development. Areas covered by the journal include:
• Software management, quality and metrics
• Software processes
• Software architecture, modelling, specification, design and programming
• Functional and non-functional software requirements
• Software testing and verification & validation
• Empirical studies of all aspects of engineering and managing software development
Short Communications is a new section dedicated to short papers addressing new ideas, controversial opinions, "negative" results and much more. Read the Guide for Authors for more information.
The journal encourages and welcomes submissions of systematic literature studies (reviews and maps) within the scope of the journal. Information and Software Technology is the premier outlet for systematic literature studies in software engineering.