融合深度卷积和LSTM递归神经网络的代码气味自动检测

Anh Ho, Anh M. T. Bui, P. Nguyen, Amleto Di Salle
{"title":"融合深度卷积和LSTM递归神经网络的代码气味自动检测","authors":"Anh Ho, Anh M. T. Bui, P. Nguyen, Amleto Di Salle","doi":"10.1145/3593434.3593476","DOIUrl":null,"url":null,"abstract":"Code smells is the term used to signal certain patterns or structures in software code that may contain a potential design or architecture problem, leading to maintainability or other software quality issues. Detecting code smells early in the software development process helps prevent these problems and improve the overall software quality. Existing research concentrates on the process of collecting and handling dataset, then exploring the potential of utilizing deep learning models to detect smells, while ignoring extensive feature engineering. Though these approaches obtained promising results, the following issues need to be tackled: (i) extracting both structural and semantic features from the software units; (ii) mitigating the effects of imbalanced data distribution on the performance.In this paper, we propose DeepSmells as a novel approach to code smells detection. To learn the complex hierarchical representations of the code fragment, we apply a deep convolutional neural network (CNN). Then, in order to improve the quality of the context encoding and preserve semantic information, long short-term memory networks (LSTM) is placed immediately after the CNN. The final classification is conducted by deep neural networks with weighted loss function to reduce the impact of skewed data distribution. We performed an empirical study using the existing code smell benchmark datasets to assess the performance of our proposed approach, and compare it with state-of-the-art baselines. The results demonstrate the effectiveness of our proposed method for all kinds of code smells with outperformed evaluation metrics in terms of F1 score and MCC.","PeriodicalId":178596,"journal":{"name":"Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering","volume":"5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Fusion of deep convolutional and LSTM recurrent neural networks for automated detection of code smells\",\"authors\":\"Anh Ho, Anh M. T. Bui, P. Nguyen, Amleto Di Salle\",\"doi\":\"10.1145/3593434.3593476\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Code smells is the term used to signal certain patterns or structures in software code that may contain a potential design or architecture problem, leading to maintainability or other software quality issues. Detecting code smells early in the software development process helps prevent these problems and improve the overall software quality. Existing research concentrates on the process of collecting and handling dataset, then exploring the potential of utilizing deep learning models to detect smells, while ignoring extensive feature engineering. Though these approaches obtained promising results, the following issues need to be tackled: (i) extracting both structural and semantic features from the software units; (ii) mitigating the effects of imbalanced data distribution on the performance.In this paper, we propose DeepSmells as a novel approach to code smells detection. To learn the complex hierarchical representations of the code fragment, we apply a deep convolutional neural network (CNN). Then, in order to improve the quality of the context encoding and preserve semantic information, long short-term memory networks (LSTM) is placed immediately after the CNN. The final classification is conducted by deep neural networks with weighted loss function to reduce the impact of skewed data distribution. We performed an empirical study using the existing code smell benchmark datasets to assess the performance of our proposed approach, and compare it with state-of-the-art baselines. The results demonstrate the effectiveness of our proposed method for all kinds of code smells with outperformed evaluation metrics in terms of F1 score and MCC.\",\"PeriodicalId\":178596,\"journal\":{\"name\":\"Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering\",\"volume\":\"5 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-06-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3593434.3593476\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3593434.3593476","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

摘要

代码气味是一个术语,用于指示软件代码中的某些模式或结构,这些模式或结构可能包含潜在的设计或体系结构问题,从而导致可维护性或其他软件质量问题。在软件开发过程的早期检测代码气味有助于防止这些问题并提高整体软件质量。现有的研究集中在收集和处理数据集的过程,然后探索利用深度学习模型检测气味的潜力,而忽略了广泛的特征工程。虽然这些方法获得了令人满意的结果,但需要解决以下问题:(i)从软件单元中提取结构和语义特征;(ii)减轻数据分布不均衡对性能的影响。在本文中,我们提出了deepsmell作为一种新的代码气味检测方法。为了学习代码片段的复杂层次表示,我们应用了深度卷积神经网络(CNN)。然后,为了提高上下文编码的质量并保留语义信息,将长短期记忆网络(LSTM)放置在CNN之后。最后的分类是用加权损失函数的深度神经网络进行的,以减少数据分布偏斜的影响。我们使用现有的代码气味基准数据集进行了一项实证研究,以评估我们提出的方法的性能,并将其与最先进的基线进行比较。结果表明,我们提出的方法对各种代码气味的有效性,在F1分数和MCC方面的评估指标优于其他方法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Fusion of deep convolutional and LSTM recurrent neural networks for automated detection of code smells
Code smells is the term used to signal certain patterns or structures in software code that may contain a potential design or architecture problem, leading to maintainability or other software quality issues. Detecting code smells early in the software development process helps prevent these problems and improve the overall software quality. Existing research concentrates on the process of collecting and handling dataset, then exploring the potential of utilizing deep learning models to detect smells, while ignoring extensive feature engineering. Though these approaches obtained promising results, the following issues need to be tackled: (i) extracting both structural and semantic features from the software units; (ii) mitigating the effects of imbalanced data distribution on the performance.In this paper, we propose DeepSmells as a novel approach to code smells detection. To learn the complex hierarchical representations of the code fragment, we apply a deep convolutional neural network (CNN). Then, in order to improve the quality of the context encoding and preserve semantic information, long short-term memory networks (LSTM) is placed immediately after the CNN. The final classification is conducted by deep neural networks with weighted loss function to reduce the impact of skewed data distribution. We performed an empirical study using the existing code smell benchmark datasets to assess the performance of our proposed approach, and compare it with state-of-the-art baselines. The results demonstrate the effectiveness of our proposed method for all kinds of code smells with outperformed evaluation metrics in terms of F1 score and MCC.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Classification-based Static Collection Selection for Java: Effectiveness and Adaptability Decentralised Autonomous Organisations for Public Procurement Analyzing the Resource Usage Overhead of Mobile App Development Frameworks Investigation of Security-related Commits in Android Apps Exploring the UK Cyber Skills Gap through a mapping of active job listings to the Cyber Security Body of Knowledge (CyBOK)
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1