Dynamic stacking ensemble for cross-language code smell detection

IF 3.5 4区 计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE PeerJ Computer Science Pub Date : 2024-08-15 DOI:10.7717/peerj-cs.2254
Hamoud Aljamaan
{"title":"Dynamic stacking ensemble for cross-language code smell detection","authors":"Hamoud Aljamaan","doi":"10.7717/peerj-cs.2254","DOIUrl":null,"url":null,"abstract":"Code smells refer to poor design and implementation choices by software engineers that might affect the overall software quality. Code smells detection using machine learning models has become a popular area to build effective models that are capable of detecting different code smells in multiple programming languages. However, the process of building of such effective models has not reached a state of stability, and most of the existing research focuses on Java code smells detection. The main objective of this article is to propose dynamic ensembles using two strategies, namely greedy search and backward elimination, which are capable of accurately detecting code smells in two programming languages (i.e., Java and Python), and which are less complex than full stacking ensembles. The detection performance of dynamic ensembles were investigated within the context of four Java and two Python code smells. The greedy search and backward elimination strategies yielded different base models lists to build dynamic ensembles. In comparison to full stacking ensembles, dynamic ensembles yielded less complex models when they were used to detect most of the investigated Java and Python code smells, with the backward elimination strategy resulting in less complex models. Dynamic ensembles were able to perform comparably against full stacking ensembles with no significant detection loss. This article concludes that dynamic stacking ensembles were able to facilitate the effective and stable detection performance of Java and Python code smells over all base models and with less complexity than full stacking ensembles.","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":null,"pages":null},"PeriodicalIF":3.5000,"publicationDate":"2024-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"PeerJ Computer Science","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.7717/peerj-cs.2254","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

Code smells refer to poor design and implementation choices by software engineers that might affect the overall software quality. Code smells detection using machine learning models has become a popular area to build effective models that are capable of detecting different code smells in multiple programming languages. However, the process of building of such effective models has not reached a state of stability, and most of the existing research focuses on Java code smells detection. The main objective of this article is to propose dynamic ensembles using two strategies, namely greedy search and backward elimination, which are capable of accurately detecting code smells in two programming languages (i.e., Java and Python), and which are less complex than full stacking ensembles. The detection performance of dynamic ensembles were investigated within the context of four Java and two Python code smells. The greedy search and backward elimination strategies yielded different base models lists to build dynamic ensembles. In comparison to full stacking ensembles, dynamic ensembles yielded less complex models when they were used to detect most of the investigated Java and Python code smells, with the backward elimination strategy resulting in less complex models. Dynamic ensembles were able to perform comparably against full stacking ensembles with no significant detection loss. This article concludes that dynamic stacking ensembles were able to facilitate the effective and stable detection performance of Java and Python code smells over all base models and with less complexity than full stacking ensembles.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
跨语言代码气味检测的动态堆叠组合
代码气味是指软件工程师在设计和实施过程中做出的不良选择,可能会影响软件的整体质量。使用机器学习模型检测代码气味已成为一个热门领域,以建立能够检测多种编程语言中不同代码气味的有效模型。然而,建立此类有效模型的过程尚未达到稳定状态,现有研究大多集中于 Java 代码气味检测。本文的主要目的是提出使用贪婪搜索和后向消除两种策略的动态集合,它们能够准确地检测两种编程语言(即 Java 和 Python)中的代码气味,而且其复杂性低于完全堆叠集合。在四种 Java 和两种 Python 代码气味的背景下,研究了动态集合的检测性能。贪婪搜索和后向消除策略产生了不同的基础模型列表来构建动态集合。与完全堆叠集合相比,动态集合在用于检测大部分被调查的 Java 和 Python 代码气味时,所产生的模型复杂度较低,后向消除策略所产生的模型复杂度较低。动态集合与完全堆叠集合的性能相当,没有明显的检测损失。本文的结论是,在所有基本模型中,动态堆叠集合都能有效、稳定地检测出 Java 和 Python 代码气味,而且复杂度低于完全堆叠集合。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
PeerJ Computer Science
PeerJ Computer Science Computer Science-General Computer Science
CiteScore
6.10
自引率
5.30%
发文量
332
审稿时长
10 weeks
期刊介绍: PeerJ Computer Science is the new open access journal covering all subject areas in computer science, with the backing of a prestigious advisory board and more than 300 academic editors.
期刊最新文献
A model integrating attention mechanism and generative adversarial network for image style transfer. Detecting rumors in social media using emotion based deep learning approach. Harnessing AI and analytics to enhance cybersecurity and privacy for collective intelligence systems. Improving synthetic media generation and detection using generative adversarial networks. Intelligent accounting optimization method based on meta-heuristic algorithm and CNN.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1