
Journal of Software-Evolution and Process: Latest Articles

Evaluation Framework for Smart Contract Fuzzers
IF 1.7 | Zone 4 (Computer Science) | Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-04-22 | DOI: 10.1002/smr.70021
Peixuan Feng, Yongjuan Wang, Siqi Lu, Qingjun Yuan, Gang Yu, Xiangyu Wang, Jianan Liu, Huaiguang Wu

With the widespread application of smart contracts in economics and asset management, their security has drawn wide attention from academia and industry. Fuzzing is an effective technique for vulnerability detection. Several fuzzers are currently available for smart contracts, so choosing the most appropriate tool to test a contract is an open problem. To this end, we propose an evaluation framework for smart contract fuzzers that defines eight evaluation indicators across five aspects: usability, transparency, detection ability, branch coverage, and oracle design. To verify that the framework is sound and reasonable, we selected six state-of-the-art (SOTA) smart contract fuzzers for evaluation. Evaluating usability verified how difficult each fuzzer is to use; evaluating transparency verified how useful each tool's output is during use; and evaluating detection ability on the dataset validated the branch coverage and the soundness of the oracle design of the six fuzzers. The final results validate the effectiveness of the proposed framework in guiding users to choose smart contract fuzzers.
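
The abstract does not enumerate the eight indicators, so the following is only a minimal sketch of how two of the named aspects, branch coverage and detection ability, might be scored for a single fuzzer run. The `FuzzerRun` record, its field names, and the sample vulnerability labels are assumptions made for illustration and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class FuzzerRun:
    """Hypothetical record of one fuzzer's run on one labeled contract."""
    name: str
    branches_covered: int
    branches_total: int
    reported: set       # vulnerability labels the fuzzer reported
    ground_truth: set   # vulnerability labels known to be present


def branch_coverage(run: FuzzerRun) -> float:
    """Fraction of the contract's branches exercised during the run."""
    if run.branches_total == 0:
        return 0.0
    return run.branches_covered / run.branches_total


def detection_scores(run: FuzzerRun) -> tuple:
    """Return (precision, recall) of the reported labels against the ground truth."""
    true_positives = len(run.reported & run.ground_truth)
    precision = true_positives / len(run.reported) if run.reported else 0.0
    recall = true_positives / len(run.ground_truth) if run.ground_truth else 0.0
    return precision, recall


# Illustrative data only; the fuzzer name and numbers are made up.
example_run = FuzzerRun(
    name="fuzzer_a",
    branches_covered=45,
    branches_total=60,
    reported={"reentrancy", "overflow", "tx-origin"},
    ground_truth={"reentrancy", "overflow", "timestamp", "delegatecall"},
)
```

Scoring every fuzzer with the same functions on the same labeled dataset is what makes the numbers comparable; a false report (here `tx-origin`) lowers precision, which is one way a weak oracle design could show up in the results.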

Citations: 0
Manipulating a CI/CD Pipeline in an IoT Embedded Project: A Quasi-Experiment
IF 1.7 | Zone 4 (Computer Science) | Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-04-15 | DOI: 10.1002/smr.70022
Igor Pereira, Tiago Carneiro, Eduardo Figueiredo

Given the multidisciplinary complexity of embedded Internet of Things (IoT) projects and the demand for qualified professionals, this study investigates the influence of continuous integration and continuous delivery (CI/CD) skills and developers' perceptions regarding applying these practices in this domain. We conducted a quasi-experiment with 98 students from three undergraduate courses at two Brazilian federal universities, analyzing the impact of developer skills on CI/CD. The results showed that developers with no previous CI/CD skills faced more significant difficulties in practical activities. Most participants in our sample already had some experience with real software development projects, but most had never worked on an embedded IoT project or with CI/CD tools. The approach we followed resulted in 92% success. Participants expressed interest in more hands-on training on CI/CD pipelines, DevOps, and embedded IoT projects. We also noticed a great need for them to gain more practical experience with Git, GitHub, GitHub Actions, and GNU/Linux.

Citations: 0
Explainable AI Framework for Software Defect Prediction
IF 1.7 | Zone 4 (Computer Science) | Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-04-13 | DOI: 10.1002/smr.70018
Bahar Gezici Geçer, Ayça Kolukısa Tarhan

Software engineering plays a critical role in improving the quality of software systems, because identifying and correcting defects is one of the most expensive tasks in the software development life cycle. For instance, it is crucial to determine whether a software product still has defects before distributing it: the customer's confidence in the product declines if defects are discovered after deployment. Machine learning-based techniques for predicting software defects have lately started to yield encouraging results, and machine learning models improve the prediction results of software defect prediction systems. However, more accurate models tend to be more complicated, which makes them harder to interpret. Because the rationale behind machine learning models' decisions is obscure, it is challenging to employ them in actual production. In this study, we employ five machine learning models, namely random forest (RF), gradient boosting (GB), naive Bayes (NB), multilayer perceptron (MLP), and neural network (NN), to predict software defects, and we provide an explainable artificial intelligence (XAI) framework that increases transparency both locally and globally throughout the machine learning pipeline. While global explanations identify general trends and feature importance, local explanations provide insights into individual instances, and their combination allows for a holistic understanding of the model. This is accomplished with Explainable AI algorithms, which aim to reduce the “black-boxiness” of ML models by explaining the reasoning behind a prediction. The explanations provide quantifiable information about the characteristics that affect defect prediction. They are produced using six XAI methods, namely, SHAP, anchor, ELI5, LIME, partial dependence plot (PDP), and ProtoDash. We apply these methods to a software defect prediction (SDP) system using the KC2 dataset, and we present and discuss the results.
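
To make the local versus global distinction concrete, here is a minimal sketch that computes exact Shapley values, the quantity that SHAP approximates, for a toy scoring function. It is not the authors' pipeline: the function `toy_defect_score`, the three feature positions (lines of code, cyclomatic complexity, operand count), and all numbers are assumptions chosen for illustration, and exact enumeration is only feasible for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance (a local explanation).

    Features outside a coalition are replaced by their baseline value.
    The cost grows exponentially with the number of features.
    """
    n = len(instance)

    def value(coalition):
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(x)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                phi[i] += weight * (value(coalition | {i}) - value(coalition))
    return phi


def global_importance(predict, instances, baseline):
    """Mean absolute Shapley value per feature (a simple global explanation)."""
    totals = [0.0] * len(baseline)
    for instance in instances:
        for i, phi in enumerate(shapley_values(predict, instance, baseline)):
            totals[i] += abs(phi)
    return [total / len(instances) for total in totals]


def toy_defect_score(x):
    """Hypothetical model: x = [loc, cyclomatic, operands]; operands is ignored."""
    loc, cyclomatic, _operands = x
    return 0.01 * loc + 0.3 * cyclomatic + 0.002 * loc * cyclomatic
```

The local values for one module always sum to the difference between its score and the baseline score, so each feature's share of a single prediction is quantified; averaging the absolute values over many modules gives a global ranking, and a feature the model ignores receives zero in both views.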

Citations: 0
Multilabel Vulnerability Classification in Decentralized Blockchain–Based Reputation System
IF 1.7 | Zone 4 (Computer Science) | Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-04-13 | DOI: 10.1002/smr.70024
Balaji Barmavat, Dhanaraju M, K. Sreerama Murthy, Hari Krishna Madthala, Satya Krupa Prakash Karey, Rajesh Palthya

Smart contracts are decentralized applications that are essential to the broad use of blockchain technology in contexts beyond its origins, which were characterized primarily by digital currency and financial systems. Blockchain operates as a distributed ledger that securely records transactions using cryptographic techniques to establish a unique, chain-like data structure managed collectively by miners within the network. However, current methods for analyzing smart contracts often demand substantial processing time and struggle to detect vulnerabilities accurately in complex contracts. To address these limitations, this research introduces the Updated Wave search Graph Bidirectional Convolutional Neural Network (UWGBCNN), a novel approach designed to enhance smart contract security. UWGBCNN integrates a multilabel vulnerability classification mechanism and uses the Updated Wave Search Algorithm (UWSA) to adapt network parameters so that patterns in smart contracts are analyzed efficiently and vulnerabilities are detected with speed and precision. Additionally, feature extraction is enhanced through the Bidirectional Encoder Representations from Transformers (BERT) language model, incorporating supplementary word embedding features. The proposed technique is reported to reach a precision of 98.5%, a recall of 98.6%, and an F1-score of 99.6%, surpassing current methods. This approach contributes to blockchain security by reducing the financial risks associated with vulnerabilities in decentralized applications.
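
For readers comparing the three reported figures, the following small sketch recalls how F1 relates to precision and recall when all three are computed from the same predictions. The abstract does not say how the multilabel scores were aggregated, so the helper below is a generic definition and not a reconstruction of the authors' evaluation.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# With the precision and recall quoted in the abstract, a single
# harmonic mean gives about 0.9855, which lies between the two inputs.
single_pair_f1 = f1_score(0.985, 0.986)
```

Because a harmonic mean always lies between its two inputs, the reported 99.6% presumably comes from a different aggregation or experimental setting than the quoted precision and recall; the abstract does not give enough detail to tell.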

Citations: 0
CARLDA: An Approach for Stack Overflow API Mention Recognition Driven by Context and LLM-Based Data Augmentation
IF 1.7 | Zone 4 (Computer Science) | Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-04-10 | DOI: 10.1002/smr.70015
Zhang Zhang, Xinjun Mao, Shangwen Wang, Kang Yang, Tanghaoran Zhang, Yao Lu

The recognition of Application Programming Interface (API) mentions in software-related texts is vital for extracting API-related knowledge, providing deep insights into API usage, and improving productivity. Previous research identifies two primary technical challenges in this task: (1) differentiating APIs from common words and (2) identifying morphological variants of standard APIs. While deep learning-based methods have made progress on these challenges, they rely heavily on high-quality labeled data, which leads to a third, data-related challenge: (3) the lack of such high-quality data due to the substantial effort required for labeling. To overcome these challenges, this paper proposes a context-aware API recognition method named CARLDA. This approach uses two key components, namely, Bidirectional Encoder Representations from Transformers (BERT) and Bidirectional Long Short-Term Memory (BiLSTM), to extract context at both the word and sequence levels, capturing syntactic and semantic information to address the first challenge. For the second challenge, it incorporates a character-level BiLSTM with an attention mechanism to grasp global character-level context, enhancing the recognition of morphological features of APIs. To address the third challenge, we developed specialized data augmentation techniques using large language models (LLMs) to tackle both in-library and cross-library data shortages. These techniques generate a variety of labeled samples through targeted transformations (e.g., replacing tokens and restructuring sentences) and hybrid augmentation strategies (e.g., combining real-world and generated data while applying style rules to replicate authentic programming contexts). Given the uncertainty about the quality of LLM-generated samples, we also developed sample selection algorithms to filter out low-quality samples (i.e., incomplete or incorrectly labeled samples). Moreover, specific datasets have been constructed to evaluate CARLDA's ability to address the aforementioned challenges. Experimental results demonstrate that (1) CARLDA improves F1 by 11.0% and the Matthews correlation coefficient (MCC) by 10.0% compared to state-of-the-art methods, showing superior overall performance and effectively tackling the first two challenges, and (2) LLM-based data augmentation techniques successfully yield high-quality labeled data and effectively alleviate the third challenge.
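
The two reported metrics can both be derived from a token-level confusion matrix. The sketch below uses the standard definitions of F1 and MCC; the counts in `example_counts` are invented for illustration and are unrelated to the paper's datasets.

```python
from math import sqrt


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 for the positive class (here: tokens that are API mentions)."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def mcc_from_counts(tp: int, fp: int, fn: int, tn: int) -> float:
    """Matthews correlation coefficient; 0.0 when any marginal is empty."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


# Hypothetical counts: API mentions are rare among ordinary tokens.
example_counts = {"tp": 80, "fp": 10, "fn": 20, "tn": 890}
example_f1 = f1_from_counts(80, 10, 20)
example_mcc = mcc_from_counts(**example_counts)
```

F1 ignores true negatives, whereas MCC uses all four cells of the confusion matrix, which is one reason the two are often reported together when the positive class is rare, as API mentions are in ordinary text.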

Citations: 0
Process Debt: Definition, Risks, and Management
IF 1.7 | Zone 4 (Computer Science) | Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-04-09 | DOI: 10.1002/smr.70017
Antonio Martini, Viktoria Stray, Terese Besker, Nils Brede Moe, Jan Bosch

Process debt, like technical debt, can be a source of short-term benefits but often leads to harmful consequences in the long term for a software organization. Despite its impact, the phenomenon of process debt has not been thoroughly explored in current literature, leaving a gap in understanding how it affects and is managed within organizations. This paper addresses this gap by defining process debt, describing its occurrence, the risks of its mismanagement, and showing examples of mitigation strategies. Our study began with an exploratory phase involving semi-structured interviews with sixteen practitioners across four international organizations, allowing us to gather diverse insights into the occurrence and management of process debt. Then, to deepen our understanding and validate our findings, we conducted a cross-company focus group with ten additional practitioners and analyzed fifty-eight observations and thirty-five interviews from a longitudinal case study. The analysis of the research findings led to a definition of process debt and a novel framework. We also report on the causes, consequences, and occurrence patterns of process debt over time. We present mitigation strategies and discuss which ones need further attention for future research. Our results suggest that the debt metaphor may help companies understand how to manage and improve their processes and make process-related decisions that are beneficial both in the short and long term.

Citations: 0
Prioritization of Functional Requirements Using Directed Graph and K-Means Clustering
IF 1.7 | Zone 4 (Computer Science) | Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-03-31 | DOI: 10.1002/smr.70019
Muhammad Yaseen, Muhammad Asif Nauman, Roobaea Alroobaea, Hamed Alsufyani, Umar Farooq Khattak

Functional requirements (FRs) prioritization is the process of ranking software FRs from a development perspective, determining which requirements should be implemented first and which later. Prioritization is necessary because FRs are interrelated: one requirement may be a prerequisite for implementing another. Requirements must also be prioritized when developers work in parallel on interdependent requirements. Prioritizing a small set of requirements is not a big issue because few comparisons are needed, but large requirement sets, such as those of enterprise resource planning (ERP) systems, require a huge number of comparisons. Numerous techniques have been suggested for FR prioritization, such as the analytic hierarchy process (AHP), which yields accurate results but does not scale to large sets of software requirements. This paper proposes a new prioritization approach based on a directed graph and k-means clustering: it captures all dependencies in a list of FRs using a directed graph and then prioritizes them with a clustering technique that needs fewer comparisons. The approach is validated on ODOO ERP, showing that requirements can be prioritized with n-1 pairwise comparisons.
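
The dependency-capturing step can be pictured with a small directed graph. This sketch covers only that step, using a topological sort from Python's standard library rather than the authors' k-means procedure, and the requirement names in `depends_on` are hypothetical. The helper `ahp_comparisons` gives the number of pairwise comparisons a full pairwise method such as AHP needs for n requirements.

```python
from graphlib import TopologicalSorter

# Hypothetical ERP requirements: each key depends on the requirements in its set.
depends_on = {
    "create_invoice": {"create_customer", "create_product"},
    "record_payment": {"create_invoice"},
    "create_customer": set(),
    "create_product": set(),
}


def implementation_order(dependencies: dict) -> list:
    """Order requirements so that every prerequisite precedes its dependents."""
    return list(TopologicalSorter(dependencies).static_order())


def ahp_comparisons(n: int) -> int:
    """Pairwise comparisons needed to compare every pair of n requirements."""
    return n * (n - 1) // 2


order = implementation_order(depends_on)
```

For 100 requirements, comparing every pair takes 4950 comparisons, against the 99 implied by the n-1 figure in the abstract, which illustrates why exhaustive pairwise methods become costly for ERP-sized requirement sets.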

Journal of Software-Evolution and Process, vol. 37, no. 4. Publication type: Journal Article.
Citations: 0
How Software Design Affects Energy Performance: A Systematic Literature Review
IF 1.7 Zone 4, Computer Science Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-03-29 DOI: 10.1002/smr.70014
Déaglán Connolly Bree, Mel Ó Cinnéide

Interest in the energy consumption of software has grown with rising energy costs and greater environmental awareness. Many approaches to research in this area have been proposed, from the examination of hardware and compiler optimizations to platform specific software modifications. However, the impact of general software design on energy efficiency remains unclear. The goal of this research is to summarize the findings of studies that empirically examine the impact of design patterns, code smells, and refactorings (which we collectively describe as design elements) on energy consumption. Our secondary goal is to provide an overview of the impact of these aspects of software design on energy performance and discuss the current state of the art. We present a systematic literature review (SLR) of papers that examine the impact of the aforementioned design elements on energy consumption. We perform a search through four major databases, a manual search through publications of eight conferences and five journals from 2010 through 2023, in addition to snowballing. We extract relevant data from the literature and present an overview of each experiment's setup, the data reported, and results for each design element studied. Beginning with a set of 8684 papers, we select 24 that include studies of these design elements. Overall, they provide data on 22 design patterns, 17 code smells, and 31 refactorings. Many studies are preliminary in nature, and contradictory findings are frequent. We present three main findings: (i) a wide array of design patterns, code smells, and refactorings have been examined from an energy performance perspective; (ii) many of these studies are preliminary in nature and indicate the need for further research; (iii) there has been little growth recently in publications empirically examining these aspects of software design.

Journal of Software-Evolution and Process, vol. 37, no. 4. Publication type: Journal Article. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/smr.70014
Citations: 0
Improving IT/Business Alignment in DevOps: Business Capability for Adopting BizDevOps
IF 1.7 Zone 4, Computer Science Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-03-25 DOI: 10.1002/smr.70016
Guillermo Fuentes-Quijada, Francisco Ruiz-González, Angélica Caro

As organizations increasingly adopt DevOps practices, they often face limitations in IT/business alignment. BizDevOps emerges as an evolutionary and complementary approach that integrates business perspectives directly into the software development lifecycle, aiming to address these limitations and enhance overall organizational performance. This study investigates how BizDevOps builds upon and extends DevOps practices, focusing on developing a business capability to facilitate their integration and improve alignment without compromising agility. The study develops the BizDevOps Business Capability (BizDevOps-BC) through metaethnographic analysis and design principles application. To validate the applicability of this capability, both a proof of concept and an expert opinion survey were conducted. The PoC was implemented in a DevOps organization to evaluate whether this business capability improves IT-business alignment while maintaining software development agility. The expert survey gathered qualitative insights from industry professionals and academics, evaluating the relevance of BizDevOps-BC in various organizational contexts. The findings suggest that BizDevOps-BC has the potential to enhance alignment between IT and business, offering a structured approach that, if properly implemented, could help organizations evolve their DevOps practices, further improving overall IT/business alignment, without compromising their existing agility.

Journal of Software-Evolution and Process, vol. 37, no. 3. Publication type: Journal Article.
Citations: 0
Leveraging Levy Flight and Greylag Goose Optimization for Enhanced Cross-Project Defect Prediction in Software Evolution
IF 1.7 Zone 4, Computer Science Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-03-24 DOI: 10.1002/smr.70013
Kripa Sekaran, Sherly Puspha Annabel Lawrence

Cross-project defect prediction (CPDP) in software applications is crucial for predicting defects and ensuring software quality. The performance of traditional CPDP models degrades because of class imbalance between projects and differences in data distribution. To overcome these limitations, a novel approach named the Levy flight–enabled greylag goose optimized UniXcoder-based stacked defect predictor (LFGGO-USDP) is proposed for predicting cross-project defects in software engineering. In this paper, 23 software projects are selected from diverse datasets such as PROMISE, ReLink, AEEEM, and NASA and preprocessed to enhance reliability and reduce class imbalance. The transformation model maps the source and target projects present in the feature space to enhance predictive performance. During feature selection, the Levy flight (LF) mechanism is embedded in the greylag goose optimization (GGO) algorithm to localize features in the source code, enhancing diversity and reducing local-optimum issues. A UniXcoder-based stacked bidirectional long short-term memory (U-SBiLSTM) network serves as the cross-project defect predictor. The UniXcoder model extracts semantic information for source code tokenization. The output of UniXcoder is then fed to the SBiLSTM, which determines the relationships within the source code. Next, the output of UniXcoder (which contains the semantic features) is integrated with the output of SBiLSTM (which contains the sequential and temporal dependencies). After these features are concatenated, an attention mechanism selects the relevant information for classifying instances as defective or nondefective. Experiments analyze the defective and nondefective cases in software projects, and numerical validation with different evaluation models is used to assess the model's superiority. The proposed model achieved the highest defect prediction accuracy, 0.986, compared with existing approaches, demonstrating better prediction outcomes.
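
The abstract does not give the Levy flight update rule, so the following is only a sketch of one common way to draw Levy-distributed steps, Mantegna's algorithm, and to use them to perturb a candidate solution. The perturbation rule, the `scale` and `beta` defaults, and all function names are assumptions for illustration and are not taken from the paper.

```python
import math
import random


def levy_sigma(beta):
    """Standard deviation of the numerator Gaussian in Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)


def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step: u / |v|^(1/beta), u ~ N(0, sigma^2), v ~ N(0, 1)."""
    u = rng.gauss(0.0, levy_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def levy_perturb(position, best, scale=0.01, beta=1.5, rng=random):
    """Move each coordinate by a Levy step scaled by its distance to the best solution.

    Hypothetical update rule: occasional long jumps help a search leave a
    local optimum, while most steps stay small.
    """
    return [x + scale * levy_step(beta, rng) * (x - b) for x, b in zip(position, best)]


rng = random.Random(0)  # seeded so that repeated runs draw the same steps
candidate = levy_perturb([0.2, 0.8, 0.5], best=[0.1, 0.9, 0.5], rng=rng)
```

A coordinate that already equals the best solution's value is left unchanged by this rule, so only coordinates that disagree with the current best are explored.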

Journal of Software-Evolution and Process, vol. 37, no. 3. Publication type: Journal Article.
Citations: 0