
IET Software: Latest Articles

A Systematic Literature Review on Graphical User Interface Testing Through Software Patterns
IF 1.3 | Zone 4, Computer Science | Q3, COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-04-12 | DOI: 10.1049/sfw2/9140693
Ambreen Kousar, Saif Ur Rehman Khan, Atif Mashkoor, Javed Iqbal

Context: From the user's perspective, graphical user interface (GUI) testing of mobile applications (apps) is important for ensuring that apps are visually appealing and user-friendly. Pattern-based GUI testing (PBGT) is an innovative model-based testing (MBT) approach designed to enhance user satisfaction and reusability while minimizing the effort required to model and test the UIs of mobile apps. Several primary studies have been conducted in the PBGT domain.

Problem: The current state of the art lacks comprehensive secondary studies in the PBGT domain. To our knowledge, the area has received little in-depth research attention. Consequently, numerous challenges and limitations persist in the existing literature.

Objective: This study aims to fill the gaps mentioned above in the existing body of knowledge. We highlight popular research topics and analyze their relationships. We explore current state-of-the-art approaches and techniques, a taxonomy of tools and modeling languages, a list of reported UI test patterns (UITPs), and a taxonomy for writing UITPs. We also highlight practical challenges, limitations, and gaps in the targeted research area, as well as future research directions in this domain.

Method: We conducted a systematic literature review (SLR) on PBGT in the context of Android and web apps. We adopted a hybrid methodology that combines the Kitchenham and PRISMA guidelines to achieve the targeted research objectives (ROs). We performed a keyword-based search of well-known databases and selected 30 of 557 studies.

Results: The study identifies 11 tools used in PBGT and devises a taxonomy to categorize them. It also develops a taxonomy for writing UITPs. In addition, we outline the limitations of the targeted research domain and future directions.

Conclusion: This study benefits the community and readers by providing a better understanding of the targeted research area. Comprehensive knowledge of existing tools, techniques, and methodologies is helpful for practitioners. Moreover, the identified limitations, gaps, emerging trends, and future research directions will benefit researchers who intend to work further in this area.

Citations: 0
Automated Hybrid Methodology for Software Architecture Style Selection Using Analytic Hierarchy Process and Fuzzy Analytic Hierarchy Process
IF 1.3 | Zone 4, Computer Science | Q3, COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-04-03 | DOI: 10.1049/sfw2/9943825
Muna Alrazgan, Ahmed Ghoneim, Luluah Albesher, Razan Aldossari, Shahad Alotaibi, Lama Alsaykhan, Norah Alshahrani, Maha Alshammari

In software engineering, selecting an appropriate architectural style for a software system is a risky and sensitive decision. The selection process is a multicriteria decision-making (MCDM) problem, which makes choosing a suitable architecture a key challenge in software development. This study presents an automated hybrid methodology based on the analytic hierarchy process (AHP) and the fuzzy analytic hierarchy process (FAHP) that evaluates and suggests multiple architectural styles based on quality attributes (QAs) alone rather than relying on expert opinions. A Tera-PROMISE dataset is used to illustrate the proposed methodology, and the results of the methodology are compared with expert judgments. Moreover, to support the proposed methodology, a case study compares the proposed method with previous studies.
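The AHP step mentioned above can be illustrated with a minimal sketch. It uses the row geometric-mean approximation of the AHP priority vector and Saaty's consistency ratio, both standard AHP techniques; it does not cover the fuzzy (FAHP) variant and is not taken from the paper. The pairwise matrix, the three quality attributes, the two styles, and their scores are all made-up values for illustration, and the helper names are hypothetical.

```python
import math

# Saaty's random consistency index for matrices of size 3 to 5.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}

def ahp_weights(pairwise):
    """Approximate the AHP priority vector with row geometric means."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def consistency_ratio(pairwise, weights):
    """Return CR = CI / RI; values below 0.1 are usually deemed acceptable."""
    n = len(pairwise)
    aw = [sum(a * w for a, w in zip(row, weights)) for row in pairwise]
    lambda_max = sum(x / w for x, w in zip(aw, weights)) / n
    return ((lambda_max - n) / (n - 1)) / RANDOM_INDEX[n]

def rank_styles(weights, style_scores):
    """Rank styles by the weighted sum of their per-attribute scores."""
    totals = {
        style: sum(w * v for w, v in zip(weights, values))
        for style, values in style_scores.items()
    }
    return sorted(totals, key=totals.get, reverse=True)

# Hypothetical judgments: performance vs. security vs. maintainability.
PAIRWISE = [[1, 3, 5], [1 / 3, 1, 3], [1 / 5, 1 / 3, 1]]
# Hypothetical per-attribute scores for two styles, each in [0, 1].
STYLE_SCORES = {"layered": [0.2, 0.5, 0.6], "microservices": [0.6, 0.3, 0.3]}

weights = ahp_weights(PAIRWISE)
print(weights, consistency_ratio(PAIRWISE, weights))
print(rank_styles(weights, STYLE_SCORES))
```

In a data-driven setting such as the one the abstract describes, the pairwise matrix would presumably be derived from quality-attribute data rather than typed in by hand; how that derivation works is not stated in the abstract.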

Citations: 0
Blockchain Consensus Scheme Based on the Proof of Distributed Deep Learning Work
IF 1.3 | Zone 4, Computer Science | Q3, COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-01-21 | DOI: 10.1049/sfw2/3378383
Hui Zhi, HongCheng Wu, Yu Huang, ChangLin Tian, SuZhen Wang

As artificial intelligence and blockchain technology develop, training deep learning models requires substantial computing resources. Meanwhile, the Proof of Work (PoW) consensus mechanism in blockchain systems often wastes computing resources. This article combines distributed deep learning (DDL) with blockchain technology and proposes a blockchain consensus scheme based on the proof of distributed deep learning work (BCDDL) to reduce this waste. BCDDL treats DDL training as a mining task and allocates different training data to different nodes according to their computing power, improving the utilization rate of computing resources. To balance the demand for and supply of computing resources and to incentivize nodes to participate in training tasks and consensus, a dynamic incentive mechanism based on task size and computing resources (DIM-TSCR) is proposed. In addition, to reduce the impact of malicious nodes on the accuracy of the global model, a model aggregation algorithm based on training data size and model accuracy (MAA-TM) is designed. Experiments demonstrate that BCDDL can significantly increase the utilization rate of computing resources and diminish the impact of malicious nodes on the accuracy of the global model.
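Two ideas from this abstract lend themselves to a small sketch: splitting training data in proportion to node computing power, and aggregating node models with weights that depend on data size and accuracy. The abstract does not give the actual formulas, so the rules below are assumptions: shares are proportional to an integer power score, and the aggregation weight is taken to be data size multiplied by validation accuracy. Function names and all numbers are hypothetical.

```python
def allocate_samples(total_samples, computing_power):
    """Split total_samples across nodes in proportion to integer power scores."""
    total_power = sum(computing_power.values())
    shares = {
        node: total_samples * power // total_power
        for node, power in computing_power.items()
    }
    # Integer division may leave a few samples over; give them to the
    # most powerful nodes, one sample each.
    leftover = total_samples - sum(shares.values())
    strongest = sorted(computing_power, key=computing_power.get, reverse=True)
    for node in strongest[:leftover]:
        shares[node] += 1
    return shares

def aggregate_models(models, data_sizes, accuracies):
    """Weighted average of parameter vectors.

    Assumed weighting (not from the paper): weight_i is proportional to
    data_size_i * accuracy_i, so a low-accuracy node contributes little.
    """
    raw = [size * acc for size, acc in zip(data_sizes, accuracies)]
    total = sum(raw)
    weights = [r / total for r in raw]
    dim = len(models[0])
    return [sum(w * m[k] for w, m in zip(weights, models)) for k in range(dim)]

# Hypothetical nodes and power scores.
shares = allocate_samples(1000, {"node_a": 5, "node_b": 3, "node_c": 2})
print(shares)

# The third model is an outlier with poor accuracy, as a malicious or faulty
# node might produce; its influence on the aggregate is small.
models = [[1.0, 1.0], [1.0, 1.0], [9.0, 9.0]]
print(aggregate_models(models, [500, 300, 200], [0.9, 0.9, 0.1]))
```

With a plain unweighted average the outlier would pull each parameter to about 3.67; under the assumed weighting it stays close to 1.2.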

Citations: 0
Code Parameter Summarization Based on Transformer and Fusion Strategy
IF 1.3 | Zone 4, Computer Science | Q3, COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-12-31 | DOI: 10.1049/sfw2/3706673
Fanlong Zhang, Jiancheng Fan, Weiqi Li, Siau-cheng Khoo

Context: As more time is spent on code comprehension during software development, automatic code summarization has received much attention in software engineering research, with the goal of enhancing software comprehensibility. Meanwhile, it is widely known that good knowledge of the declaration and use of method parameters can effectively enhance the understanding of the associated methods. A traditional approach in software development is to declare the types of method parameters.

Objective: In this work, we advocate parameter-level code summarization and propose a novel approach to automatically generate parameter summaries for a given method. Parameter summarization is considerably challenging, as we know neither what kind of parameter information can be employed for summarization nor how to retrieve such information.

Method: We present paramTrans, a novel approach for parameter summarization. paramTrans uses a transformer to characterize semantic features from parameter-related information; it also explores three fusion strategies for absorbing method-level information to enhance performance. Moreover, to retrieve parameter-related information, we propose a parameter slicing algorithm (named paramSlice), which slices the parameter-related nodes from the abstract syntax tree (AST) at the statement level.

Results: We conducted experiments to verify the effectiveness of our approach. The results show that our approach summarizes parameters effectively; this ability can be further enhanced by exploiting the available summaries of individual methods through the three fusion strategies.

Conclusion: We recommend that developers employ our approach together with the fusion strategies to produce parameter summaries that enhance the comprehensibility of code.
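To make the idea of statement-level parameter slicing concrete, here is a loose illustration built on Python's standard `ast` module. It is not the authors' paramSlice algorithm: the abstract does not describe its details or target language, and this sketch only collects top-level body statements that mention a parameter by name, without any data-flow tracking. The function name `slice_parameters` and the sample code are hypothetical.

```python
import ast

def slice_parameters(source):
    """Map each positional parameter of the first function in `source` to the
    top-level body statements that mention it by name."""
    tree = ast.parse(source)
    func = next(n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef))
    slices = {}
    for arg in func.args.args:  # ignores *args, **kwargs, keyword-only params
        related = []
        for stmt in func.body:
            names = {n.id for n in ast.walk(stmt) if isinstance(n, ast.Name)}
            if arg.arg in names:
                related.append(ast.unparse(stmt))  # needs Python 3.9+
        slices[arg.arg] = related
    return slices

SAMPLE = '''
def transfer(amount, account):
    if amount <= 0:
        raise ValueError("amount must be positive")
    fee = amount * 0.01
    account.balance -= amount + fee
    return account
'''

for param, statements in slice_parameters(SAMPLE).items():
    print(param, len(statements))
```

A slice produced this way could serve as the parameter-related input to a summarization model, while the whole method body would supply the method-level information that the fusion strategies combine with it.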

Citations: 0
An Observational Study on Flask Web Framework Questions on Stack Overflow (SO)
IF 1.3 | Zone 4, Computer Science | Q3, COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-12-19 | DOI: 10.1049/sfw2/1905538
Luluh Albesher, Reem Alfayez

Web-based applications are in high demand and widely used. To facilitate their development, the software engineering community has developed multiple web application frameworks, one of which is Flask. Flask is a popular web framework that allows developers to speed up and scale the development of web applications. A review of the software engineering literature revealed that the Stack Overflow (SO) website has proven effective in providing a better understanding of multiple subjects within the software engineering field. This study aims to analyze Flask-related questions on SO to gain a better understanding of the standing of Flask on the website. We identified a set of 70,230 Flask-related questions, which we further analyzed to estimate how interest in the framework evolved over time on the website. Afterward, we used the Latent Dirichlet Allocation (LDA) algorithm to identify the Flask-related topics discussed in the identified questions. Moreover, we leveraged a number of proxy measures to examine the difficulty and popularity of the identified topics. The study found that interest in Flask has generally been increasing on the website, with a peak in 2020 and declines in the following years. Moreover, Flask-related questions on SO revolve around 12 topics, of which Application Programming Interface (API) can be considered the most popular and background tasks the most difficult. Software engineering researchers, practitioners, educators, and Flask contributors may find this study useful in guiding their future Flask-related endeavors.
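The abstract does not say which proxy measures were used, so the following is only a sketch of how per-topic popularity and difficulty proxies might be computed once questions carry topic labels. The two proxies chosen here (average view count for popularity, share of questions without an accepted answer for difficulty) are assumptions, and the records are invented toy inputs, not data from the study.

```python
from collections import defaultdict
from statistics import mean

def topic_metrics(questions):
    """Aggregate assumed popularity and difficulty proxies per topic."""
    by_topic = defaultdict(list)
    for question in questions:
        by_topic[question["topic"]].append(question)
    metrics = {}
    for topic, items in by_topic.items():
        unanswered = sum(not q["accepted"] for q in items)
        metrics[topic] = {
            "questions": len(items),
            "avg_views": mean(q["views"] for q in items),        # popularity proxy
            "pct_no_accepted": 100 * unanswered / len(items),    # difficulty proxy
        }
    return metrics

# Toy records for illustration only; field names are hypothetical.
QUESTIONS = [
    {"topic": "API", "views": 1200, "accepted": True},
    {"topic": "API", "views": 800, "accepted": False},
    {"topic": "background tasks", "views": 300, "accepted": False},
    {"topic": "background tasks", "views": 100, "accepted": False},
]

for topic, values in topic_metrics(QUESTIONS).items():
    print(topic, values)
```

In practice the topic label of each question would come from the LDA model, for example by assigning every question to its highest-probability topic.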

Citations: 0
Software Defect Prediction Method Based on Clustering Ensemble Learning
IF 1.3 | Zone 4, Computer Science | Q3, COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-11-19 | DOI: 10.1049/2024/6294422
Hongwei Tao, Qiaoling Cao, Haoran Chen, Yanting Li, Xiaoxu Niu, Tao Wang, Zhenhao Geng, Songtao Shang

Software defect prediction aims to assess and predict potential defects in software projects and has made significant progress in recent years. Previous studies largely relied on supervised learning methods, which require a substantial amount of labeled historical defect data to train the models. However, obtaining labeled data often demands significant time and resources. In contrast, software defect prediction based on unsupervised learning does not depend on labeled data, eliminating the need for large-scale data labeling and thereby saving considerable time and resources while providing a more flexible way to ensure software quality. This paper conducts software defect prediction using unsupervised learning methods on data from 16 projects across two public datasets (PROMISE and NASA). For the feature selection step, a chi-squared sparse feature selection method is proposed that combines chi-squared tests with sparse principal component analysis (SPCA). Specifically, the chi-squared test is first used to select the most statistically significant features, and SPCA is then applied to reduce the dimensionality of these features. For the clustering step, the dot product matrix and the Pearson correlation coefficient (PCC) matrix are used to construct weighted adjacency matrices, and a clustering overlap method is proposed. This method integrates spectral clustering, Newman clustering, fluid clustering, and Clauset–Newman–Moore (CNM) clustering through ensemble learning. Experimental results indicate that, in the absence of labeled data, the chi-squared sparse method for feature selection demonstrates superior performance, and the proposed clustering overlap method performs better than or comparably to the four baseline clustering methods.
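As an illustration of the adjacency-matrix construction mentioned above, the sketch below builds a weighted graph over software modules in which the edge weight is the Pearson correlation between two modules' feature vectors. Clipping negative correlations to zero and leaving the diagonal at zero are assumptions made here so that the result is a valid non-negative graph; the abstract does not say how the paper handles these cases. The feature values are made up.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0 or std_y == 0:
        return 0.0  # constant vector: correlation undefined, treat as no edge
    return cov / (std_x * std_y)

def pcc_adjacency(samples):
    """Symmetric weighted adjacency matrix with PCC edge weights.

    Assumptions: negative correlations are clipped to 0, no self-loops.
    """
    n = len(samples)
    adjacency = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            weight = max(0.0, pearson(samples[i], samples[j]))
            adjacency[i][j] = adjacency[j][i] = weight
    return adjacency

# Hypothetical metric vectors for three modules (e.g. size, complexity, churn).
MODULES = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 2.0, 1.0]]

for row in pcc_adjacency(MODULES):
    print(row)
```

A graph of this kind is the usual input to graph-based clustering methods such as the spectral and modularity-based algorithms named in the abstract.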

Citations: 0
ConCPDP: A Cross-Project Defect Prediction Method Integrating Contrastive Pretraining and Category Boundary Adjustment
IF 1.3 | Zone 4, Computer Science | Q3, COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-11-13 | DOI: 10.1049/2024/5102699
Hengjie Song, Yufei Pan, Feng Guo, Xue Zhang, Le Ma, Siyu Jiang

Software defect prediction (SDP) is a crucial phase preceding the launch of software products. Cross-project defect prediction (CPDP) was introduced to anticipate defects in new projects that lack defect labels. CPDP uses the defect information of mature projects to speed up defect prediction for new projects, so that developers can quickly obtain defect information for the new project and target their testing accordingly. At present, the predominant CPDP approaches rely on deep learning, and the performance of the final model is notably affected by the quality of the training dataset. However, CPDP datasets have few samples and almost no label information for new projects, so general deep-learning-based CPDP models fall short of ideal. In addition, most current CPDP models do not fully consider the enrichment of classification boundary samples after cross-domain transfer, leading to suboptimal predictive capability. To overcome these obstacles, we present contrastive learning pretraining for CPDP (ConCPDP), a CPDP method integrating contrastive pretraining and category boundary adjustment. We first perform data augmentation on the source and target domain code files and then parse the augmented code into abstract syntax trees (ASTs). Each AST is then transformed into an integer sequence using specific mapping rules, which serves as input to the subsequent neural network. A neural network based on bidirectional long short-term memory (Bi-LSTM) receives the integer sequence and outputs a feature vector. The feature vectors are then fed into the contrastive module to optimise the feature extraction network. The pretrained feature extractor is fine-tuned using the maximum mean discrepancy (MMD) between the feature distributions of the source and target domains, together with the binary classification loss on the source domain. The paper reports extensive experiments on the PROMISE dataset, which is commonly used for CPDP, to validate ConCPDP’s efficacy, achieving superior results in terms of F1 measure, area under the curve (AUC), and Matthews correlation coefficient (MCC).
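The abstract does not spell out the "specific mapping rules" that turn an AST into an integer sequence. The sketch below shows one plausible reading only, and it is not the authors' implementation. It assumes Python source and the standard-library `ast` module (the PROMISE projects are not Python), and the function name `ast_to_int_sequence` and the first-seen vocabulary scheme are hypothetical.

```python
import ast


def ast_to_int_sequence(source, vocab):
    """Map every AST node type in `source` to an integer id.

    `vocab` is a shared dict (node type name -> id) that grows as new node
    types are seen, so several files can be encoded consistently.
    This mapping rule is an assumption made for illustration.
    """
    tree = ast.parse(source)
    sequence = []
    # ast.walk visits every node of the tree (breadth-first).
    for node in ast.walk(tree):
        name = type(node).__name__
        if name not in vocab:
            vocab[name] = len(vocab) + 1  # id 0 is left free for padding
        sequence.append(vocab[name])
    return sequence


# Two structurally identical functions share one vocabulary.
vocab = {}
seq_a = ast_to_int_sequence("def f(x):\n    return x + 1\n", vocab)
seq_b = ast_to_int_sequence("def g(y):\n    return y + 2\n", vocab)
print(seq_a)
print(seq_a == seq_b)
```

Because only node types are encoded, identifiers and literal values are ignored, so the two snippets map to the same sequence. A sequence model such as a Bi-LSTM would then consume these ids, typically after padding to a fixed length.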

Citations: 0
Breaking the Blockchain Trilemma: A Comprehensive Consensus Mechanism for Ensuring Security, Scalability, and Decentralization
IF 1.3 Zone 4 Computer Science Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-10-10 DOI: 10.1049/2024/6874055
Khandakar Md Shafin, Saha Reno

An ongoing challenge in blockchain technology is resolving the trilemma of balancing decentralization, security, and scalability. This paper introduces a blockchain architecture designed to transcend this trilemma by uniting advanced cryptographic methods, inventive security protocols, and dynamic decentralization mechanisms. Employing established techniques such as elliptic curve cryptography, Schnorr verifiable random function, and zero-knowledge proof (zk-SNARK), alongside new methodologies for stake distribution, anomaly detection, and incentive alignment, our framework sets a new benchmark for secure, scalable, and decentralized blockchain ecosystems. The proposed system surpasses top-tier consensus mechanisms by attaining a throughput of 1700+ transactions per second. It ensures robust security against all well-known blockchain attacks without compromising scalability and demonstrates solid decentralization in a benchmark analysis alongside 25 other blockchain systems, all with an affordable hardware cost for validators and an average CPU usage of only 16.1%.

Citations: 0
IC-GraF: An Improved Clustering with Graph-Embedding-Based Features for Software Defect Prediction
IF 1.3 Zone 4 Computer Science Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-09-16 DOI: 10.1049/2024/8027037
Xuanye Wang, Lu Lu, Qingyan Tian, Haishan Lin

Software defect prediction (SDP) has been a prominent area of research in software engineering. Previous SDP methods often struggled in industrial applications, primarily due to the need for sufficient historical data. Thus, clustering-based unsupervised defect prediction (CUDP) and cross-project defect prediction (CPDP) emerged to address this challenge. However, the former exhibited limitations in capturing semantic and structural features, while the latter encountered constraints due to differences in data distribution across projects. Therefore, we introduce a novel framework called improved clustering with graph-embedding-based features (IC-GraF) for SDP without the reliance on historical data. First, a preprocessing operation is performed to extract program dependence graphs (PDGs) and mark distinct dependency relationships within them. Second, the improved deep graph infomax (IDGI) model, an extension of the DGI model specifically for SDP, is designed to generate graph-level representations of PDGs. Finally, a heuristic-based k-means clustering algorithm is employed to classify the features generated by IDGI. To validate the efficacy of IC-GraF, we conduct experiments based on 24 releases of the PROMISE dataset, using F-measure and G-measure as evaluation criteria. The findings indicate that IC-GraF achieves 5.0%−42.7% higher F-measure, 5%−39.4% higher G-measure, and 2.5%−11.4% higher AUC over existing CUDP methods. Even when compared with eight supervised learning-based SDP methods, IC-GraF maintains a superior competitive edge.
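The abstract names F-measure and G-measure as evaluation criteria without defining them. As a rough sketch, the definitions commonly used in defect-prediction studies are assumed here: F-measure as the harmonic mean of precision and recall, and G-measure as the harmonic mean of recall and (1 − false-positive rate). The paper may define them differently. The function name and the confusion-matrix counts are hypothetical.

```python
def f_and_g_measure(tp, fp, tn, fn):
    """Return (F-measure, G-measure) from confusion-matrix counts.

    Assumed definitions (not taken from the abstract):
      F = 2 * precision * recall / (precision + recall)
      G = 2 * recall * (1 - pf) / (recall + (1 - pf)),
      where pf = fp / (fp + tn) is the false-positive rate.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pf = fp / (fp + tn) if (fp + tn) else 0.0
    specificity = 1.0 - pf

    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    g_measure = (2 * recall * specificity / (recall + specificity)
                 if (recall + specificity) else 0.0)
    return f_measure, g_measure


# Made-up counts: 30 defective modules found, 10 missed,
# 20 clean modules wrongly flagged, 140 clean modules correctly passed.
f_score, g_score = f_and_g_measure(tp=30, fp=20, tn=140, fn=10)
print(round(f_score, 3), round(g_score, 3))
```

G-measure is often reported alongside F-measure because defect datasets are imbalanced: it penalises a predictor that gains recall by flagging many clean modules.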

Citations: 0
IAPCP: An Effective Cross-Project Defect Prediction Model via Intra-Domain Alignment and Programming-Based Distribution Adaptation
IF 1.3 Zone 4 Computer Science Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-09-03 DOI: 10.1049/2024/5358773
Nana Zhang, Kun Zhu, Dandan Zhu

Cross-project defect prediction (CPDP) aims to identify defect-prone software instances in one project (target) using historical data collected from other software projects (source), which can help maintainers allocate limited testing resources reasonably. Unfortunately, the feature distribution discrepancy between the source and target projects makes it challenging to transfer a matching feature representation and severely hinders CPDP performance. In addition, existing CPDP models require an expensive and time-consuming process to tune many parameters. To address these limitations, we propose an effective CPDP model named IAPCP based on distribution adaptation, which consists of two stages: correlation alignment and intra-domain programming. Correlation alignment first calculates the covariance matrices of the source and target projects, then erases some features of the source project (i.e., a whitening operation) and uses the features of the target project (i.e., the target covariance) to fill the source project, thereby aligning the source and target feature distributions and reducing the distribution discrepancy across projects. Intra-domain programming directly learns a nonparametric linear transfer defect predictor with strong discriminative capacity by solving for a probabilistic annotation matrix (PAM) based on the adjusted features of the source project. The model does not require model selection or parameter tuning. Extensive experiments on a total of 82 cross-project pairs from 16 software projects demonstrate that IAPCP achieves competitive CPDP effectiveness and efficiency compared with multiple state-of-the-art baseline models.

Citations: 0