
Journal of Systems and Software: Latest Articles

A GUI-based Metamorphic Testing Technique for Detecting Authentication Vulnerabilities in Android Mobile Apps
IF 3.7 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-02-07 DOI: 10.1016/j.jss.2025.112364
Domenico Amalfitano, Misael Júnior, Anna Rita Fasolino, Marcio Delamaro

Context:

The increasing use of mobile apps in daily life involves managing and sharing sensitive user information.

Problem:

New vulnerabilities are frequently reported in bug tracking systems, highlighting the need for effective security testing processes for these applications.

Proposal:

This study introduces a GUI-based Metamorphic Testing technique designed to detect five common real-world vulnerabilities related to username and password authentication methods in Android applications, as identified by OWASP.

Methods:

We developed five Metamorphic Relationships to test for these vulnerabilities and implemented a Metamorphic Vulnerability Testing Environment to automate the technique. This environment facilitates the generation of Source test cases and the automatic creation and execution of Follow-up test cases.

Results:

The technique was applied to 163 real-world Android applications and uncovered 159 vulnerabilities; 108 of the apps exhibited at least one vulnerability. The vulnerabilities were validated through expert analysis by three security professionals, who confirmed the issues by interacting directly with the apps' graphical user interfaces (GUIs). Additionally, to assess the practical relevance of our approach, we engaged with 37 companies whose applications were identified as vulnerable. Nine companies confirmed the vulnerabilities, and 26 updated their apps to address the reported issues. Our findings also indicate a weak inverse correlation between user-perceived quality and vulnerabilities; even highly rated apps can harbor significant security flaws.
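The idea of pairing a Source test case with an automatically derived Follow-up test case can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the relation used here (a login that succeeds should not also succeed with the password's letter case flipped) is a hypothetical example and is not claimed to be one of the five relations from the paper, and `strict_login` and `lax_login` are made-up stand-ins for driving a real app's login screen.

```python
from typing import Callable, Tuple

# Hypothetical stand-in for driving an app's login screen; a real setup
# would interact with the GUI instead of calling a plain function.
LoginFn = Callable[[str, str], bool]


def swap_case_follow_up(username: str, password: str) -> Tuple[str, str]:
    """Derive a follow-up input by flipping the case of every password letter."""
    return username, password.swapcase()


def violates_case_relation(login: LoginFn, username: str, password: str) -> bool:
    """Return True when both the source and the follow-up login succeed.

    Illustrative relation: if the source login succeeds and the follow-up
    password differs from it, the follow-up login is expected to fail.
    """
    follow_user, follow_pass = swap_case_follow_up(username, password)
    if follow_pass == password:  # no letters to flip, relation not applicable
        return False
    source_ok = login(username, password)
    follow_ok = login(follow_user, follow_pass)
    return source_ok and follow_ok


def strict_login(user: str, pw: str) -> bool:
    """Made-up backend that compares the password exactly."""
    return (user, pw) == ("alice", "S3cret")


def lax_login(user: str, pw: str) -> bool:
    """Made-up backend that ignores letter case in the password."""
    return user == "alice" and pw.lower() == "s3cret"
```

The point of the design is that no oracle for the "correct" login result is needed: the check only compares the outcomes of two related executions.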
Citations: 0
An Attention-based Wide and Deep Neural Network for Reentrancy Vulnerability Detection in Smart Contracts
IF 3.7 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-02-03 DOI: 10.1016/j.jss.2025.112361
Samuel Banning Osei, Rubing Huang, Zhongchen Ma
In recent years, smart contracts have become integral to blockchain applications, offering decentralized, transparent, and tamper-proof execution of agreements. However, vulnerabilities in smart contracts pose significant security risks, leading to financial losses. This paper presents an Attention-based Wide and Deep Neural Network (AWDNN) for reentrancy vulnerability detection in Ethereum smart contracts. By emphasizing crucial smart contract features, AWDNN enhances its precision in identifying complex vulnerability patterns. Our approach includes three phases: code optimization, vectorization, and vulnerability detection. We streamline smart contract code by removing extraneous components and extracting key fragments. These fragments are transformed into vectors that capture the smart contract’s semantic features and are subsequently passed through the wide and deep neural network to detect vulnerabilities. Experimental results show that our model performs well compared to existing tools. Future work aims to detect additional vulnerabilities and incorporate advanced vectorization techniques to enhance efficiency.
Citations: 0
SBFL fault localization considering fault-proneness
IF 3.7 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-02-03 DOI: 10.1016/j.jss.2025.112363
Reza Torkashvan, Saeed Parsa, Babak Vaziri
Fault localization is a critical phase in software debugging, often posing significant challenges and demanding extensive time for large and complex programs. Spectrum-based fault localization (SBFL) is a straightforward and cost-effective technique that leverages program execution logs to identify faulty statements. However, the effectiveness of SBFL can be compromised by biases in the test data set, which may not uniformly cover all code features. This study demonstrates that the integration of fault-proneness scores of program classes, predicted by a machine learning model utilizing source code metrics, with the fault-suspiciousness scores of program statements, estimated by SBFL, can enhance the accuracy and efficacy of fault localization. A Random Forest model is employed to predict the fault-proneness of classes in five Java projects from the Unified-Bug-Dataset 1.2. Concurrently, three established SBFL formulas are used to compute the fault-suspiciousness of statements. Statements are ranked based on their faultiness scores, derived from a linear combination of class fault-proneness and statement fault-suspiciousness. This approach is compared with the original SBFL formulas using four evaluation metrics: F-measure, precision, recall, and accuracy. The results indicate that the proposed method surpasses the original SBFL formulas across all metrics and significantly reduces the search space for fault localization. These findings suggest that the integration of static and dynamic analysis provides a more reliable and efficient method for fault localization in software systems.
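The ranking step described above can be sketched as follows. The abstract does not name the three SBFL formulas or the weights of the linear combination, so this sketch assumes Ochiai (a widely used SBFL formula) and a hypothetical weight `alpha`; the statement names, coverage counts, and fault-proneness values are made up for illustration.

```python
import math


def ochiai(failed_cover: int, passed_cover: int, total_failed: int) -> float:
    """Ochiai suspiciousness of a statement from its coverage counts."""
    denom = math.sqrt(total_failed * (failed_cover + passed_cover))
    return failed_cover / denom if denom else 0.0


def faultiness(suspiciousness: float, proneness: float, alpha: float = 0.5) -> float:
    """Linear combination of statement suspiciousness and class fault-proneness.

    alpha is a hypothetical weight, not a value taken from the paper.
    """
    return alpha * suspiciousness + (1.0 - alpha) * proneness


# statement -> (failing tests covering it, passing tests covering it,
#               predicted fault-proneness of its class); all values invented
spectrum = {
    "A.java:10": (2, 2, 0.1),
    "B.java:7": (2, 2, 0.9),
    "B.java:8": (1, 4, 0.9),
}
TOTAL_FAILED = 2


def score(statement: str) -> float:
    failed_cover, passed_cover, proneness = spectrum[statement]
    return faultiness(ochiai(failed_cover, passed_cover, TOTAL_FAILED), proneness)


# Most suspicious statement first.
ranking = sorted(spectrum, key=score, reverse=True)
```

In this toy data, `A.java:10` and `B.java:7` have identical coverage and therefore identical Ochiai scores; only the class-level fault-proneness separates them, which is the kind of tie-breaking the combination is meant to provide.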
Citations: 0
Predicting test failures induced by software defects: A lightweight alternative to software defect prediction and its industrial application
IF 3.7 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-02-03 DOI: 10.1016/j.jss.2025.112360
Lech Madeyski, Szymon Stradowski

Context:

Machine Learning Software Defect Prediction (ML SDP) is a promising method to improve the quality and minimise the cost of software development.

Objective:

We aim to: (1) propose and develop a Lightweight Alternative to SDP (LA2SDP) that predicts test failures induced by software defects, allowing defective software modules to be pinpointed thanks to the available mapping of predicted test failures to past defects and corrected modules, and (2) preliminarily evaluate the proposed method in a real-world Nokia 5G scenario.

Method:

We train machine learning models using test failures that come from confirmed software defects already available in the Nokia 5G environment. We implement LA2SDP using five supervised ML algorithms, together with their tuned versions, and use eXplainable AI (XAI) to provide feedback to stakeholders and initiate quality improvement actions.

Results:

We have shown that LA2SDP is feasible in vivo using the test failure-to-defect report mapping readily available within the Nokia 5G system-level test process, achieving good predictive performance. Specifically, CatBoost Gradient Boosting performed best and achieved satisfactory Matthews Correlation Coefficient (MCC) results for our feasibility study.
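For readers unfamiliar with the metric, MCC can be computed directly from the four cells of a binary confusion matrix. The sketch below uses the standard definition; returning 0.0 when the denominator is zero is a common convention assumed here, not something stated in the abstract.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from binary confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: report 0.0 when a row or column is empty.
    return numerator / denominator if denominator else 0.0
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), and it stays informative when failing and passing tests are heavily imbalanced.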

Conclusions:

Our efforts have successfully defined, developed, and validated LA2SDP, using the sliding and expanding window approaches on an industrial data set.
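The difference between the sliding and expanding window approaches can be sketched with time-ordered period indices. This is a generic illustration of the two validation schemes; the window sizes and the one-period test horizon are assumptions, and the abstract does not describe the exact configuration used on the industrial data set.

```python
from typing import Iterator, List, Tuple

# (indices of training periods, indices of test periods)
Split = Tuple[List[int], List[int]]


def expanding_windows(n_periods: int, min_train: int) -> Iterator[Split]:
    """Train on every period seen so far, test on the next one."""
    for end in range(min_train, n_periods):
        yield list(range(0, end)), [end]


def sliding_windows(n_periods: int, train_size: int) -> Iterator[Split]:
    """Train on the most recent train_size periods, test on the next one."""
    for end in range(train_size, n_periods):
        yield list(range(end - train_size, end)), [end]
```

Both schemes only ever test on data that is newer than the training data; the expanding variant keeps all history, while the sliding variant discards the oldest periods as it moves forward.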
Citations: 0
The AmbiTRUS framework for identifying potential ambiguity in user stories
IF 3.7 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-01-31 DOI: 10.1016/j.jss.2025.112357
Anis R. Amna, Yves Wautelet, Stephan Poelmans, Samedi Heng, Geert Poels
Ambiguity in natural language-based requirements is a well-known issue, often addressed as a singular problem despite its complexity. Studies reveal that ambiguity in user stories can manifest differently depending on the linguistic levels.
This study introduces the ambiguity analysis framework (AmbiTRUS) to address these diverse manifestations by composing quality criteria for 13 types of ambiguity problems, classified across four linguistic levels and linked to four types of requirements quality problems. The proposed quality criteria were selected and adapted from three established user story quality frameworks: the QUS framework, the Agile Requirements Verification framework, and the INVEST framework.
To assess the potential effectiveness of AmbiTRUS, we conducted a controlled laboratory experiment with advanced MSc students, who represented novice practitioners among the framework's intended users. While the experiment did not demonstrate clear effectiveness, users found the framework useful despite its complexity.
Insights from the experiment allowed redefining the framework's quality criteria. The main lesson learned from the experiment is the need for tool support in applying AmbiTRUS, particularly using NLP techniques to verify the quality criteria. The development of such an NLP-based tool and the evaluation of AmbiTRUS through a usability study of the tool are the next steps in our research.
Citations: 0
UVL: Feature modelling with the Universal Variability Language
IF 3.7 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-01-31 DOI: 10.1016/j.jss.2024.112326
David Benavides, Chico Sundermann, Kevin Feichtinger, José A. Galindo, Rick Rabiser, Thomas Thüm
Feature modelling is a cornerstone of software product line engineering, providing a means to represent software variability through features and their relationships. Since its inception in 1990, feature modelling has evolved through various extensions, and after three decades of development, there is a growing consensus on the need for a standardised feature modelling language. Despite multiple endeavours to standardise variability modelling and the creation of various textual languages, researchers and practitioners continue to use their own approaches, impeding effective model sharing. In 2018, a collaborative initiative was launched by a group of researchers to develop a novel textual language for representing feature models. This paper introduces the outcome of this effort: the Universal Variability Language (UVL), which is designed to be human-readable and serves as a pivot language for diverse software engineering tools. The development of UVL drew upon community feedback and leveraged established literature in the field of variability modelling. The language is structured into three levels – Boolean, Arithmetic, and Type – and allows for language extensions to introduce additional constructs enhancing its expressiveness. UVL is integrated into various existing software tools, such as FeatureIDE and flamapy, and is maintained by a consortium of institutions. All tools that support the language are released in an open-source format, complemented by dedicated parser implementations for Python and Java. Beyond academia, UVL has found adoption within a range of institutions and companies. It is envisaged that UVL will become the language of choice in the future for a multitude of purposes, including knowledge sharing, educational instruction, and tool integration and interoperability. We envision UVL as a pivotal solution, addressing the limitations of prior attempts and fostering collaboration and innovation in the domain of software product line engineering.
Citations: 0
An empirical analysis of feature fusion task heads of ViT pre-trained models on OOD classification tasks
IF 3.7 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-01-30 DOI: 10.1016/j.jss.2025.112358
Mingxing Zhang, Jun Ai, Tao Shi
ViT pre-trained models have been widely used in various downstream tasks, and the structure of the task head has a significant impact on downstream performance. While it is common practice to concatenate the cls tokens of the last few layers of the ViT model for classification, there is limited research on whether the feature fusion structure matters for the model. This paper primarily discusses the impact of attention-mechanism-based fusion structures on the backbone network and on classification performance. We first examine the relationship between the dataset and the feature fusion task head, then explore how different locations of the fusion middle layer affect model performance and how the feature fusion task head influences the backbone network itself. Finally, we characterize the task head through the loss of models based on the feature fusion structure. Based on the empirical findings, we identify five important insights and provide recommendations for model structures during downstream task fine-tuning.
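To make the two kinds of task-head input concrete, the sketch below contrasts plain concatenation of per-layer cls tokens with a simple attention-weighted fusion. It is a generic illustration in pure Python: the scaled dot-product pooling and the `query` vector are assumptions chosen for brevity, not the fusion structure evaluated in the paper.

```python
import math
from typing import List

Vector = List[float]


def concat_fusion(cls_tokens: List[Vector]) -> Vector:
    """Concatenate the cls tokens of the selected layers into one feature vector."""
    return [x for token in cls_tokens for x in token]


def attention_fusion(cls_tokens: List[Vector], query: Vector) -> Vector:
    """Weight each layer's cls token by softmax(query . token / sqrt(dim)).

    query is a hypothetical (in practice learnable) vector; the output has
    the same dimension as a single cls token.
    """
    dim = len(query)
    scores = [
        sum(q * t for q, t in zip(query, token)) / math.sqrt(dim)
        for token in cls_tokens
    ]
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    return [
        sum(w * token[i] for w, token in zip(weights, cls_tokens))
        for i in range(dim)
    ]
```

Concatenation grows the head's input with the number of fused layers, whereas the attention-weighted variant keeps the input size fixed and lets the weights decide how much each layer contributes.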
Citations: 0
CG-FL: A data augmentation approach using context-aware genetic algorithm for fault localization
IF 3.7 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-01-29 DOI: 10.1016/j.jss.2025.112359
Jian Hu
Fault localization (FL) is a critical step in software debugging. Coverage-based fault localization (CFL), one of the most promising FL techniques, uses coverage information obtained from program entities executed by test cases to determine which entities are more likely to be faulty. However, CFL faces two main issues that limit its effectiveness. First, the code coverage data contains numerous statements that are irrelevant to the observed failure, which makes the search scope too large for FL. Second, the input coverage data is highly imbalanced because there are significantly more passing test cases than failing test cases, which biases the FL model toward the passing test cases. To address these problems, we propose CG-FL, a data augmentation approach using a context-aware genetic algorithm. Specifically, CG-FL first uses program slicing to construct a failure context for FL. Subsequently, CG-FL generates synthesized failing test cases by applying the genetic algorithm. To evaluate the effectiveness of CG-FL, we compared it with six state-of-the-art FL methods and three representative data augmentation methods on 420 versions of 9 benchmarks. The experimental findings clearly indicate that CG-FL substantially enhances the effectiveness of the six FL methods and outperforms the three data augmentation methods.
Citations: 0
MPLinker: Multi-template Prompt-tuning with adversarial training for Issue–commit Link recovery
IF 3.7 Tier 2 Computer Science Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-01-28 DOI: 10.1016/j.jss.2025.112351
Bangchao Wang , Yang Deng , Ruiqi Luo , Peng Liang , Tingting Bi
In recent years, the pre-training, prompting and prediction paradigm, known as prompt-tuning, has achieved significant success in Natural Language Processing (NLP). Issue–commit Link Recovery (ILR) in Software Traceability (ST) plays an important role in improving the reliability, quality, and security of software systems. Current ILR methods convert ILR into a classification task using pre-trained language models (PLMs) and dedicated neural networks. These methods do not fully utilize the semantic information embedded in PLMs and therefore fail to achieve acceptable performance. To address this limitation, we introduce a novel paradigm: Multi-template Prompt-tuning with adversarial training for issue–commit Link recovery (MPLinker). MPLinker redefines the ILR task as a cloze task via template-based prompt-tuning and incorporates adversarial training to enhance model generalization and reduce overfitting. We evaluated MPLinker on six open-source projects using a comprehensive set of performance metrics. The experimental results demonstrate that MPLinker achieves an average F1-score of 96.10%, Precision of 96.49%, Recall of 95.92%, MCC of 94.04%, AUC of 96.05%, and ACC of 98.15%, significantly outperforming existing state-of-the-art methods. Overall, MPLinker improves the performance and generalization of ILR models and introduces innovative concepts and methods for ILR. The replication package for MPLinker is available at https://github.com/WTU-intelligent-software-development/MPLinker.
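As a rough illustration of what recasting link recovery as a cloze task can look like, the sketch below fills several templates with an issue and a commit message, asks a mask-filling function for the probability of each label word, and averages the results. The templates, label words, and the `toy_fill_mask` stand-in are all hypothetical; the abstract does not list the templates or verbalizers MPLinker uses, and a real setup would query a pre-trained language model instead.

```python
from typing import Callable, Dict, List

# Hypothetical templates and label words (not taken from MPLinker).
TEMPLATES = [
    "Issue: {issue} Commit: {commit} Are they linked? [MASK]",
    'The commit "{commit}" [MASK] the issue "{issue}".',
]
# One verbalizer per template: label word -> class (1 = linked, 0 = not linked).
VERBALIZERS = [
    {"yes": 1, "no": 0},
    {"fixes": 1, "ignores": 0},
]

def build_prompts(issue: str, commit: str) -> List[str]:
    """Fill every template with the issue text and the commit message."""
    return [t.format(issue=issue, commit=commit) for t in TEMPLATES]

def predict_link(issue: str, commit: str,
                 fill_mask: Callable[[str, List[str]], Dict[str, float]]) -> bool:
    """Average the 'linked' probability over all templates and threshold at 0.5."""
    linked_mass = 0.0
    for prompt, verbalizer in zip(build_prompts(issue, commit), VERBALIZERS):
        probs = fill_mask(prompt, list(verbalizer))
        linked_mass += sum(p for word, p in probs.items() if verbalizer[word] == 1)
    return linked_mass / len(TEMPLATES) > 0.5

def toy_fill_mask(prompt: str, candidates: List[str]) -> Dict[str, float]:
    """Toy stand-in for a PLM: favors the first (positive) candidate when
    the word 'login' appears at least twice in the prompt."""
    p = 0.9 if prompt.lower().count("login") >= 2 else 0.1
    return {candidates[0]: p, candidates[1]: 1 - p}

linked = predict_link("Login fails on empty password", "Fix login null check", toy_fill_mask)
unlinked = predict_link("Crash on startup", "Update README", toy_fill_mask)
```

Averaging over templates is only one possible way to combine them; the aggregation strategy used in the paper is not described in the abstract.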
Citations: 0
Different approaches for testing body sensor network applications
IF 3.7 Tier 2 Computer Science Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-01-28 DOI: 10.1016/j.jss.2025.112336
Samira Silva , Ricardo Caldas , Patrizio Pelliccione , Antonia Bertolino
Body Sensor Networks (BSNs) offer a cost-effective way to monitor patients’ health and detect potential risks. Despite the growing interest attracted by BSNs, there is a lack of testing approaches for them. Testing a Body Sensor Network (BSN) is challenging due to its evolving nature, the complexity of sensor scenarios and their fusion, the potential necessity of third-party testing for certification, and the need to prioritize critical failures given limited resources. This paper addresses these challenges by proposing three BSN testing approaches: PASTA, ValComb, and TransCov. These approaches share common characteristics, which are described through a general framework called GATE4BSN. PASTA simulates patients with sensors and models sensor trends using a Discrete Time Markov Chain (DTMC). ValComb explores various health conditions by considering all sensor risk level combinations, while TransCov ensures full coverage of DTMC transitions. We empirically evaluate these approaches, comparing them with a baseline approach in terms of failure detection. The results demonstrate that PASTA, ValComb, and TransCov uncover previously undetected failures in an open-source BSN and outperform the baseline approach. Statistical analysis reveals that PASTA is the most effective, while ValComb is 76 times faster than PASTA and nearly as effective.
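The three ideas named in the abstract (DTMC-driven sensor trends, exhaustive risk-level combinations, and DTMC transition coverage) can be sketched in a few lines. Everything concrete below is an assumption made for illustration: the sensor names, the three risk levels, the transition probabilities, and the function names are hypothetical and are not taken from PASTA, ValComb, TransCov, or GATE4BSN.

```python
import itertools
import random

RISK_LEVELS = ["low", "moderate", "high"]           # hypothetical levels
SENSORS = ["heart_rate", "temperature", "oxygen"]   # hypothetical sensors

# Hypothetical DTMC over risk levels: state -> {next state: probability}.
TRANSITIONS = {
    "low": {"low": 0.8, "moderate": 0.2},
    "moderate": {"low": 0.3, "moderate": 0.5, "high": 0.2},
    "high": {"moderate": 0.4, "high": 0.6},
}

def simulate_trend(start, steps, rng):
    """Sample one sensor trend from the DTMC (the idea attributed to PASTA)."""
    state, trace = start, [start]
    for _ in range(steps):
        row = TRANSITIONS[state]
        state = rng.choices(list(row), weights=list(row.values()))[0]
        trace.append(state)
    return trace

def all_risk_combinations():
    """Enumerate every risk-level assignment across sensors (ValComb-style)."""
    return [dict(zip(SENSORS, combo))
            for combo in itertools.product(RISK_LEVELS, repeat=len(SENSORS))]

def uncovered_transitions(traces):
    """Return DTMC transitions that no trace exercises (TransCov-style check)."""
    required = {(s, t) for s, row in TRANSITIONS.items() for t in row}
    seen = {(a, b) for trace in traces for a, b in zip(trace, trace[1:])}
    return required - seen

trace = simulate_trend("low", 10, random.Random(0))
combos = all_risk_combinations()
missing = uncovered_transitions([["low", "low", "moderate", "low"]])
```

The contrast between the functions mirrors the trade-off reported in the abstract: sampling trends explores realistic sequences at a higher cost, while enumerating combinations is a fixed, finite set of inputs (27 in this toy configuration).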
Citations: 0