Requirements extraction from model-based systems engineering: A systematic literature review
Pub Date : 2025-03-04 DOI: 10.1016/j.jss.2025.112407
Jefferson L. Santos, Luiz Eduardo G. Martins, Jefferson Seide Molléri
Collaboration and easy data exchange are crucial in modern systems that involve hardware, electronics, software, and users. Requirements Engineering (RE) and Systems Engineering (SE) are challenging fields that require tool support to automate activities. Natural language (NL) requirement documents can create processing issues. To address these issues, detailed models have been developed to represent a system effectively. These models are intended to replace inconsistent documents over time through model-based methodologies such as Model-Based SE (MBSE). Among MBSE methodologies, Arcadia/Capella has proven its capabilities as a comprehensive tool in the SE community for defining and validating complex system architectures. This paper therefore investigates the tools, methods, techniques, and processes for extracting requirements from an MBSE environment or for generating models from NL requirements. Furthermore, it discusses how these approaches are applied specifically in Arcadia/Capella and how requirement transformations are addressed to support textual requirements. We conducted a systematic literature review (SLR) of 97 selected articles to examine advances in various aspects of these approaches. The results of this SLR uncovered several key findings with important implications for future research: model generation from NL dominates the field; transforming model-based requirements to NL requires more data; and requirements extraction in Arcadia/Capella needs more evidence.
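The distinction at the heart of this review, generating models from NL requirements versus extracting textual requirements from models, can be made concrete with a toy sketch. The snippet below is our own illustration, not a technique from any reviewed study: it maps a boilerplate "shall" requirement to a dictionary standing in for an MBSE model element, and the field names are hypothetical.

```python
import re

# Our minimal illustration of "model generation from NL" (hypothetical field
# names, not from any reviewed study): parse a boilerplate requirement
# sentence into a structured element such as a component/function pair.
REQ_PATTERN = re.compile(
    r"^The (?P<actor>.+?) shall (?P<action>\w+) (?P<object>.+?)\.?$",
    re.IGNORECASE,
)

def nl_to_model_element(requirement: str) -> dict:
    """Parse a 'The <actor> shall <action> <object>' requirement into a dict."""
    match = REQ_PATTERN.match(requirement.strip())
    if match is None:
        raise ValueError(f"Unsupported requirement phrasing: {requirement!r}")
    return {
        "component": match["actor"],       # candidate logical component
        "function": match["action"],       # candidate allocated function
        "exchange_item": match["object"],  # candidate functional exchange
    }

print(nl_to_model_element("The braking controller shall monitor wheel speed."))
# {'component': 'braking controller', 'function': 'monitor', 'exchange_item': 'wheel speed'}
```

Real approaches surveyed in the SLR are, of course, far richer than a single regular expression, but the sketch shows why NL-to-model generation is tractable for templated requirements while the reverse direction needs more linguistic context.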
{"title":"Requirements extraction from model-based systems engineering: A systematic literature review","authors":"Jefferson L. Santos , Luiz Eduardo G. Martins , Jefferson Seide Molléri","doi":"10.1016/j.jss.2025.112407","DOIUrl":"10.1016/j.jss.2025.112407","url":null,"abstract":"<div><div>Collaboration and easy data exchange are crucial in modern systems that involve hardware, electronics, software, and users. Requirement Engineering (RE) and Systems Engineering (SE) are challenging fields that require tool support to automate activities. Natural language (NL) requirement documents can create processing issues. To address these issues, detailed models have been developed to represent a system effectively. These models are intend to replace inconsistent documents over time by using model-based methodologies like Model-Based SE (MBSE). Within the MBSE methodologies, Arcadia/Capella has proven its capabilities as a comprehensive tool in the SE community to define and validate complex system architecture. Thus, this paper aims to investigate the tools, methods, techniques, or processes for extracting requirements from the MBSE environment or model generation from NL requirements. Furthermore, this discusses how these approaches are applied specifically in the Arcadia/Capella and how transforming requirements are addressed to support textual requirements. We conducted a systematic literature review (SLR) by selecting 97 articles to examine advances in this field in various aspects of these approaches. The results presented in this SLR uncovered several key findings that have important implications for future research, such as the dominance of the model generation from NL; transforming model-based requirements to NL requires more data; and the fact that requirements extraction in Arcadia/Capella needs more evidence.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"226 ","pages":"Article 112407"},"PeriodicalIF":3.7,"publicationDate":"2025-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143592613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A systematic mapping study of crowd knowledge enhanced software engineering research using Stack Overflow
Pub Date : 2025-03-03 DOI: 10.1016/j.jss.2025.112405
Minaoar Hossain Tanzil, Shaiful Chowdhury, Somayeh Modaberi, Gias Uddin, Hadi Hemmati
Developers continuously interact on crowd-sourced, community-based question-and-answer (Q&A) sites. Reportedly, ~30% of all software professionals visit the most popular Q&A site, StackOverflow (SO), every day. Software engineering (SE) research studies are also increasingly using SO data. A systematic mapping study is needed to understand the trends, implications, impact, and future research potential of utilizing SO data. Following a rigorous, reproducible mapping study approach, we collected 384 SO-based research articles from 18 reputed SE journals and conferences and categorized them into 10 facets (i.e., themes). We found that SO contributes to 85% of SE research compared with other popular Q&A sites such as Quora and Reddit. Eighteen SE domains directly benefited from SO data, with Recommender Systems and API Design and Evolution using SO data the most (15% and 16% of all SO-based research studies, respectively). API Design and Evolution and Machine Learning with/for SE show consistently rising publication counts, and Deep Learning Bug Analysis and Code Cloning are the research areas with the highest recent potential research impact. With the insights, recommendations, and facet-based categorized paper list from this mapping study, SE researchers can identify potential research areas matching their interests that utilize large-scale SO data.
{"title":"A systematic mapping study of crowd knowledge enhanced software engineering research using Stack Overflow","authors":"Minaoar Hossain Tanzil , Shaiful Chowdhury , Somayeh Modaberi , Gias Uddin , Hadi Hemmati","doi":"10.1016/j.jss.2025.112405","DOIUrl":"10.1016/j.jss.2025.112405","url":null,"abstract":"<div><div>Developers continuously interact in crowd-sourced community-based question-answer (Q&A) sites. Reportedly, <span><math><mo>∼</mo></math></span>30% of all software professionals visit the most popular Q&A site StackOverflow (SO) every day. Software engineering (SE) research studies are also increasingly using SO data. To find out the trend, implication, impact, and future research potential utilizing SO data, a systematic mapping study needs to be conducted. Following a rigorous reproducible mapping study approach, from 18 reputed SE journals and conferences, we collected 384 SO-based research articles and categorized them into 10 facets (i.e., themes). We found that SO contributes to 85% of SE research compared with popular Q&A sites such as Quora, and Reddit. We found that 18 SE domains directly benefited from SO data whereas <em>Recommender Systems</em>, and <em>API Design and Evolution</em> domains use SO data the most (15% and 16% of all SO-based research studies, respectively). <em>API Design and Evolution</em>, and <em>Machine Learning with/for SE</em> domains have consistent upward publication. <em>Deep Learning Bug Analysis</em> and <em>Code Cloning</em> research areas have the highest potential research impact recently. With the insights, recommendations, and facet-based categorized paper list from this mapping study, SE researchers can find out potential research areas according to their interest to utilize large-scale SO data.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"226 ","pages":"Article 112405"},"PeriodicalIF":3.7,"publicationDate":"2025-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143601162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sóley: Automated detection of logic vulnerabilities in Ethereum smart contracts using large language models
Pub Date : 2025-03-01 DOI: 10.1016/j.jss.2025.112406
Majd Soud, Waltteri Nuutinen, Grischa Liebel
Context:
Modern blockchains, such as Ethereum, support the deployment and execution of so-called smart contracts: autonomous digital programs that often control significant cryptocurrency value. Executing smart contracts incurs gas costs, paid by users, that bound a contract’s execution. Logic vulnerabilities in smart contracts can lead to excessive gas consumption and financial losses, and are often the root cause of high-impact cyberattacks.
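To make the gas mechanics concrete, here is a hedged sketch using web3.py (a real library; the RPC endpoint and addresses below are placeholders, and the snippet assumes a reachable Ethereum JSON-RPC node). It shows how gas is estimated and paid per transaction, which is exactly the budget a logic vulnerability can exhaust.

```python
from web3 import Web3

# Placeholder endpoint; substitute any real Ethereum JSON-RPC provider.
w3 = Web3(Web3.HTTPProvider("https://example-rpc.invalid"))

tx = {
    "from": "0x0000000000000000000000000000000000000001",  # placeholder
    "to": "0x0000000000000000000000000000000000000002",    # placeholder
    "value": w3.to_wei(0.01, "ether"),
}

# estimate_gas simulates the call; a logic flaw that loops excessively would
# surface here as a very high estimate, or revert once the gas limit is hit.
gas_units = w3.eth.estimate_gas(tx)
fee_wei = gas_units * w3.eth.gas_price
print(f"~{gas_units} gas units, ~{w3.from_wei(fee_wei, 'ether')} ETH in fees")
```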
Objective:
Our objective is threefold: (i) empirically investigate logic vulnerabilities in real-world smart contracts extracted from code changes on GitHub, (ii) introduce Sóley, an automated method for detecting logic vulnerabilities in smart contracts, leveraging Large Language Models (LLMs), and (iii) examine mitigation strategies employed by smart contract developers to address these vulnerabilities in real-world scenarios.
Method:
We obtained smart contracts and related code changes from GitHub. To address the first and third objectives, we qualitatively investigated available logic vulnerabilities using an open coding method. We identified these vulnerabilities and their mitigation strategies. For the second objective, we extracted various logic vulnerabilities, focusing on those containing inline assembly fragments. We then applied preprocessing techniques and trained the proposed Sóley model. We evaluated Sóley along with the performance of various LLMs and compared the results with the state-of-the-art baseline on the task of logic vulnerability detection.
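As a rough sense of what "training the proposed Sóley model" involves, the sketch below shows the general fine-tuning recipe for an LLM-based vulnerability classifier using Hugging Face transformers. This is our assumption-laden sketch, not the authors' pipeline: the checkpoint, labels, and toy data are all invented for illustration.

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Assumed checkpoint for illustration; the paper's actual model may differ.
checkpoint = "microsoft/codebert-base"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Toy labeled Solidity fragments; 1 = logic vulnerability, 0 = benign.
examples = Dataset.from_dict({
    "code": ["assembly { let x := mload(0x40) }", "function add(uint a, uint b) public pure returns (uint) { return a + b; }"],
    "label": [1, 0],
})

def tokenize(batch):
    return tokenizer(batch["code"], truncation=True, padding="max_length", max_length=256)

train_set = examples.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="soley-sketch", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=train_set,
)
trainer.train()
```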
Results:
Our results include the curation of a large-scale dataset comprising 50,000 Ethereum smart contracts, with a total of 428,569 labeled instances of smart contract vulnerabilities, including 171,180 logic-related vulnerabilities. Our analysis uncovered nine novel logic vulnerabilities, which we used to extend existing taxonomies. Furthermore, we introduced several mitigation strategies extracted from observed developer modifications in real-world scenarios. Experimental results show that Sóley outperforms existing approaches in automatically identifying logic vulnerabilities, achieving a 9% improvement in accuracy and up to a 24% improvement in F1-measure over the baseline. Interestingly, the LLMs proved effective at this task with minimal feature engineering. Despite these positive results, Sóley struggles to identify certain classes of logic vulnerabilities, which remains future work.
Conclusion:
Early identification of logic vulnerabilities from code changes can provide valuable insights into their detection and mitigation. Recent advancements, such as LLMs, show promise in detecting logic vulnerabilities and contributing to smart contract security and sustainability.
{"title":"Sóley: Automated detection of logic vulnerabilities in Ethereum smart contracts using large language models","authors":"Majd Soud, Waltteri Nuutinen, Grischa Liebel","doi":"10.1016/j.jss.2025.112406","DOIUrl":"10.1016/j.jss.2025.112406","url":null,"abstract":"<div><h3>Context:</h3><div>Modern blockchain, such as Ethereum, supports the deployment and execution of so-called smart contracts, autonomous digital programs with significant value of cryptocurrency. Executing smart contracts requires gas costs paid by users, which define the limits of the contract’s execution. Logic vulnerabilities in smart contracts can lead to excessive gas consumption, financial losses, and are often the root cause of high-impact cyberattacks.</div></div><div><h3>Objective:</h3><div>Our objective is threefold: (i) empirically investigate logic vulnerabilities in real-world smart contracts extracted from code changes on GitHub, (ii) introduce Sóley, an automated method for detecting logic vulnerabilities in smart contracts, leveraging Large Language Models (LLMs), and (iii) examine mitigation strategies employed by smart contract developers to address these vulnerabilities in real-world scenarios.</div></div><div><h3>Method:</h3><div>We obtained smart contracts and related code changes from GitHub. To address the first and third objectives, we qualitatively investigated available logic vulnerabilities using an open coding method. We identified these vulnerabilities and their mitigation strategies. For the second objective, we extracted various logic vulnerabilities, focusing on those containing inline assembly fragments. We then applied preprocessing techniques and trained the proposed Sóley model. We evaluated Sóley along with the performance of various LLMs and compared the results with the state-of-the-art baseline on the task of logic vulnerability detection.</div></div><div><h3>Results:</h3><div>Our results include the curation of a large-scale dataset comprising 50,000 Ethereum smart contracts, with a total of 428,569 labeled instances of smart contract vulnerabilities, including 171,180 logic-related vulnerabilities. Our analysis uncovered nine novel logic vulnerabilities, which we used to extend existing taxonomies. Furthermore, we introduced several mitigation strategies extracted from observed developer modifications in real-world scenarios. Experimental results show that Sóley outperforms existing approaches in automatically identifying logic vulnerabilities, achieving a 9% improvement in accuracy and a maximum improvement of 24% in F1-measure over the Baseline. Interestingly, the efficacy of LLMs in this task was evident with minimal feature engineering. Despite the positive results, Sóley struggles to identify certain classes of logic vulnerabilities, which remain for future work.</div></div><div><h3>Conclusion:</h3><div>Early identification of logic vulnerabilities from code changes can provide valuable insights into their detection and mitigation. 
Recent advancements, such as LLMs, show promise in detecting logic vulnerabilities and contributing to smart contract security and sustainability.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"226 ","pages":"Article 112406"},"PeriodicalIF":3.7,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143550871","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pandemic pedagogy: Evaluating remote education strategies during COVID-19
Pub Date : 2025-02-27 DOI: 10.1016/j.jss.2025.112392
Daniel Russo
The COVID-19 pandemic triggered an unprecedented transformation in the educational landscape, requiring universities to swiftly pivot from in-person to online instruction. This rapid transition left many educators navigating the complexities of remote teaching for the first time. Now that we have moved past the pandemic, we present a critical retrospective study to analyze and assess the remote teaching practices employed during this challenging period. By conducting a cross-sectional analysis of 300 computer science students who experienced a full year of online education during the lockdown, we discovered that while remote teaching practices had a moderate impact on learning outcomes, they significantly influenced student satisfaction. Importantly, these trends were not isolated; they reflect a shared experience across various demographics, including country, gender, and educational background. This research delivers vital evidence-based recommendations that can guide educational strategies in the event of future challenges. By applying these insights, we can enhance both student satisfaction and the effectiveness of learning in online settings, ensuring that we are better prepared for whatever lies ahead.
{"title":"Pandemic pedagogy: Evaluating remote education strategies during COVID-19","authors":"Daniel Russo","doi":"10.1016/j.jss.2025.112392","DOIUrl":"10.1016/j.jss.2025.112392","url":null,"abstract":"<div><div>The COVID-19 pandemic triggered an unprecedented transformation in the educational landscape, requiring universities to swiftly pivot from in-person to online instruction. This rapid transition left many educators navigating the complexities of remote teaching for the first time. Now that we have moved past the pandemic, we present a critical retrospective study to analyze and assess the remote teaching practices employed during this challenging period. By conducting a cross-sectional analysis of 300 computer science students who experienced a full year of online education during the lockdown, we discovered that while remote teaching practices had a moderate impact on learning outcomes, they significantly influenced student satisfaction. Importantly, these trends were not isolated; they reflect a shared experience across various demographics, including country, gender, and educational background. This research delivers vital evidence-based recommendations that can guide educational strategies in the event of future challenges. By applying these insights, we can enhance both student satisfaction and the effectiveness of learning in online settings, ensuring that we are better prepared for whatever lies ahead.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"226 ","pages":"Article 112392"},"PeriodicalIF":3.7,"publicationDate":"2025-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143526927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
NoSQL database education: A review of models, tools and teaching methods
Pub Date : 2025-02-25 DOI: 10.1016/j.jss.2025.112391
Nirnaya Tripathi
NoSQL databases are essential for managing modern data-intensive applications. While SQL education is a crucial part of software engineering and computer science curricula, it is insufficient on its own to address the rise of big data and cloud infrastructures. Despite extensive research on SQL education, there is limited exploration of NoSQL education, particularly of teaching methods and data models. This study addresses that gap with a systematic literature review of NoSQL database education, assessing current research, teaching practices, models, tools, scalability, and security mechanisms, and offering a framework for integrating NoSQL into academic curricula. Out of 386 articles, 28 were selected for detailed analysis, focusing on NoSQL teaching methods, models, and curriculum development. Findings revealed that document-oriented and graph databases, especially MongoDB, Cassandra, and Neo4j, are the most commonly taught, and that project-based learning is the most common teaching method. Challenges identified include adapting to technological advancements, addressing diverse student needs, and the shift to online learning. This review contributes valuable insights into NoSQL education and offers recommendations for improving teaching practices in software engineering curricula.
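For readers unfamiliar with the data models involved, here is a classroom-style sketch (our illustration, not an example from the review) of the document model that makes MongoDB a common teaching vehicle. It assumes a local MongoDB instance on the default port; the database, collection, and field names are invented.

```python
from pymongo import MongoClient

# Assumes a MongoDB server at the default localhost:27017.
client = MongoClient("mongodb://localhost:27017")
courses = client["teaching_demo"]["courses"]

# Unlike a normalized SQL schema, one document nests related data directly.
courses.insert_one({
    "code": "CS305",
    "title": "NoSQL Databases",
    "topics": ["document stores", "graph databases"],
    "enrollment": {"capacity": 60, "registered": 48},
})

# Query on a nested field, no joins required.
for course in courses.find({"enrollment.registered": {"$gt": 40}},
                           {"_id": 0, "code": 1, "title": 1}):
    print(course)
```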
{"title":"NoSQL database education: A review of models, tools and teaching methods","authors":"Nirnaya Tripathi","doi":"10.1016/j.jss.2025.112391","DOIUrl":"10.1016/j.jss.2025.112391","url":null,"abstract":"<div><div>NoSQL databases are essential for managing modern data-intensive applications. While SQL education is a crucial part of the software engineering and computer science curriculum, it is insufficient in addressing the rise of big data and cloud infrastructures. Despite extensive research on SQL education, there is limited exploration of NoSQL education, particularly in teaching methods and data models. This study addresses this gap by conducting a systematic literature review on NoSQL database education, aiming to assess current research, teaching practices, models, tools, scalability, and security mechanisms while offering a framework for integrating NoSQL into academic curricula. Out of 386 articles, 28 were selected for detailed analysis, focusing on NoSQL teaching methods, models, and curriculum development. Findings revealed that document-oriented and graph databases, especially MongoDB, Cassandra, and Neo4j, are the most taught. The project-based learning approach was the most common teaching method. Challenges identified include adapting to technological advancements, addressing diverse student needs, and the shift to online learning. This review contributes valuable insights into NoSQL education and offers recommendations for improving teaching practices in software engineering curricula.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"226 ","pages":"Article 112391"},"PeriodicalIF":3.7,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143534027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Towards resource-efficient reactive and proactive auto-scaling for microservice architectures
Pub Date : 2025-02-25 DOI: 10.1016/j.jss.2025.112390
Hussain Ahmad, Christoph Treude, Markus Wagner, Claudia Szabo
Microservice architectures have become increasingly popular in both academia and industry, providing enhanced agility, elasticity, and maintainability in software development and deployment. To simplify scaling operations in microservice architectures, container orchestration platforms such as Kubernetes feature Horizontal Pod Auto-scalers (HPAs) designed to adjust the resources of microservices to accommodate fluctuating workloads. However, existing HPAs are not suitable for resource-constrained environments, as they make scaling decisions based on the individual resource capacities of microservices, leading to service unavailability, resource mismanagement, and financial losses. Furthermore, the inherent delay in initializing and terminating microservice pods hinders HPAs from responding to workload fluctuations in a timely manner, further exacerbating these issues. To address these concerns, we propose Smart HPA and ProSmart HPA, reactive and proactive resource-efficient horizontal pod auto-scalers, respectively. Smart HPA employs a reactive scaling policy that facilitates resource exchange among microservices, optimizing auto-scaling in resource-constrained environments. For ProSmart HPA, we develop a machine-learning-driven resource-efficient scaling policy that proactively manages resource demands to address delays caused by microservice pod startup and termination, while enabling preemptive resource sharing in resource-constrained environments. Our experimental results show that Smart HPA outperforms the Kubernetes baseline HPA, while ProSmart HPA exceeds both Smart HPA and Kubernetes HPA by reducing resource overutilization, overprovisioning, and underprovisioning, and by increasing resource allocation to microservice applications.
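The "resource exchange among microservices" the abstract attributes to Smart HPA can be pictured with a short sketch. This is our guess at the general idea under stated assumptions, not the authors' algorithm: services demanding fewer replicas than their quota lend spare capacity to services whose demand exceeds theirs, so the cluster-wide budget is respected.

```python
from dataclasses import dataclass

@dataclass
class Service:
    name: str
    quota: int    # max replicas this service may claim on its own
    desired: int  # replicas demanded by current load (e.g., CPU-based)

def exchange_capacity(services: list[Service]) -> dict[str, int]:
    """Grant each service min(desired, quota), then redistribute the spare."""
    granted = {s.name: min(s.desired, s.quota) for s in services}
    spare = sum(s.quota - granted[s.name] for s in services)
    # Hand spare capacity to over-demand services, first come first served.
    for s in services:
        deficit = s.desired - granted[s.name]
        take = min(deficit, spare)
        granted[s.name] += take
        spare -= take
    return granted

cluster = [Service("cart", quota=4, desired=7), Service("search", quota=6, desired=2)]
print(exchange_capacity(cluster))  # {'cart': 7, 'search': 2}, within total quota 10
```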
{"title":"Towards resource-efficient reactive and proactive auto-scaling for microservice architectures","authors":"Hussain Ahmad , Christoph Treude , Markus Wagner , Claudia Szabo","doi":"10.1016/j.jss.2025.112390","DOIUrl":"10.1016/j.jss.2025.112390","url":null,"abstract":"<div><div>Microservice architectures have become increasingly popular in both academia and industry, providing enhanced agility, elasticity, and maintainability in software development and deployment. To simplify scaling operations in microservice architectures, container orchestration platforms such as Kubernetes feature Horizontal Pod Auto-scalers (HPAs) designed to adjust the resources of microservices to accommodate fluctuating workloads. However, existing HPAs are not suitable for resource-constrained environments, as they make scaling decisions based on the individual resource capacities of microservices, leading to service unavailability, resource mismanagement, and financial losses. Furthermore, the inherent delay in initializing and terminating microservice pods hinders HPAs from timely responding to workload fluctuations, further exacerbating these issues. To address these concerns, we propose Smart HPA and ProSmart HPA, reactive and proactive resource-efficient horizontal pod auto-scalers respectively. Smart HPA employs a reactive scaling policy that facilitates resource exchange among microservices, optimizing auto-scaling in resource-constrained environments. For ProSmart HPA, we develop a machine-learning-driven resource-efficient scaling policy that proactively manages resource demands to address delays caused by microservice pod startup and termination, while enabling preemptive resource sharing in resource-constrained environments. Our experimental results show that Smart HPA outperforms the Kubernetes baseline HPA, while ProSmart HPA exceeds both Smart HPA and Kubernetes HPA by reducing resource overutilization, overprovisioning, and underprovisioning, and increasing resource allocation to microservice applications.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"225 ","pages":"Article 112390"},"PeriodicalIF":3.7,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143509674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
KPAMA: A Kubernetes based tool for Mitigating ML system Aging
Pub Date : 2025-02-25 DOI: 10.1016/j.jss.2025.112389
Wenjie Ding, Zhihao Liu, Xuhui Lu, Xiaoting Du, Zheng Zheng
As machine learning (ML) systems continue to evolve and see wider application, their user bases and system sizes expand, a trend made particularly evident by the widespread adoption of large language models. Infrastructure such as cloud services and computing hardware is increasingly adopted as the foundation of ML system environments, supporting continuous training and inference services. Nevertheless, increased data volumes, computational complexity, and extended run times have been shown to challenge the stability, efficiency, and availability of ML systems, precipitating system aging. To address this issue, we develop a novel solution, KPAMA, which leverages Kubernetes, the leading container orchestration platform, to autoscale computing workflows and resources and thereby mitigate system aging. KPAMA employs a hybrid model to predict key aging metrics and uses decision and anti-oscillation algorithms to achieve system resource autoscaling. Our experiments indicate that KPAMA markedly mitigates system aging and enhances task reliability compared to the standard Horizontal Pod Autoscaler and to systems without scaling capabilities.
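One ingredient the abstract names, the anti-oscillation algorithm, is easy to illustrate. The sketch below is our illustration of a generic anti-oscillation rule, not KPAMA's actual logic: a scaling decision is suppressed unless it both exceeds a dead band and falls past a cooldown window since the last action, which prevents the scaler from flapping between sizes.

```python
import time

class AntiOscillationScaler:
    """Generic dead-band + cooldown guard around scaling decisions (our sketch)."""

    def __init__(self, dead_band: float = 0.1, cooldown_s: float = 300.0):
        self.dead_band = dead_band    # ignore relative changes under +/-10%
        self.cooldown_s = cooldown_s  # minimum seconds between actions
        self._last_action = float("-inf")

    def decide(self, current: int, predicted_need: int, now: float | None = None) -> int:
        now = time.monotonic() if now is None else now
        change = (predicted_need - current) / max(current, 1)
        if abs(change) < self.dead_band or now - self._last_action < self.cooldown_s:
            return current  # hold: change too small, or too soon after last action
        self._last_action = now
        return predicted_need

scaler = AntiOscillationScaler()
print(scaler.decide(current=4, predicted_need=5, now=0.0))   # 5 (scales up)
print(scaler.decide(current=5, predicted_need=4, now=10.0))  # 5 (held: cooldown)
```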
{"title":"KPAMA: A Kubernetes based tool for Mitigating ML system Aging","authors":"Wenjie Ding , Zhihao Liu , Xuhui Lu , Xiaoting Du , Zheng Zheng","doi":"10.1016/j.jss.2025.112389","DOIUrl":"10.1016/j.jss.2025.112389","url":null,"abstract":"<div><div>As machine learning (ML) systems continue to evolve and be applied, their user base and system size also expand. This expansion is particularly evident with the widespread adoption of large language models. Currently, the infrastructure supporting ML systems, such as cloud services and computing hardware, which are increasingly becoming foundational to the ML system environment, is increasingly adopted to support continuous training and inference services. Nevertheless, it has been shown that the increased data volume, complexity of computations, and extended run times challenge the stability of ML systems, efficiency, and availability, precipitating system aging. To address this issue, we develop a novel solution, KPAMA, leveraging Kubernetes, the leading container orchestration platform, to enhance the autoscaling of computing workflows and resources, effectively mitigating system aging. KPAMA employs a hybrid model to predict key aging metrics and uses decision and anti-oscillation algorithms to achieve system resource autoscaling. Our experiments indicate that KPAMA markedly mitigates system aging and enhances task reliability compared to the standard Horizontal Pod Autoscaler and systems without scaling capabilities.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"226 ","pages":"Article 112389"},"PeriodicalIF":3.7,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143534024","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cyber-physical systems with Human-in-the-Loop: A systematic review of socio-technical perspectives
Pub Date : 2025-02-24 DOI: 10.1016/j.jss.2025.112348
Torkil Clemmensen, Mahyar Tourchi Moghaddam, Jacob Nørbjerg
Understanding and designing cyber-physical systems (CPS) with humans in the loop (HITL) is a basic cross-scientific research problem with large implications for industry. Current software engineering knowledge already explains how to include humans in the operation of machines in terms of interfaces, architectures, adaptive systems, and design methodologies for including the Human-in-the-Loop. This paper extends that knowledge with a systematic review of socio-technical (ST) perspectives on CPS with HITL. The review was software-engineering focused: it searched the body of research on CPS with HITL and, within that body, selected only papers that included socio-technical perspectives. The results indicated four main areas in the ST literature. Validating these insights through interviews with industry CPS experts showed some alignment but also fundamental differences between the ST literature's insights and the experts' viewpoints. The discussion identifies useful crossings between the ST literature and research into CPS with HITL adaptation, and touches on the issue of non-alignment with industry practice. The conclusion is that ST perspectives on the body of knowledge on CPS with HITL have much to offer researchers in terms of innovative ways to look at the HITL, but the literature needs further development before industry experts can use it effectively. Future research possibilities are outlined.
{"title":"Cyber-physical systems with Human-in-the-Loop: A systematic review of socio-technical perspectives","authors":"Torkil Clemmensen , Mahyar Tourchi Moghaddam , Jacob Nørbjerg","doi":"10.1016/j.jss.2025.112348","DOIUrl":"10.1016/j.jss.2025.112348","url":null,"abstract":"<div><div>Understanding and designing Cyber-physical systems (CPS) with humans in the loop (HITL) is a basic cross-scientific research problem with large implications for industry. The current software engineering knowledge already explains how to include the humans in the operation of the machines in terms of interfaces, architectures, adaptive systems, and design methodologies for including the Human-in-the-Loop. This paper extends existing knowledge with a systematic review of socio-technical perspectives on CPS with HITL. The review was software engineering focused, as it searched the body of research on CPS with HITL, and only within that body, those papers that included socio-technical perspectives. The results indicated four main areas in the ST literature. Validating these insights by expert interviews with industry CPS experts showed some alignment and also fundamental differences between the socio-technical literature (ST literature) insights and the industry experts’ viewpoints. The discussion identifies useful crossings between the ST literature and research into CPS with HITL adaption, and touch on the issues of non-alignments in industry practice. The conclusion is that the ST perspectives on the body of knowledge on CPS with HITL has much to offer researchers in terms of innovative ways to look at the HITL, but the literature needs further development before industry experts can effectively use it. Future research possibilities are outlined.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"226 ","pages":"Article 112348"},"PeriodicalIF":3.7,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143526926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adaptive graph neural network protection algorithm based on differential privacy
Pub Date : 2025-02-21 DOI: 10.1016/j.jss.2025.112386
JunJie Yu, Yong Li, ZhanDong Liu, QianRen Yang
Graph Neural Networks (GNNs) have gained widespread adoption across various fields due to their superior capability in processing graph-structured data. Nevertheless, these models can unintentionally disclose sensitive user information, and current differential privacy algorithms for GNNs exhibit constrained adaptability and prolonged runtimes. To address these issues, this paper introduces an adaptive GNN protection algorithm grounded in differential privacy. The algorithm offers robust privacy safeguards at both the node and edge levels, employing a bespoke mean-and-variance-based normalization to effectively manage non-uniform data and outliers, thereby enhancing the model's adaptability to diverse data distributions. Furthermore, an early stopping strategy markedly decreases runtime with negligible influence on accuracy, improving computational efficiency. Experimental results indicate that this approach improves the model's predictive accuracy while significantly reducing its computational time.
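The two ingredients the abstract names, mean/variance normalization and a noise-based privacy safeguard, combine naturally. The sketch below is our illustration of that combination, not the paper's algorithm: standardization tames outliers, clipping bounds each node's contribution, and Gaussian noise follows the usual Gaussian-mechanism pattern. The sigma here is illustrative, not a calibrated (epsilon, delta) accounting.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize_and_privatize(node_features: np.ndarray, clip: float = 3.0,
                            sigma: float = 0.5) -> np.ndarray:
    # Standardize each feature column so outliers stop dominating updates.
    mean = node_features.mean(axis=0)
    std = node_features.std(axis=0) + 1e-8
    z = (node_features - mean) / std
    # Clip to a fixed range so each node's contribution is bounded...
    z = np.clip(z, -clip, clip)
    # ...then add Gaussian noise scaled to that bound (Gaussian mechanism).
    return z + rng.normal(0.0, sigma * clip, size=z.shape)

features = rng.normal(size=(5, 3))
features[0] *= 100  # inject an outlier row
print(normalize_and_privatize(features).round(2))
```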
{"title":"Adaptive graph neural network protection algorithm based on differential privacy","authors":"JunJie Yu, Yong Li, ZhanDong Liu, QianRen Yang","doi":"10.1016/j.jss.2025.112386","DOIUrl":"10.1016/j.jss.2025.112386","url":null,"abstract":"<div><div>Graph Neural Networks (GNNs) have gained widespread adoption across various fields due to their superior capability in processing graph-structured data. Nevertheless, these models are susceptible to unintentionally disclosing sensitive user information. Current differential privacy algorithms for graph neural networks exhibit constrained adaptability and prolonged runtimes. To address these issues, this paper introduces an adaptive GNN protection algorithm grounded in differential privacy. The algorithm offers robust privacy safeguards at both node and edge levels, employing a bespoke normalization approach based on mean and variance to effectively manage data non-uniformity and outliers, thereby enhancing the model’s adaptability to diverse data distributions. Furthermore, the implementation of an early stopping strategy markedly decreases runtime while exerting negligible influence on accuracy, thus enhancing computational efficiency. Experimental results indicate that this approach not only improves the model’s predictive accuracy but also significantly reduces its computational time.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"225 ","pages":"Article 112386"},"PeriodicalIF":3.7,"publicationDate":"2025-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143509676","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improving user-oriented fairness in recommendation via data augmentation: Don’t worry about inactive users
Pub Date : 2025-02-21 DOI: 10.1016/j.jss.2025.112387
Yong Wang, Huadong Zhou, Gui-Fu Lu, Cuiyun Gao, Shuai Meng
A recommendation system is considered unfair when it does not perform equally well for user groups defined by specific user attributes. In recent research, users are divided into an active group and an inactive group according to their number of interaction records in the recommendation system. Intuitively, increasing the number of inactive users’ interaction records should improve the fairness of the recommendation system. Existing data augmentation techniques can increase interaction records, but they usually fail to deeply mine user interaction patterns or to generate context-related feedback, and thus cannot effectively improve recommendation quality for inactive users. To resolve this problem, we use Large Language Models (LLMs) to mine users’ historical interaction records for data augmentation, improving recommendation quality for the inactive user group. Experimental results on four classic baseline recommendation algorithms show that our data augmentation method for the inactive user group effectively alleviates the poor recommendation quality caused by low interaction with the recommendation system, reduces the recommendation quality gap with the active user group, and further improves the user-group fairness of the recommendation system.
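The augmentation loop described here can be sketched in a few lines. This is our illustration of the general idea, not the authors' pipeline: prompt an LLM with an inactive user's sparse history and parse plausible pseudo-interactions to append to the training set. `llm_complete` is a hypothetical stand-in for whatever chat-completion client is available.

```python
def llm_complete(prompt: str) -> str:
    # Hypothetical placeholder: plug in a real LLM client here.
    raise NotImplementedError("plug in a real LLM client")

def augment_inactive_user(user_id: str, history: list[str], k: int = 5) -> list[tuple[str, str]]:
    """Generate k pseudo-interactions for a sparse-history user (our sketch)."""
    prompt = (
        "A user interacted with these items: " + ", ".join(history) + ". "
        f"Suggest {k} further items this user would plausibly interact with, "
        "one per line, item title only."
    )
    suggestions = [line.strip() for line in llm_complete(prompt).splitlines() if line.strip()]
    # Pseudo-interactions, to be merged into the (user, item) training pairs.
    return [(user_id, item) for item in suggestions[:k]]

# Usage: training_pairs.extend(augment_inactive_user("u42", ["The Matrix", "Blade Runner"]))
```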
{"title":"Improving user-oriented fairness in recommendation via data augmentation: Don’t worry about inactive users","authors":"Yong Wang , Huadong Zhou , Gui-Fu Lu , Cuiyun Gao , Shuai Meng","doi":"10.1016/j.jss.2025.112387","DOIUrl":"10.1016/j.jss.2025.112387","url":null,"abstract":"<div><div>A recommendation system is considered unfair when it does not perform equally well for different user groups according to users’ specific attributes. In recent research, the user groups are divided into active user group and inactive user group according to the number of interaction records in a recommendation system. Intuitively, increasing the number of inactive users’ interaction records would improve the fairness of the recommendation system. Existing data augmentation techniques can increase interaction records, however they usually fail to deeply mine user interaction patterns and fail to generate context-related feedback, which cannot effectively improve the quality of recommendations for inactive users. To resolve the problem, we use the Large Language Models (LLMs) to mine user historical interaction records to achieve data augmentation, which improve the quality of recommendations for inactive user groups. Experimental results on four classic baseline recommendation algorithms show that our data augmentation method for the inactive user group can effectively alleviate the poor recommendation quality caused by the low interaction with the recommendation system, reduce the recommendation quality gap with active user group, and further improve the user group fairness of the recommendation system.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"225 ","pages":"Article 112387"},"PeriodicalIF":3.7,"publicationDate":"2025-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143509596","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}