AI technologies are rapidly being integrated into society, offering numerous benefits but also raising significant ethical and social concerns. While some AI systems aim to improve efficiency and decision-making, they can also cause harmful impacts on individuals and society.
Objective:
This study examines both the immediate and systemic negative effects of AI systems, as well as the underlying factors that might contribute to these issues.
Method:
Using a multi-vocal literature review, we analyze 28 AI systems and their associated impacts, including discrimination, psychological and physical harm, and unfair treatment.
Results:
We identify key factors that might have led AI systems to operate in that manner and explain why these impacts may occur. Additionally, we propose initial concrete actions to mitigate these negative effects and promote the development of AI systems that align with ethical and social sustainability principles.
Impact:
By shedding light on these issues, we aim to raise awareness among researchers and developers, encouraging the adoption of more responsible and inclusive as well as concrete AI guidelines.
{"title":"AI systems’ negative social impact and factors","authors":"Nafen Haj Ahmad , Linnea Stigholt , Leticia Duboc , Birgit Penzenstadler","doi":"10.1016/j.infsof.2026.108038","DOIUrl":"10.1016/j.infsof.2026.108038","url":null,"abstract":"<div><h3>Context:</h3><div>AI technologies are rapidly being integrated into society, offering numerous benefits but also raising significant ethical and social concerns. While some AI systems aim to improve efficiency and decision-making, they can also cause harmful impacts on individuals and society.</div></div><div><h3>Objective:</h3><div>This study examines both the immediate and systemic negative effects of AI systems, as well as the underlying factors that might contribute to these issues.</div></div><div><h3>Method:</h3><div>Using a multi-vocal literature review, we analyze 28 AI systems and their associated impacts, including discrimination, psychological and physical harm, and unfair treatment.</div></div><div><h3>Results:</h3><div>We identify key factors that might have led AI systems to operate in that manner and explain why these impacts may occur. 
Additionally, we propose initial concrete actions to mitigate these negative effects and promote the development of AI systems that align with ethical and social sustainability principles.</div></div><div><h3>Impact:</h3><div>By shedding light on these issues, we aim to raise awareness among researchers and developers, encouraging the adoption of more responsible and inclusive as well as concrete AI guidelines.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"192 ","pages":"Article 108038"},"PeriodicalIF":4.3,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146038345","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-04-01Epub Date: 2026-01-12DOI: 10.1016/j.infsof.2026.108030
Chao Wen , Jie Liu , Liang Du
While large language models (LLMs) have demonstrated impressive ability in natural language processing (NLP), they are struggling for addressing the code generation tasks with complicated human intent. It is universally recognized that humans require insights into problem descriptions, elaborate plans from collaborative perspectives and consciously organize modules prior to coding implementation. To achieve this aim, we introduce consensus to boost multi-agent prompting approach to code generation tasks by imitating human developers. The insights into consensus among distinct candidate plans are leveraged by LLM agent for mitigating discrepancies. The discrepancies indicate overlooked crucial details that may lead to potential errors. Besides, the consensus plan is exploited to firstly construct code modules at distinct levels and then hierarchically organize them for final code generation. We conduct extensive experiments on eight program synthesis benchmarks, three of which are challenging problem-solving. Experimental results show that the proposed framework showcases the improved reflection on code generation, achieving new state-of-the-art (pass@1) results. Moreover, our approach consistently delivers superior performance across various programming languages and varying problem difficulties. Code available at https://github.com/AISP-group/CPCG.
{"title":"Consensus planning boosts LLM code generation","authors":"Chao Wen , Jie Liu , Liang Du","doi":"10.1016/j.infsof.2026.108030","DOIUrl":"10.1016/j.infsof.2026.108030","url":null,"abstract":"<div><div>While large language models (LLMs) have demonstrated impressive ability in natural language processing (NLP), they are struggling for addressing the code generation tasks with complicated human intent. It is universally recognized that humans require insights into problem descriptions, elaborate plans from collaborative perspectives and consciously organize modules prior to coding implementation. To achieve this aim, we introduce consensus to boost multi-agent prompting approach to code generation tasks by imitating human developers. The insights into consensus among distinct candidate plans are leveraged by LLM agent for mitigating discrepancies. The discrepancies indicate overlooked crucial details that may lead to potential errors. Besides, the consensus plan is exploited to firstly construct code modules at distinct levels and then hierarchically organize them for final code generation. We conduct extensive experiments on eight program synthesis benchmarks, three of which are challenging problem-solving. Experimental results show that the proposed framework showcases the improved reflection on code generation, achieving new state-of-the-art (pass@1) results. Moreover, our approach consistently delivers superior performance across various programming languages and varying problem difficulties. 
Code available at <span><span>https://github.com/AISP-group/CPCG</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"192 ","pages":"Article 108030"},"PeriodicalIF":4.3,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145979053","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-04-01Epub Date: 2026-01-05DOI: 10.1016/j.infsof.2026.108013
Caner Balim , Naim Karasekreter , Özkan Aslan
Context
The SOLID design principles are fundamental in object-oriented software development, promoting modularity, maintainability, and scalability. Manual verification of these principles in code is often time-consuming and error-prone, especially in large-scale, multilingual projects. Since adherence to SOLID principles is closely linked to software quality, automating this verification can significantly enhance code reliability.
Objectives
This study proposes a machine learning-based approach for the automatic classification of SOLID principle compliance in object-oriented code. Specifically, we investigate the effectiveness of embedding representations generated by three pretrained transformer models: LongCoder and StarCoder2, which are both code-oriented, and BigBird, a general-purpose model, in supporting principle-specific classification across Java and Python codebases.
Methods
We compiled a novel multi-label dataset consisting of 1103 real-world multi-class code units in Java and Python, annotated for compliance with five SOLID principles. Feature embeddings were extracted using the three transformer models. These embeddings were input to six different classifiers per principle. We evaluated model performance using stratified 5-fold cross-validation and reported accuracy, precision, recall, and F1 scores.
Results
Principles with well-defined structural characteristics, such as Interface Segregation (ISP) and Dependency Inversion (DIP), achieved high F1 scores (>90%). Semantically complex principles like Single Responsibility (SRP) and Liskov Substitution (LSP) yielded lower F1 scores (∼70–75%). Among the models, StarCoder2 combined with Multi-Layer Perceptron (MLP) consistently outperformed others across both Java and Python datasets. Statistical analyses confirmed that these performance differences are significant. Furthermore, comparisons with open-source large language models (DeepSeek-Coder-V2 and CodeLlama) demonstrated that the approach yields more stable and interpretable results across all principles.
Conclusion
Machine learning models leveraging code-specific embeddings can accurately identify structurally explicit SOLID principles. Code-oriented transformers such as StarCoder2 and LongCoder outperformed the general-purpose model BigBird, especially for principles requiring nuanced semantic understanding. Beyond its experimental contributions, the study provides practical value by enabling automated design-principle assessment in large codebases, reducing manual inspection effort, and offering a foundation for integration into software quality assurance tools and continuous integration pipelines.
{"title":"Automatic multi-language analysis of SOLID compliance via machine learning algorithms","authors":"Caner Balim , Naim Karasekreter , Özkan Aslan","doi":"10.1016/j.infsof.2026.108013","DOIUrl":"10.1016/j.infsof.2026.108013","url":null,"abstract":"<div><h3>Context</h3><div>The SOLID design principles are fundamental in object-oriented software development, promoting modularity, maintainability, and scalability. Manual verification of these principles in code is often time-consuming and error-prone, especially in large-scale, multilingual projects. Since adherence to SOLID principles is closely linked to software quality, automating this verification can significantly enhance code reliability.</div></div><div><h3>Objectives</h3><div>This study proposes a machine learning-based approach for the automatic classification of SOLID principle compliance in object-oriented code. Specifically, we investigate the effectiveness of embedding representations generated by three pretrained transformer models: LongCoder and StarCoder2, which are both code-oriented, and BigBird, a general-purpose model, in supporting principle-specific classification across Java and Python codebases.</div></div><div><h3>Methods</h3><div>We compiled a novel multi-label dataset consisting of 1103 real-world multi-class code units in Java and Python, annotated for compliance with five SOLID principles. Feature embeddings were extracted using the three transformer models. These embeddings were input to six different classifiers per principle. We evaluated model performance using stratified 5-fold cross-validation and reported accuracy, precision, recall, and F1 scores.</div></div><div><h3>Results</h3><div>Principles with well-defined structural characteristics, such as Interface Segregation (ISP) and Dependency Inversion (DIP), achieved high F1 scores (>90%). Semantically complex principles like Single Responsibility (SRP) and Liskov Substitution (LSP) yielded lower F1 scores (∼70–75%). 
Among the models, StarCoder2 combined with Multi-Layer Perceptron (MLP) consistently outperformed others across both Java and Python datasets. Statistical analyses confirmed that these performance differences are significant. Furthermore, comparisons with open-source large language models (DeepSeek-Coder-V2 and CodeLlama) demonstrated that the approach yields more stable and interpretable results across all principles.</div></div><div><h3>Conclusion</h3><div>Machine learning models leveraging code-specific embeddings can accurately identify structurally explicit SOLID principles. Code-oriented transformers such as StarCoder2 and LongCoder outperformed the general-purpose model BigBird, especially for principles requiring nuanced semantic understanding. Beyond its experimental contributions, the study provides practical value by enabling automated design-principle assessment in large codebases, reducing manual inspection effort, and offering a foundation for integration into software quality assurance tools and continuous integration pipelines.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"192 ","pages":"Article 108013"},"PeriodicalIF":4.3,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145928206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-04-01Epub Date: 2026-01-07DOI: 10.1016/j.infsof.2026.108017
Chin Khor, Robyn R. Lutz
Context:
It is difficult, time-consuming, and error-prone to detect misalignments between the variability requirements in configurable software and the source code intended to implement those requirements.
Objective:
The paper reports progress in checking the consistency between variability requirements and their implementation.
Method:
To automate the consistency checking of variability requirements and variability source code, we create a variability model of configurable features and constraints from the requirements specification. We evaluate the consistency of the variability model against a formal representation of the presence conditions controlling variability in the source code. We generate a traceability-rich consistency dashboard for the developer of any misalignments and a minimal set of configurations providing full variability code coverage for variability testing. The approach is implemented in an open-source prototype tool called VarCHEK.
Results:
VarCHEK was evaluated on three diverse, configurable software projects. VarCHEK accurately identified variability requirements not implemented in the source code, found variabilities in the source code not specified in the requirements, and provided more relevant information to the user for troubleshooting and resolving inconsistencies than is currently available.
Conclusion:
This paper describes a new, practical way to automatically identify inconsistencies between the variability requirements specified for configurable software and the source code developed to implement those requirements.
{"title":"Requirements-driven analysis of variability in configurable software","authors":"Chin Khor, Robyn R. Lutz","doi":"10.1016/j.infsof.2026.108017","DOIUrl":"10.1016/j.infsof.2026.108017","url":null,"abstract":"<div><h3>Context:</h3><div>It is difficult, time-consuming, and error-prone to detect misalignments between the variability requirements in configurable software and the source code intended to implement those requirements.</div></div><div><h3>Objective:</h3><div>The paper reports progress in checking the consistency between variability requirements and their implementation.</div></div><div><h3>Method:</h3><div>To automate the consistency checking of variability requirements and variability source code, we create a variability model of configurable features and constraints from the requirements specification. We evaluate the consistency of the variability model against a formal representation of the presence conditions controlling variability in the source code. We generate a traceability-rich consistency dashboard for the developer of any misalignments and a minimal set of configurations providing full variability code coverage for variability testing. The approach is implemented in an open-source prototype tool called VarCHEK.</div></div><div><h3>Results:</h3><div>VarCHEK was evaluated on three diverse, configurable software projects. 
VarCHEK accurately identified variability requirements not implemented in the source code, found variabilities in the source code not specified in the requirements, and provided more relevant information to the user for troubleshooting and resolving inconsistencies than is currently available.</div></div><div><h3>Conclusion:</h3><div>This paper describes a new, practical way to automatically identify inconsistencies between the variability requirements specified for configurable software and the source code developed to implement those requirements.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"192 ","pages":"Article 108017"},"PeriodicalIF":4.3,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145928333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-04-01Epub Date: 2026-01-08DOI: 10.1016/j.infsof.2026.108014
Wei Wang , Hourieh Khalajzadeh , John Grundy , Anuradha Madugalla , Humphrey O. Obie
Context:
Mobile health (mHealth) applications are widely used for chronic disease management, but usability and accessibility challenges persist due to the diverse needs of users. Adaptive User Interfaces (AUIs) offer a promising approach to personalizing interactions and improving user experience. However, their adoption remains limited, partly due to a lack of understanding of how users perceive and evaluate different adaptation strategies. Addressing this gap is crucial for advancing user-centered design and requirements engineering in software systems for health contexts.
Objective:
This study identifies key factors influencing user preferences and trade-offs in mHealth adaptation design.
Method:
A Discrete Choice Experiment (DCE) was conducted with 186 participants living with chronic conditions who regularly use mHealth applications. Each participant completed a series of choice tasks, selecting their preferred adaptation designs from scenarios composed of six attributes with varying levels. A mixed logit model was applied to examine preference heterogeneity. Subgroup analyses were also conducted to explore variations in preferences across age, gender, health condition, and coping mechanism.
Results:
Participants preferred adaptation designs that preserved usability, offered controllability, introduced changes infrequently, and applied small-scale modifications. Conversely, adaptations affecting frequently used functions and those involving caregiver input were generally viewed less favorably. These findings highlight key trade-offs that influence user acceptance of adaptive mHealth interfaces.
Conclusion:
This study employs a data-driven approach to quantify user preferences, identify key trade-offs, and reveal variations across demographic and behavioral subgroups through preference heterogeneity modeling. These insights provide actionable guidance for designing more user-centered adaptive interfaces and contribute to advancing requirements prioritization practices in software engineering—particularly in the context of health technologies.
{"title":"User-centric requirements prioritization in mHealth applications: Insights from a Discrete Choice Experiment","authors":"Wei Wang , Hourieh Khalajzadeh , John Grundy , Anuradha Madugalla , Humphrey O. Obie","doi":"10.1016/j.infsof.2026.108014","DOIUrl":"10.1016/j.infsof.2026.108014","url":null,"abstract":"<div><h3>Context:</h3><div>Mobile health (mHealth) applications are widely used for chronic disease management, but usability and accessibility challenges persist due to the diverse needs of users. Adaptive User Interfaces (AUIs) offer a promising approach to personalizing interactions and improving user experience. However, their adoption remains limited, partly due to a lack of understanding of how users perceive and evaluate different adaptation strategies. Addressing this gap is crucial for advancing user-centered design and requirements engineering in software systems for health contexts.</div></div><div><h3>Objective:</h3><div>This study identifies key factors influencing user preferences and trade-offs in mHealth adaptation design.</div></div><div><h3>Method:</h3><div>A Discrete Choice Experiment (DCE) was conducted with 186 participants living with chronic conditions who regularly use mHealth applications. Each participant completed a series of choice tasks, selecting their preferred adaptation designs from scenarios composed of six attributes with varying levels. A mixed logit model was applied to examine preference heterogeneity. Subgroup analyses were also conducted to explore variations in preferences across age, gender, health condition, and coping mechanism.</div></div><div><h3>Results:</h3><div>Participants preferred adaptation designs that preserved usability, offered controllability, introduced changes infrequently, and applied small-scale modifications. Conversely, adaptations affecting frequently used functions and those involving caregiver input were generally viewed less favorably. 
These findings highlight key trade-offs that influence user acceptance of adaptive mHealth interfaces.</div></div><div><h3>Conclusion:</h3><div>This study employs a data-driven approach to quantify user preferences, identify key trade-offs, and reveal variations across demographic and behavioral subgroups through preference heterogeneity modeling. These insights provide actionable guidance for designing more user-centered adaptive interfaces and contribute to advancing requirements prioritization practices in software engineering—particularly in the context of health technologies.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"192 ","pages":"Article 108014"},"PeriodicalIF":4.3,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145928334","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-04-01Epub Date: 2026-01-02DOI: 10.1016/j.infsof.2025.108008
Do Thi Thu Hien , Le Viet Tai Man , Le Trong Nhan , Phan Ngoc Yen Nhi , Hoang Thanh Lam , Nguyen Tan Cam , Van-Hau Pham
Context:
To keep pace with the rapid advancements in both the quality and complexity of malware, recent research has extensively employed machine learning (ML) and deep learning (DL) models to detect malicious software, particularly in the widely used Windows system. Despite demonstrating promising accuracy in identifying malware, these models remain vulnerable to adversarial attacks, where carefully modified malware samples can bypass detection. Consequently, there is a growing need to generate mutated malware by altering existing samples to comprehensively assess the robustness of ML/DL-based detectors. Unlike in the field of computer vision, functionality validation plays a crucial role in evaluating the effectiveness of these modified malware samples. Even if they achieve high evasion rates, any corruption in file format or execution can make them ineffective.
Objective:
To address this, we consider the essentials of functionality validation in creating malware samples by designing validators that can be used in reinforcement learning-based Windows malware mutation. Our focus is on workable and useful adversarial samples rather than the quantity.
Method:
Two different functionality validation methods are proposed, leveraging the static and dynamic analysis processes of PE files to capture the representation of their behaviors to verify the preservation of designed functionalities. They are then integrated into the RL framework to support the agent in recognizing actions that can cause broken samples.
Results:
Whether employing static or dynamic analysis for validation, the experimental results confirm that the proposed methods successfully maintain the original behavior of malware while enhancing its ability to evade ML-based detectors. Compared to other approaches, although the number of created adversarial malware drops due to stricter validation, a higher ratio of them are confirmed functionality-preserved.
Conclusions:
Functionality validation is an essential task in creating Windows malware mutants to ensure their reliability and usability in further assessment scenarios or real-life attacks.
{"title":"A study on functionality validation for windows malware mutating using reinforcement learning","authors":"Do Thi Thu Hien , Le Viet Tai Man , Le Trong Nhan , Phan Ngoc Yen Nhi , Hoang Thanh Lam , Nguyen Tan Cam , Van-Hau Pham","doi":"10.1016/j.infsof.2025.108008","DOIUrl":"10.1016/j.infsof.2025.108008","url":null,"abstract":"<div><h3>Context:</h3><div>To keep pace with the rapid advancements in both the quality and complexity of malware, recent research has extensively employed machine learning (ML) and deep learning (DL) models to detect malicious software, particularly in the widely used Windows system. Despite demonstrating promising accuracy in identifying malware, these models remain vulnerable to adversarial attacks, where carefully modified malware samples can bypass detection. Consequently, there is a growing need to generate mutated malware by altering existing samples to comprehensively assess the robustness of ML/DL-based detectors. Unlike in the field of computer vision, functionality validation plays a crucial role in evaluating the effectiveness of these modified malware samples. Even if they achieve high evasion rates, any corruption in file format or execution can make them ineffective.</div></div><div><h3>Objective:</h3><div>To address this, we consider the essentials of functionality validation in creating malware samples by designing validators that can be used in reinforcement learning-based Windows malware mutation. Our focus is on workable and useful adversarial samples rather than the quantity.</div></div><div><h3>Method:</h3><div>Two different functionality validation methods are proposed, leveraging the static and dynamic analysis processes of PE files to capture the representation of their behaviors to verify the preservation of designed functionalities. 
They are then integrated into the RL framework to support the agent in recognizing actions that can cause broken samples.</div></div><div><h3>Results:</h3><div>Whether employing static or dynamic analysis for validation, the experimental results confirm that the proposed methods successfully maintain the original behavior of malware while enhancing its ability to evade ML-based detectors. Compared to other approaches, although the number of created adversarial malware drops due to stricter validation, a higher ratio of them are confirmed functionality-preserved.</div></div><div><h3>Conclusions:</h3><div>Functionality validation is an essential task in creating Windows malware mutants to ensure their reliability and usability in further assessment scenarios or real-life attacks.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"192 ","pages":"Article 108008"},"PeriodicalIF":4.3,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145928335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-04-01Epub Date: 2026-01-13DOI: 10.1016/j.infsof.2026.108036
Deo Shao, Fredrick Ishengoma
Context
Software development is evolving with the emergence of Generative AI (GAI) tools that boost productivity, reduce manual errors, and accelerate workflows. However, little is known about how users perceive the usability, effectiveness, and security of these tools, especially among varied user populations.
Objectives
This study examines the determinants of GAI tool adoption. Specifically, it examines the behavioural determinants driving GAI adoption in software development and investigates how students compare with professionals in their perception of GAI adoption.
Methods
This study employs a cross-sectional, quantitative approach, comprising structured surveys distributed to software engineering students and senior engineers. The survey was designed based on the UTAUT framework. Data was collected from 305 participants (125 students, 133 professional developers, and 47 other tech professionals; industry total = 180). Descriptive statistics, t-tests, and regression analysis were conducted to analyse data and report trends and predictors of adoption intention.
Results
Social influence was the most important predictor of adoption intention (β = 0.945, p< 0.001), and its effect differed between groups. Compared to professionals, students are more cautious about security, though their responses are less technically specific. Professional developers employ systematic refinement strategies; a large percentage make extensive code changes to improve maintainability and ensure architectural alignment. By contrast, students exhibit different usage behaviour, focusing more on getting the final product working but less on code refinement and security issues.
Conclusion
This study fills the empirical gap in the diffusion of generative AI into software development. The findings suggest different patterns between students and professional developers. The results are of interest to educators, developers, and industry leaders. Future studies should examine adoption trends among a broader range of user groups and assess the long-term effects of GAI tools on software engineering.
{"title":"Empirical analysis of generative AI tool adoption in software development","authors":"Deo Shao, Fredrick Ishengoma","doi":"10.1016/j.infsof.2026.108036","DOIUrl":"10.1016/j.infsof.2026.108036","url":null,"abstract":"<div><h3>Context</h3><div>Software development is evolving with the emergence of Generative AI (GAI) tools that boost productivity, reduce manual errors, and accelerate workflows. However, little is known about how users perceive the usability, effectiveness, and security of these tools, especially among varied user populations.</div></div><div><h3>Objectives</h3><div>This study examines the determinants of GAI tool adoption. Specifically, it examines the behavioural determinants driving GAI adoption in software development and investigates how students compare with professionals in their perception of GAI adoption.</div></div><div><h3>Methods</h3><div>This study employs a cross-sectional, quantitative approach, comprising structured surveys distributed to software engineering students and senior engineers. The survey was designed based on the UTAUT framework. Data was collected from 305 participants (125 students, 133 professional developers, and 47 other tech professionals; industry total = 180). Descriptive statistics, <em>t</em>-tests, and regression analysis were conducted to analyse data and report trends and predictors of adoption intention.</div></div><div><h3>Results</h3><div>Social influence was the most important predictor of adoption intention (<em>β</em> = 0.945, <em>p</em>< 0.001), and its effect differed between groups. Compared to professionals, students are more cautious about security, though their responses are less technically specific. Professional developers employ systematic refinement strategies; a large percentage make extensive code changes to improve maintainability and ensure architectural alignment. 
By contrast, students exhibit different usage behaviour, focusing more on getting the final product working but less on code refinement and security issues.</div></div><div><h3>Conclusion</h3><div>This study fills the empirical gap in the diffusion of generative AI into software development. The findings suggest different patterns between students and professional developers. The results are of interest to educators, developers, and industry leaders. Future studies should examine adoption trends among a broader range of user groups and assess the long-term effects of GAI tools on software engineering.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"192 ","pages":"Article 108036"},"PeriodicalIF":4.3,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146038346","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
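The analysis described in the Methods section above (descriptive statistics, independent-samples t-tests between respondent groups, and a regression predicting adoption intention from UTAUT constructs such as social influence) can be sketched as follows. This is an illustrative sketch on synthetic data only: the group sizes mirror the study, but all scores, effect sizes, and variable names are invented, not taken from the paper's dataset.

```python
# Illustrative sketch of a UTAUT-style survey analysis on synthetic data.
# All values here are invented; only the pipeline shape follows the abstract.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Synthetic Likert-style perception scores for the two respondent groups
# (hypothetical means and spreads; group sizes mirror the study).
students = rng.normal(loc=3.4, scale=0.8, size=125)
professionals = rng.normal(loc=3.9, scale=0.7, size=133)

# Independent-samples (Welch) t-test comparing the two groups.
t_stat, p_value = stats.ttest_ind(students, professionals, equal_var=False)

# Simple OLS regression: adoption intention ~ social influence.
# The true slope (0.9) is chosen arbitrarily for the simulation.
n = 258
social_influence = rng.normal(loc=3.5, scale=1.0, size=n)
intention = 0.9 * social_influence + rng.normal(scale=0.5, size=n)
X = np.column_stack([np.ones(n), social_influence])   # intercept + predictor
beta, *_ = np.linalg.lstsq(X, intention, rcond=None)  # beta = [intercept, slope]

print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
print(f"regression slope (social influence) = {beta[1]:.3f}")
```

With enough respondents, the fitted slope recovers the simulated effect closely; in the real study a multi-predictor UTAUT model would replace this single-predictor sketch.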
Pub Date: 2026-03-01 | Epub Date: 2025-12-10 | DOI: 10.1016/j.infsof.2025.107992
Eduardo Guerra , Darja Smite , Xiaofeng Wang
Over twenty-five years after the Agile Manifesto was introduced, agile and lean practices have matured and become a relevant paradigm for software development. Their widespread adoption has led to documented success cases in the literature, but according to the original manifesto signatories, there are also troubling signs of superficial implementation and conceptual misalignments. This introduction to the Special Issue “Agile and Lean: How far did we come and what’s next?” reflects on this evolution and the current state of agility in research and practice. The six contributions to this special issue highlight critical themes, including organizational agility, large-scale adoption, team diversity, agile culture, and remote collaboration, exposing existing gaps between agile values and their realization. Based on this, the present introduction also points toward future research directions in agile methods, including topics like hybrid work, cultural maturity, sustaining agility, and integrating AI technologies into agile development. Ultimately, we argue that the strength of Agile lies not in specific and predictive frameworks and tools but in its human-centered philosophy of collaboration, learning, and continuous improvement.
{"title":"Introduction to the Special Issue - Agile and Lean: How far did we come and what’s next?","authors":"Eduardo Guerra , Darja Smite , Xiaofeng Wang","doi":"10.1016/j.infsof.2025.107992","DOIUrl":"10.1016/j.infsof.2025.107992","url":null,"abstract":"<div><div>Over twenty-five years after the Agile Manifesto was introduced, agile and lean practices have matured and become a relevant paradigm for software development. Their widespread adoption has led to documented success cases in the literature, but according to the original manifest signatories, there are also troubling signs of superficial implementation and conceptual misalignments. This introduction to the Special Issue “Agile and Lean: How far did we come and what’s next?” reflects on this evolution and current state of agility in research and practice. The six contributions to this special issue highlight critical themes, including organizational agility, large-scale adoption, team diversity, agile culture, and remote collaboration, exposing existing gaps between agile values and their realization. Based on this, the present introduction also points toward future research directions in agile methods, including topics like hybrid work, cultural maturity, sustaining agility, and integrating AI technologies into agile development. 
Ultimately, we argued that the strong point of Agile lies not in specific and predictive frameworks and tools but in its human-centered philosophy of collaboration, learning, and continuous improvement.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"191 ","pages":"Article 107992"},"PeriodicalIF":4.3,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145938581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-03-01 | Epub Date: 2025-12-19 | DOI: 10.1016/j.infsof.2025.107999
Tiara Rojas-Stambuk , Juan Pablo Sandoval Alcocer , Leonel Merino , Andres Neyem
Context:
Extended Reality (XR) technologies, including virtual, augmented, and mixed reality, offer novel ways to support software development through immersive and spatial representations of complex software artifacts. Although many XR-based tools have been introduced, their coverage of development activities, types of visualized software data, and evaluation quality remain unclear.
Objectives:
This paper aims to systematically review the use of XR in software development, focusing on the tasks supported, the types of data visualized, the visualization and interaction techniques, the evaluation methods, and the limitations reported.
Methods:
We conducted a systematic literature review of 77 primary studies published between 1995 and February 2025. Each study was analyzed and classified according to the supported development tasks, the types of visualized software data, the visualization techniques, the XR technologies used, the evaluation strategies, and the reported limitations.
Results:
Our findings show that most XR tools target software comprehension, primarily through structural visualizations. City metaphors and other metaphor-based techniques are the most common. However, XR remains underexplored in activities such as testing, performance analysis, and requirements engineering. Evaluation approaches are heterogeneous, often lacking methodological rigor, sufficient sample sizes, and standardized metrics.
Conclusion:
Although XR holds promise for improving software development, its current use is concentrated in a narrow set of activities and is hampered by limited evaluation quality. Challenges remain in tool integration, interaction design, and practical adoption. We identify key gaps and provide recommendations to guide future research toward broader and more effective use of XR in software engineering.
{"title":"On the use of extended reality to support software development activities: A systematic literature review","authors":"Tiara Rojas-Stambuk , Juan Pablo Sandoval Alcocer , Leonel Merino , Andres Neyem","doi":"10.1016/j.infsof.2025.107999","DOIUrl":"10.1016/j.infsof.2025.107999","url":null,"abstract":"<div><h3>Context:</h3><div>Extended Reality (XR) technologies, including virtual, augmented, and mixed reality, offer novel ways to support software development through immersive and spatial representations of complex software artifacts. Although many XR-based tools have been introduced, their coverage of development activities, types of visualized software data, and evaluation quality remain unclear.</div></div><div><h3>Objectives:</h3><div>This paper aims to systematically review the use of XR in software development, focusing on the tasks supported, the types of data visualized, the visualization and interaction techniques, the evaluation methods, and the limitations reported.</div></div><div><h3>Methods:</h3><div>We conducted a systematic review of the literature of 77 primary studies published between 1995 and February 2025. Each study was analyzed and classified according to the supported development tasks, the types of visualized software data, the visualization techniques used, the XR technologies used, the evaluation strategies, and the limitations.</div></div><div><h3>Results:</h3><div>Our findings show that most XR tools target software comprehension, primarily through structural visualizations. City metaphors and other metaphor-based techniques are the most common. However, XR remains underexplored in activities such as testing, performance analysis, and requirements engineering. 
Evaluation approaches are heterogeneous, often lacking methodological rigor, sufficient sample sizes, and standardized metrics.</div></div><div><h3>Conclusion:</h3><div>Although XR holds promise for improving software development, its current use is concentrated in a narrow set of activities and is hampered by limited evaluation quality. The challenges remain in tool integration, interaction design, and practical adoption. We identify key gaps and provide recommendations to guide future research toward broader and more effective use of XR in software engineering.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"191 ","pages":"Article 107999"},"PeriodicalIF":4.3,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145840129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-03-01 | Epub Date: 2025-12-16 | DOI: 10.1016/j.infsof.2025.108000
Elisa Yumi Nakagawa , Rick Kazman
Context:
Reference architectures have significantly contributed to software system development, leading to system standardization, interoperability, and reduced project cost and risk. Although they can considerably promote architectural knowledge reuse, most of them do not remain useful, or even survive, over the years. As a consequence, the cost, effort, and time spent designing them are wasted.
Objective:
We introduce the concept of sustainability to the reference architecture field and detail the view of sustainability-aware reference architectures.
Methods:
Based on existing initiatives and evidence from both reference and software architectures, we propose this novel view that contains two sustainability perspectives (of and in) and five sustainability pillars: technical, economic, organizational, social, and environmental.
Results:
Several open issues still exist, so we highlight breakthrough ideas for future research directions to stimulate community reflection.
Conclusions:
Changing the mindset towards this novel view on how to deal with reference architectures is necessary to ensure their long-term value.
{"title":"Sustainability-aware reference architectures: Needs and future research directions","authors":"Elisa Yumi Nakagawa , Rick Kazman","doi":"10.1016/j.infsof.2025.108000","DOIUrl":"10.1016/j.infsof.2025.108000","url":null,"abstract":"<div><h3>Context:</h3><div>Reference architectures have significantly contributed to software system development, leading to system standardization, interoperability, and project costs and risk reduction. Although they can considerably promote architectural knowledge reuse, most of them do not remain useful or even survive over the years. As a consequence, the cost, effort, and time designing them are wasted.</div></div><div><h3>Objective:</h3><div>We introduce the concept of sustainability to the reference architecture field and detail the view of sustainability-aware reference architecture.</div></div><div><h3>Methods:</h3><div>Based on existing initiatives and evidence from both reference and software architectures, we propose this novel view that contains two sustainability perspectives (<em>of</em> and <em>in</em>) and five sustainability pillars: technical, economic, organizational, social, and environmental.</div></div><div><h3>Results:</h3><div>Several open issues still exist, so we highlight some breakthrough ideas for future research directions to make the community think.</div></div><div><h3>Conclusions:</h3><div>Changing the mindset towards this novel view on how to deal with reference architectures is necessary to ensure their long-term value.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"191 ","pages":"Article 108000"},"PeriodicalIF":4.3,"publicationDate":"2026-03-01","publicationTypes":"Journal 
Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145797237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}