
Information and Software Technology: Latest Articles

Test automation with selenium: A survey
IF 4.3; Zone 2 (Computer Science); Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2026-02-12; DOI: 10.1016/j.infsof.2026.108077
Boni García, Filippo Ricca, Maurizio Leotta, Mario Munoz-Organero

Context:

Selenium is a widely used tool for end-to-end (E2E) web testing. However, it is often criticized for brittleness, slowness, and flakiness. In parallel, newer frameworks and Artificial Intelligence (AI) are reshaping the test automation landscape.

Objectives:

This study aims to investigate current practices, challenges, and emerging trends in Selenium-based test automation.

Methods:

We designed and executed a large-scale online survey targeting software professionals who use Selenium. The questionnaire covered technical practices, tooling, AI usage, perceived challenges, and competing tools.

Results:

A total of 88 complete responses were analyzed using descriptive statistics and thematic coding. The results show that Selenium remains the dominant tool for regression and functional testing, primarily using the Page Object Model (POM) pattern. The most reported challenges are related to assertability, asynchrony, and brittleness. AI tools like ChatGPT are gaining traction for test generation. Playwright is the most prominent alternative.

Conclusion:

While Selenium is recognized as a cornerstone in many automation workflows, its limited native test-specific features present a significant drawback. The findings indicate an increasing demand for testing-focused improvements within the Selenium ecosystem, as well as for enhanced integration with AI-driven development tools.
Information and Software Technology, Volume 194, Article 108077.
Citations: 0
AI-gile: Revisiting Agile principles in the era of AI
IF 4.3; Zone 2 (Computer Science); Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2026-02-12; DOI: 10.1016/j.infsof.2026.108073
Salvatore F. Pileggi, Gnana Bharathy

Context:

Agile principles have progressively emerged in Software Engineering (SE) to address the intrinsic complexity of modern software development. The recent rise of AI is having a disruptive impact across disciplines and domains, including SE.

Objective:

We propose an integrated, concise analysis of the Agile principles and their possible evolution in the era of AI. We refer to this revised version as AI-gile to emphasize the evolving nature of the Agile philosophy, in this specific case as a response to a widely disruptive technological advance.

Methods:

We adopted a hybrid analysis method that combines a traditional critical analysis of major contributions in the literature with AI-driven topic modeling. This combination yields a seamless semi-qualitative approach that enhances consistency within a naturally interdisciplinary context.

Results:

Holistically, our analysis confirms an on-going and potentially radical evolution of Software Engineering as a discipline in response to the incorporation of AI at different levels. Our synthesis indicates that AI’s influence on Agile principles clusters into three recurring patterns: (a) human/AI role re-balancing, (b) collaboration management, and (c) principle erosion and regeneration.

Conclusion:

In a context of fast technology evolution, the most significant challenge of AI-gile seems to be related to the rising trade-off between the capability to fully exploit the increasing AI potentialities and the preservation of the human-centric nature of the Agile philosophy.
Information and Software Technology, Volume 194, Article 108073.
Citations: 0
SEDMR: A spreadsheet error detection approach based on metamorphic testing
IF 4.3; Zone 2 (Computer Science); Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2026-02-11; DOI: 10.1016/j.infsof.2026.108074
Bo Yang, Xuelian Hu, Jiatong Ma

Context:

Spreadsheets are ubiquitous in modern society, used in various domains such as corporate finance, market analysis, and personal budgeting. Despite their widespread use, spreadsheets are prone to errors, which can lead to significant issues, including financial losses. While numerous efforts have been made to detect these errors, existing methods often fail to provide a comprehensive, systematic approach to error detection.

Objective:

The primary goal of this research is to develop a more effective and systematic method for detecting errors in spreadsheets. This involves exploring a new research direction using Metamorphic Testing (MT), specifically focusing on defining Metamorphic Relations (MRs) between formula cells and reference cells in spreadsheets.

Methods:

We introduce a novel spreadsheet error detection approach called SEDMR (Spreadsheet Error Detection using MRs). SEDMR leverages expanded and tailored MRs that consider both cell data values and formula patterns, offering a comprehensive framework for error detection. The approach includes systematic steps such as extracting spreadsheet cell arrays, selecting MRs, detecting errors, and marking error cells.
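
To make the idea of a metamorphic relation between a formula cell and its reference cells more concrete, the following Python sketch may help. It is not SEDMR's implementation: the relation used here (raising one referenced cell by `delta` should raise a SUM-like result by `delta`), the dict-based sheet, and the helper names `sum_of` and `violates_additive_mr` are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

Sheet = Dict[str, float]
Formula = Callable[[Sheet], float]


def sum_of(refs: List[str]) -> Formula:
    """Build a formula that sums the referenced cells (hypothetical helper)."""
    return lambda sheet: sum(sheet[r] for r in refs)


def violates_additive_mr(formula: Formula, sheet: Sheet, refs: List[str],
                         delta: float = 1.0) -> bool:
    """Check an assumed MR: raising one referenced cell by delta should
    raise a SUM-like formula's result by exactly delta."""
    base = formula(sheet)
    for ref in refs:
        mutated = dict(sheet)          # follow-up input: copy with one cell changed
        mutated[ref] += delta
        if abs(formula(mutated) - (base + delta)) > 1e-9:
            return True                # relation broken: the formula cell is suspicious
    return False


sheet = {"A1": 10.0, "A2": 20.0, "A3": 30.0}
declared_refs = ["A1", "A2", "A3"]     # cells the formula is expected to cover

correct = sum_of(["A1", "A2", "A3"])   # behaves like =SUM(A1:A3)
faulty = sum_of(["A1", "A2"])          # range error: A3 was left out

flags = {
    "correct": violates_additive_mr(correct, sheet, declared_refs),
    "faulty": violates_additive_mr(faulty, sheet, declared_refs),
}
print(flags)
```

The faulty formula is flagged without knowing the expected total in advance, which is the general appeal of metamorphic testing when no test oracle is available.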

Results:

SEDMR’s effectiveness is demonstrated through large-scale experiments on extensive datasets, including the EUSES and Enron corpora. The experiments, conducted on 160 spreadsheets, show that SEDMR outperforms state-of-the-art techniques in terms of error detection effectiveness.

Conclusions:

The SEDMR approach significantly enhances the efficiency and accuracy of spreadsheet error detection compared to traditional ad-hoc methods. Its scalability and applicability to a wide range of spreadsheet structures make it a versatile solution for real-world scenarios. This research not only provides a robust method for spreadsheet error detection but also opens up promising avenues for future research in this area.
Information and Software Technology, Volume 194, Article 108074.
Citations: 0
SRSPSQL: A dual-stage Text-to-SQL framework with semantic rewriting and schema pruning
IF 4.3; Zone 2 (Computer Science); Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2026-02-09; DOI: 10.1016/j.infsof.2026.108064
Jianjun Lei, Zhichao Li, Ying Wang
Large language models (LLMs) have advanced the Text-to-SQL task significantly. However, existing methods still face challenges in handling natural language ambiguity, limited dataset content, and redundant database schema information. In this paper, we propose SRSPSQL, a dual-stage Text-to-SQL framework that integrates question semantic reconstruction and dynamic schema pruning to eliminate natural language ambiguity and minimize schema search space. Specifically, we design a novel semantic reconstruction mechanism leveraging LLMs’ summarization capabilities and concrete database content to eliminate ambiguities and align questions with database structures. Moreover, we develop a full-schema prompt strategy that adopts masked similarity matching for context-aware example selection and integrates database content information to minimize value errors. Furthermore, we propose a concise schema generation method that combines structured SQL information with dynamic schema pruning to focus on key schema and examples, thereby significantly improving the accuracy and efficiency of SQL generation. Experimental results demonstrate that SRSPSQL consistently outperforms baselines, achieving 86.4% and 62.13% execution accuracies (EX) on Spider and Bird, with robust gains on Spider-Syn and Spider-Realistic. Ablation studies further validate the individual and synergistic contributions of each component.
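
As a rough illustration of what schema pruning means in a Text-to-SQL pipeline, the sketch below keeps only the tables and columns that share a word with the question. This token-overlap heuristic is a simple stand-in chosen for illustration, not the dynamic pruning method of SRSPSQL; the function names and the toy schema are assumptions.

```python
import re
from typing import Dict, List, Set


def tokens(text: str) -> Set[str]:
    """Lower-case word tokens; identifiers are split on underscores."""
    return set(re.findall(r"[a-z0-9]+", text.lower().replace("_", " ")))


def prune_schema(question: str, schema: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Keep tables whose name or columns share a token with the question."""
    q = tokens(question)
    pruned: Dict[str, List[str]] = {}
    for table, columns in schema.items():
        hits = [c for c in columns if tokens(c) & q]
        if hits or tokens(table) & q:
            # Keep matching columns, or all columns if only the table name matched.
            pruned[table] = hits or list(columns)
    return pruned


# Toy schema, hypothetical and not taken from the paper's benchmarks.
schema = {
    "singer": ["singer_id", "name", "country", "age"],
    "concert": ["concert_id", "venue", "year"],
    "stadium": ["stadium_id", "capacity", "location"],
}
question = "What is the average age of each singer by country?"
pruned = prune_schema(question, schema)
print(pruned)
```

A smaller schema in the prompt leaves the model fewer irrelevant tables and columns to choose from, which is the motivation the abstract gives for pruning.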
Information and Software Technology, Volume 194, Article 108064.
Citations: 0
Exploring and characterizing cross-service defects in microservice projects
IF 4.3; Zone 2 (Computer Science); Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2026-02-09; DOI: 10.1016/j.infsof.2026.108063
Chaochao Wu, Ran Mo, Wei Ding, Haopeng Song, Zengyang Li, Yutao Ma

Context:

In recent years, microservice architectures have attracted much attention as an approach to developing scalable and maintainable software systems. However, there is no systematic research on cross-service defects in microservice projects.

Objective:

To fill this gap, we conducted an empirical study of microservice projects from GitHub. This study aims to understand the types of cross-service defects in microservice projects, and the strategies used to fix them, through a comprehensive commit analysis.

Method:

Starting with an initial set of 3551 microservice-related repositories, we rigorously filtered and verified the projects to identify 78 projects that fully implemented microservice architectures. We extracted and analyzed 176 commits from these projects to gain insights into their cross-service defect patterns and fixing strategies.

Results:

Through analysis, we identified 10 types of cross-service defects, including service configuration defects, service build and dependency defects, service functionality defects, service communication defects, and service deployment defects. Based on specific commits, we summarized 13 fixing strategies, such as dependency, configuration, and method modifications, that are used in microservice projects to fix cross-service defects.

Conclusion:

We believe our study could contribute to the understanding of microservice development and offer a foundation for future research in this rapidly evolving field.
Information and Software Technology, Volume 194, Article 108063.
Citations: 0
DeepFlaky: Deep hybrid representation learning for flaky test prediction
IF 4.3; Zone 2 (Computer Science); Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2026-02-06; DOI: 10.1016/j.infsof.2026.108070
Jingwen Cai, Yan Lei, Zheyuan Li, Chunyan Liu, Huan Xie, Zhenyu Wu

Context:

Flaky tests, which exhibit non-deterministic pass or fail behavior without code changes, pose a significant challenge to continuous integration. Existing detection approaches rely either on manually engineered features or on deep semantic representations, but fail to integrate the two effectively. These approaches still suffer from limited representational quality and lack mechanisms to enhance the feature space, leading to weakly discriminative representations.

Objective:

To overcome the limitations of existing approaches that rely on a single source of features and lack mechanisms to learn robust representations from limited labeled data, we propose DeepFlaky. It not only fuses the interpretability of expert features with the contextual richness of deep semantic representations but also introduces contrastive learning to explicitly structure the feature space, thereby learning more discriminative and generalizable representations for flaky test prediction.

Method:

DeepFlaky extracts expert features and semantic representations from test code using CodeBERT. A contrastive learning module optimizes hybrid representations by aggregating positive samples and separating negative ones. The enhanced representation is finally fed into five classifiers for prediction.
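
The phrase "aggregating positive samples and separating negative ones" can be illustrated with a generic supervised contrastive loss. The sketch below is an assumption-laden toy: it uses a standard cosine-similarity formulation with a temperature, hand-written 2-D vectors instead of CodeBERT embeddings, and a function name (`sup_con_loss`) invented here. The exact loss used by DeepFlaky is not given in the abstract and may differ.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def sup_con_loss(embeddings: List[Sequence[float]], labels: List[int],
                 temperature: float = 0.5) -> float:
    """Generic supervised contrastive loss, averaged over anchors with a positive."""
    n = len(embeddings)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without a same-label partner contributes nothing
        denom = sum(math.exp(cosine(embeddings[i], embeddings[k]) / temperature)
                    for k in range(n) if k != i)
        log_probs = [
            math.log(math.exp(cosine(embeddings[i], embeddings[j]) / temperature) / denom)
            for j in positives
        ]
        total += -sum(log_probs) / len(positives)
        anchors += 1
    return total / anchors


labels = [1, 1, 0, 0]  # hypothetical: 1 = flaky, 0 = stable
clustered = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]  # same labels sit together
mixed = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]      # same labels sit apart

clustered_loss = sup_con_loss(clustered, labels)
mixed_loss = sup_con_loss(mixed, labels)
print(clustered_loss < mixed_loss)
```

Minimizing such a loss pushes same-label representations together and different-label ones apart, which is the property a downstream classifier benefits from.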

Results:

Experiments on a widely used public dataset demonstrate that DeepFlaky outperforms state-of-the-art baseline approaches. It achieves a Precision, Recall, and F1-score of 93% each, with XGBoost performing best among the classifiers on these metrics. Ablation studies confirm the individual contributions of expert features, semantic features, and contrastive learning. Overlap analysis shows that DeepFlaky identifies 67 to 282 additional unique flaky tests missed by prior models, including 41 flaky tests not detected by any of the baseline models. Furthermore, DeepFlaky exhibits relatively better generalization in cross-project scenarios compared to other methods.
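
For readers less familiar with the reported metrics, this short sketch shows how precision, recall, and F1 are computed for the positive (flaky) class. The label vectors are hypothetical and are not data from the paper; the function name is likewise an assumption.

```python
from typing import List, Tuple


def precision_recall_f1(y_true: List[int], y_pred: List[int]) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for the positive class (1 = flaky)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical labels, not data from the paper: 1 = flaky, 0 = stable.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Because F1 is the harmonic mean of precision and recall, the three values coincide whenever precision equals recall, as in the 93% figures reported above.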

Conclusion:

DeepFlaky demonstrates the complementarity between expert features and semantic features, as well as the effectiveness of contrastive learning in enhancing the quality of code representation. It offers a robust and generalized solution that can effectively capture complex flaky patterns.
Information and Software Technology, Volume 193, Article 108070.
Citations: 0
AI Act high-risk AI compliance challenge and industry impact: A multiple case study
IF 4.3; Zone 2 (Computer Science); Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2026-02-06; DOI: 10.1016/j.infsof.2026.108067
Matthias Wagner, Qunying Song, Markus Borg, Emelie Engström, Michal Lysek

Context:

The AI Act marks a new chapter in AI governance, affecting companies around the world seeking to offer their services within the European Union. This study focuses on the comprehensive AI Act requirements set out for high-risk AI systems.

Objectives:

We explored the perceived compliance challenge for the AI Act’s high-risk requirements and associated contributing factors; the AI Act’s impact on industry in terms of positive and negative side effects; and the sentiment of industry practitioners towards the AI Act’s codes of conduct for the voluntary application of the act’s high-risk AI requirements.

Method:

We conducted a multiple case study of six case companies, supplemented by three independent experts, with 16 respondents in total.

Results:

A ranking represents the different perceived levels of challenge for each AI Act high-risk requirement. The ranking is led by the following requirements, starting with the most challenging one: (1) data quality and governance (Art 10), (2) accuracy, robustness, and cybersecurity (Art 15), (3) risk and quality management system (Art 9, 17), and (4) transparency (Art 13). Moreover, four contributing factors emerged that impact the perceived compliance challenge: (1) industry and brand values, (2) existing regulatory environment, (3) AI maturity level and proficiency, and (4) company size. We identified several general key factors for the AI Act’s impact on industry and outlined strong arguments both for and against the AI Act voiced by practitioners. The sentiment towards the AI Act’s codes of conduct turned out to be very positive.

Conclusion:

This study offers a valuable primary research contribution to software engineering, where the state-of-the-art remains short of compliance-oriented studies with a focus on the operationalization of certain AI Act aspects. Future work is advised to develop artifacts facilitating AI Act operationalization and to validate them with industry partners.
Information and Software Technology, vol. 194, Article 108067.
Citations: 0
SAME: A Similarity Analysis Method for Evaluating Metamorphic Relations in Testing AI systems
IF 4.3 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-02-05 | DOI: 10.1016/j.infsof.2026.108069
Zhehao Li, Jinfu Chen, Xiaodong Xie, Tsong Yueh Chen, Saihua Cai

Context:

Metamorphic Testing (MT) is a technique that employs Metamorphic Relations (MRs) to act as test oracles to verify software correctness. An effective evaluation of the similarity among MRs is crucial for selecting a diverse and effective set of MRs to maximize fault detection capabilities. Yet, in AI domains, the similarity among MRs remains underexplored. Existing MR similarity methods are complex and do not generalize well to AI tasks.

Objective:

We propose SAME (Similarity Analysis Method for Evaluating MRs), a lightweight clustering-based approach to quantify MR similarity and reveal hidden structures, offering a fresh perspective for AI system testing.

Method:

SAME uses hierarchical clustering to analyze MR-violation vectors produced by MT. We evaluate it on three application domains: (1) object detection (COCO2017 with YOLOv7), (2) multimodal captioning (Flickr30k/COCO2017 with BLIP2, GIT, OFA), and (3) natural-language inference (SNLI/MNLI/ANLI with BERT/RoBERTa).
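
The clustering step described above can be illustrated with a minimal sketch. The abstract only states that hierarchical clustering is applied to MR-violation vectors; the Hamming distance, the average-linkage rule, the `threshold` parameter, the function names, and the toy vectors below are all assumptions made for illustration, not the authors' implementation.

```python
from itertools import combinations


def hamming(u, v):
    """Fraction of test inputs on which two MRs disagree about violation."""
    return sum(a != b for a, b in zip(u, v)) / len(u)


def cluster_mrs(vectors, threshold):
    """Average-linkage agglomerative clustering of MR-violation vectors.

    vectors: dict mapping an MR name to a 0/1 list (1 = MR violated on that input).
    threshold: stop merging once the closest pair of clusters is farther apart.
    """
    clusters = [[name] for name in sorted(vectors)]

    def linkage(c1, c2):
        # Average pairwise distance between members of the two clusters.
        total = sum(hamming(vectors[a], vectors[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[i], clusters[j]) > threshold:
            break
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical violation vectors over six test inputs (not data from the paper).
violations = {
    "crop":    [1, 1, 0, 0, 1, 0],
    "stretch": [1, 1, 0, 0, 1, 0],
    "rotate":  [0, 0, 1, 1, 0, 0],
    "flip":    [0, 0, 1, 1, 0, 1],
}
groups = cluster_mrs(violations, threshold=0.25)
print(groups)
```

In this toy data, MRs that are violated on the same inputs end up in the same group, which is the sense in which a cluster can flag candidate redundant MRs.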

Results:

Compared to the existing method, SAME increases the Adjusted Rand Index (ARI) by up to 0.45 (0.95 vs 0.50) and the Adjusted Mutual Information (AMI) by up to 0.32 (0.91 vs 0.59). Pairwise ARI across six dataset-model pairs, and comparison across different clustering algorithms, show that SAME’s groupings are insensitive to data or algorithm choice but strongly reflect model behavior.
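
For readers unfamiliar with the Adjusted Rand Index reported above, the following sketch computes it from the standard pair-counting formula. The function name and the example labelings are illustrative assumptions and are unrelated to the paper's data.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same items")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to disagree on
    # Pairs co-clustered in both labelings (contingency-table cells).
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs co-clustered within each labeling separately.
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)  # agreement expected by chance
    maximum = (sum_a + sum_b) / 2          # agreement of a perfect match
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (sum_cells - expected) / (maximum - expected)


# Hypothetical labelings of four items.
print(round(adjusted_rand_index([0, 0, 1, 1], [0, 0, 1, 2]), 4))  # 0.5714
```

A value of 1.0 means the two groupings are identical up to relabeling, while values near 0 indicate agreement no better than chance.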

Conclusion:

SAME is simpler than prior methods yet more accurate. Its violation-vector design faithfully captures model behavior, exposing redundant MRs (e.g., high-parameter crop ≈ stretch) and subtle bias links (e.g., “elderly” ↔ “wealthy”). These insights provide a new analytical lens for MT results.
Information and Software Technology, vol. 194, Article 108069.
Citations: 0
Integrating formal methods and automated tools for DO-178C compliance in UAV software
IF 4.3 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-02-05 | DOI: 10.1016/j.infsof.2026.108068
Rim Zrelli, Henrique Amaral Misson, Sorelle Kamkuimo, Maroua Ben Attia, Abdo Shabah, Felipe Gohring de Magalhaes, Gabriela Nicolescu

Context:

The development of software for Unmanned Aerial Vehicles (UAVs) is governed by stringent safety-critical regulations, with DO-178C serving as the primary standard for airborne systems. Ensuring compliance requires extensive verification, validation, and traceability across the software lifecycle, which becomes increasingly complex for autonomous and adaptive UAV functions.

Objective:

This paper proposes an integrated methodology for regulatory compliance checking that combines formal methods with automated verification tools to generate certification-ready evidence under DO-178C.

Methods:

Formal methods are applied at multiple levels: Alloy is used for requirements consistency checking, the SPIN model checker for architectural interaction properties, and bounded model checking for code-level analysis. These techniques are integrated with automated toolchains that provide continuous bidirectional traceability, structural coverage analysis, and automated test execution across the software lifecycle. The approach is evaluated on a Design Assurance Level (DAL) B UAV Collision Avoidance System.
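
To give a feel for what a bounded check does, here is a toy explicit-state sketch that searches for a safety violation within a fixed number of steps. It is a simplified stand-in, not SPIN, Alloy, or a SAT-based bounded model checker, and the collision-avoidance model, its state encoding, and all names are hypothetical.

```python
from collections import deque


def bounded_check(initial, successors, is_safe, bound):
    """Breadth-first search of states reachable within `bound` steps.

    Returns a shortest path to an unsafe state, or None if no unsafe
    state is reachable within the bound.
    """
    frontier = deque([(initial, (initial,))])
    seen = {initial}
    while frontier:
        state, path = frontier.popleft()
        if not is_safe(state):
            return list(path)
        if len(path) - 1 == bound:
            continue  # depth limit reached; do not expand further
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + (nxt,)))
    return None


def successors(state):
    """Hypothetical model: state is (distance_to_intruder, mode)."""
    distance, mode = state
    if mode == "cruise":
        # The intruder closes in; the controller may fail to react this step.
        yield (distance - 1, "cruise")
        if distance <= 2:
            yield (distance, "avoid")
    else:
        yield (min(distance + 1, 5), "avoid")


def is_safe(state):
    return state[0] > 0  # safety property: separation is never lost


print(bounded_check((3, "cruise"), successors, is_safe, bound=2))
print(bounded_check((3, "cruise"), successors, is_safe, bound=3))
```

The second call returns a counterexample trace because the toy controller is allowed to skip its reaction; a bound that is too small can miss such a trace, which is the usual caveat with bounded analysis.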

Results:

The case study demonstrates the production of certification-ready evidence bundles, including closed bidirectional traceability from system requirements through high- and low-level software requirements to source code and tests, formal proof summaries linked to requirements, and decision coverage reports on mission-critical logic. Results indicate that tightly integrating formal analysis with automated verification improves early defect detection, reduces manual evidence assembly, and strengthens the auditability of DO-178C compliance.
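
Closed bidirectional traceability can be pictured as a check that every item in one lifecycle layer links to at least one item in the adjacent layers, in both directions. The sketch below assumes a simple chain of layers; the layer names, identifiers, and data layout are hypothetical and do not reflect the toolchain used in the paper.

```python
def trace_gaps(layers, links):
    """Report items that lack a link to an adjacent layer.

    layers: ordered list of (layer_name, set_of_item_ids), top layer first.
    links: set of (upper_id, lower_id) pairs between adjacent layers.
    """
    gaps = []
    for (upper_name, upper), (lower_name, lower) in zip(layers, layers[1:]):
        # Keep only links that connect these two layers.
        relevant = [(up, lo) for up, lo in links if up in upper and lo in lower]
        covered_upper = {up for up, _ in relevant}
        covered_lower = {lo for _, lo in relevant}
        gaps += [(upper_name, item, "no downstream link")
                 for item in sorted(upper - covered_upper)]
        gaps += [(lower_name, item, "no upstream link")
                 for item in sorted(lower - covered_lower)]
    return gaps


# Hypothetical artifacts for illustration only.
layers = [
    ("system", {"SYS-1"}),
    ("high-level", {"HLR-1", "HLR-2"}),
    ("low-level", {"LLR-1", "LLR-2"}),
    ("code", {"avoid.py"}),
]
links = {
    ("SYS-1", "HLR-1"),
    ("HLR-1", "LLR-1"),
    ("HLR-2", "LLR-2"),
    ("LLR-1", "avoid.py"),
}
for gap in trace_gaps(layers, links):
    print(gap)
```

An empty result would correspond to a closed trace; here the sketch flags one requirement with no parent and one with no implementing code.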

Conclusions:

The combined use of formal methods and automation offers a scalable pathway for UAVs and other autonomous systems to achieve compliance with evolving safety regulations. The findings highlight that integrating regulatory compliance checking into development processes can simultaneously enhance rigor and efficiency, providing a model for certifiable autonomy software in civil airspace.
Information and Software Technology, vol. 194, Article 108068.
Citations: 0
Empirical insights on interoperability in digital twins: Challenges & LCIM perspectives
IF 4.3 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-02-05 | DOI: 10.1016/j.infsof.2026.108048
Sarthak Acharya, Yueqiang Xu, Nirnaya Tripathi, Tero Päivärinta, Arif Ali Khan

Context:

Digital twins (DTs) have become integral in diverse cyber–physical production systems (CPPS), enabling dynamic interactions between physical entities and their digital counterparts. Yet their integration into such complex ecosystems raises substantial interoperability challenges. While these challenges and associated frameworks for DTs have been extensively theorized in scholarly literature, there are limited empirical investigations that capture industrial perspectives on these aspects.

Objective:

This exploratory study aims to empirically investigate real-world interoperability challenges in DT deployments and assess the relevance of a layered interoperability framework as a structured approach to address these issues.

Methods:

We addressed this gap by interviewing 12 DT practitioners from 10 companies across five European countries. Interviewees were guided through two reference models: a simplified view of the DT ecosystem and a layered framework based on the Levels of Conceptual Interoperability Model (LCIM). The collected data were thematically synthesized and systematically mapped using Grounded Theory (GT)-based open coding. Sentiment analysis was used as an illustrative complement to the qualitative findings by capturing expert attitudes towards the LCIM for DTs.

Results:

The analysis identified 26 practical interoperability challenges, thematically synthesized into 7 categories. Experts’ perspectives on the LCIM for DTs revealed two key outcomes: 4 drivers of the open- and closed-ended nature of interoperability layers, and 4 value propositions highlighting the framework’s relevance for DT deployments. Further, the identified challenge categories are mapped across layers, highlighting the dichotomy between open-source and proprietary approaches and the need for dynamism and ecosystem-oriented interoperability.

Conclusions:

This work advances empirical and theoretical understandings of DT interoperability within CPPS. Our findings contribute to addressing practical interoperability challenges, provide empirical values for the layered model in cross-disciplinary approaches to DT integration, and offer guidance for researchers and practitioners. Future work could validate and adapt the layered approach through domain-specific DT applications to assess its effectiveness in digital transformation initiatives.
Information and Software Technology, vol. 193, Article 108048.
Citations: 0