Pub Date: 2025-10-01 | Epub Date: 2025-04-04 | DOI: 10.1016/j.scico.2025.103312
Stefan Hallerstede, John Hatcliff
Title: A mechanized semantics for component-based systems in the HAMR AADL runtime | Journal: Science of Computer Programming, vol. 245, Article 103312
Many visions for model-driven component-based development emphasize models as the “single source of truth” by which different forms of analysis, specification, verification, and code generation are integrated. Such visions depend strongly on a clear modeling language semantics that provides different tools and stakeholders with a common understanding of a model's meaning. In this paper, we report on a mechanization in the Isabelle theorem prover of a formal semantics for key aspects of the SAE standard AADL modeling language. A primary goal of this semantics is to support component-oriented contract specification and verification as well as code generation implemented in the HAMR AADL model-driven development tool chain. We provide formal definitions of run-time system state, execution steps, reachable states, and property verification. Use of the mechanization for real-world applications is supported by automated HAMR translation from AADL models into the Isabelle specifications. In addition to general verification support, we define well-formedness properties and associated proofs for models, system states, and traces that are automatically proven for HAMR-generated Isabelle models.
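The notions of execution step, reachable state, and property verification named in the abstract can be illustrated with a generic transition-system sketch. This is not the Isabelle mechanization or any HAMR API: the function names, the toy port-queue system, and its capacity are all assumptions chosen for illustration.

```python
from collections import deque

def reachable_states(initial, step):
    """Explore every state reachable from the `initial` states.

    `step(state)` is a hypothetical successor function returning the
    states that one execution step can lead to.
    """
    seen = set(initial)
    frontier = deque(initial)
    while frontier:
        state = frontier.popleft()
        for successor in step(state):
            if successor not in seen:
                seen.add(successor)
                frontier.append(successor)
    return seen

def holds_in_all_reachable(initial, step, prop):
    """A state property is verified if it holds in every reachable state."""
    return all(prop(s) for s in reachable_states(initial, step))

# Toy system (an assumption, not taken from the paper): the state is the
# number of messages waiting in a port queue with a fixed capacity.
CAPACITY = 2

def step(queue_length):
    successors = set()
    if queue_length < CAPACITY:
        successors.add(queue_length + 1)  # a sender enqueues a message
    if queue_length > 0:
        successors.add(queue_length - 1)  # a receiver dequeues a message
    return successors

within_capacity = holds_in_all_reachable({0}, step, lambda q: 0 <= q <= CAPACITY)
```

Here `within_capacity` is true because the step function never produces a queue length outside the bound. A theorem prover establishes such facts by proof rather than by enumeration, so the sketch only conveys the shape of the definitions.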
Pub Date: 2025-10-01 | Epub Date: 2025-04-24 | DOI: 10.1016/j.scico.2025.103318
Magalí González, Luca Cernuzzi
Title: Analyzing MoWebA's adaptability across architectures | Journal: Science of Computer Programming, vol. 245, Article 103318
Specifying architectural properties remains an open issue in Model-Driven Web Engineering when addressing portability, adaptability, and evolution. Several model-driven web methods and methodologies consider extensions that enrich the Platform Independent Model (PIM) or the Platform Specific Model (PSM) to include elements of a particular platform or architecture. However, the degree of independence of the model is critical to achieving adaptability and evolution. Therefore, some authors have proposed a new layer, the Architecture Specific Model (ASM), to model architectural properties. Some evidence suggests that adopting an ASM as an intermediate stage between PIM and PSM facilitates the evolution of web systems. This paper focuses on the ASM of MoWebA (Model Oriented Web Approach) and analyzes its impact on adaptability across different architectural styles. A case study is presented that examines this impact by extending MoWebA to three different architectures. In these extensions, we analyze the degree of adaptability of MoWebA and the automation of PIM-ASM, as well as the degree of independence of the PIM metamodel. In addition, through three types of questionnaires and other quantitative data, the study analyzes user satisfaction with the adoption of MoWebA.
Pub Date: 2025-10-01 | Epub Date: 2025-04-04 | DOI: 10.1016/j.scico.2025.103314
Sascha Lehmann, Antje Rogalla, Maximilian Neidhardt, Alexander Schlaefer, Sibylle Schupp
Title: A provably safe controller for the needle-steering problem using online strategy synthesis | Journal: Science of Computer Programming, vol. 245, Article 103314
Autonomous systems often address complex planning problems, which require both prospective action planning and retrospective data evaluation. Timed games could help, since they automatically synthesize provably correct strategies that solve those planning problems; yet they assume a static model of the environment, which is unrealistic for autonomous systems. However, many autonomous systems are control applications, which employ sensors that capture system behavior at run time and can thus compensate for incomplete knowledge at modeling time. In this paper, we propose an online strategy synthesis that combines offline strategy synthesis with sensor information about the current state of the physical world, and derives formal safety guarantees while reacting and adapting to environment changes. We formalize the needle-steering problem from medical robotics, i.e., the problem of navigating a (flexible and beveled) needle through partially unknown tissue towards a target without damaging its surroundings, by interpreting it as a timed game. Further, we introduce a new representation of its environment through different region types that determine the acceptance of action plans and trigger local correcting actions. We present an algorithm for online strategy synthesis and, for the given region representation, formally prove that it returns safe online controllers. The algorithm is implemented on top of Uppaal Stratego. For two medical applications of needle steering, peridural anesthesia and predefined needle trajectory, we demonstrate the necessity of online adjustments in a series of simulations with various degrees of initial knowledge about the environment, and show that the overhead of online synthesis remains practical.
Pub Date: 2025-09-01 | Epub Date: 2025-02-28 | DOI: 10.1016/j.scico.2025.103286
Fatih Sarikoc
Title: Model checking and verification of a rail-side protection system | Journal: Science of Computer Programming, vol. 244, Article 103286
Ensuring safe operation of level crossings is crucial in preventing fatalities and injuries at rail-road intersections. This study presents a Petri Net model of a level-crossing system that includes an obstacle detection system and a dedicated protection signal to enhance safety. The model was analyzed using linear temporal logic and computation tree logic, with formal verification performed via the TINA and TAPAAL tools. The model-checking experiments show that the proposed Petri Net model complies with the given safety criteria. Additionally, Fault Tree Analysis (FTA) was employed to systematically assess system-level risk, highlighting potential failure points and their impact. FTA provides a quantitative risk evaluation that complements the formal verification process, ensuring a thorough system reliability assessment.
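To make the idea of checking a safety criterion on a Petri Net concrete, here is a minimal hand-rolled sketch. It does not use TINA or TAPAAL, and the net itself (places, transitions, and the safety condition) is a hypothetical toy, not the model from the study. It assumes a 1-safe net, so a marking is simply the set of marked places.

```python
def reachable_markings(initial, transitions):
    """Enumerate all markings reachable from `initial` in a 1-safe net.

    `transitions` maps a name to a (pre, post) pair of frozensets of places.
    """
    seen = {initial}
    stack = [initial]
    while stack:
        marking = stack.pop()
        for pre, post in transitions.values():
            if pre <= marking:  # transition is enabled
                successor = (marking - pre) | post  # fire it
                if successor not in seen:
                    seen.add(successor)
                    stack.append(successor)
    return seen

# Hypothetical toy crossing: one train, one gate.
INITIAL = frozenset({"approaching", "gate_open"})
TRANSITIONS = {
    "close_gate": (frozenset({"approaching", "gate_open"}),
                   frozenset({"approaching", "gate_closed"})),
    "enter": (frozenset({"approaching", "gate_closed"}),
              frozenset({"in_crossing", "gate_closed"})),
    "leave": (frozenset({"in_crossing"}), frozenset({"left"})),
    "open_gate": (frozenset({"left", "gate_closed"}),
                  frozenset({"left", "gate_open"})),
}

def is_safe(marking):
    """Safety criterion: the train is never in the crossing with the gate open."""
    return not ({"in_crossing", "gate_open"} <= marking)

crossing_is_safe = all(is_safe(m) for m in reachable_markings(INITIAL, TRANSITIONS))
```

Checking `is_safe` on every reachable marking corresponds to an invariant of the form "always not (in_crossing and gate_open)". Dedicated model checkers handle far larger state spaces and richer temporal properties than this exhaustive enumeration.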
Pub Date: 2025-09-01 | DOI: 10.1016/j.scico.2025.103301
Ralf Stemmer, Ishan Saxena, Lukas Panneke, Dominik Grundt, Anna Austel, Eike Möhlmann, Bernd Westphal
Title: Runtime monitoring of complex scenario-based requirements for autonomous driving functions | Journal: Science of Computer Programming, vol. 244, Article 103301
Autonomous driving functions (ADFs) are becoming more relevant and complex. Still, their safe and correct operation must be guaranteed. Scenario-based testing, i.e., confronting the ADF under test with other traffic in specified scenarios, is an established approach for the validation and verification of ADFs, but tests currently often consider only simple technical requirements. Safe and correct operation is not only the absence of collisions but involves complex spatio-temporal requirements on the externally observable, functional driving behaviour in traffic.
In this work, we consider Traffic Sequence Charts (TSCs) as a visual formalism for the specification of complex, functional ADF requirements. We define a monitoring problem for TSCs and finite, sampled observations of ADF behaviour and discuss how monitor verdicts contribute to requirements testing. We show that such monitors can effectively be constructed for realistic requirements and that they can contribute to efficient testing by assessing ADF behaviour at runtime.
Pub Date: 2025-09-01 | Epub Date: 2025-03-19 | DOI: 10.1016/j.scico.2025.103300
Alice Miller, Bernd Porr, Ivaylo Valkov, Douglas Fraser, Daumantas Pagojus
Title: Model checking with memoisation for fast overtaking planning | Journal: Science of Computer Programming, vol. 244, Article 103300
Fast and reliable trajectory planning is a key requirement of autonomous vehicles. In this paper we introduce a novel technique for planning the route of an autonomous vehicle on a straight, traffic-heavy rural road using the SPIN model checker. We show how we can combine SPIN's ability to identify paths violating temporal properties with sensor information from a 3D Unity simulation of an autonomous vehicle, to plan and perform consecutive overtaking manoeuvres. This involves discretising the sensory information and combining multiple sequential SPIN models with a Linear-time Temporal Logic specification to generate an error path. This path provides the autonomous vehicle with an action plan. The entire process is fast (using no precomputed data) and the action plan is tailored for individual scenarios. Our experiments demonstrate that the simulated autonomous vehicle implementing our approach can drive a median of 37 km and overtake a median of 187 vehicles before experiencing a collision - which is usually caused by inaccuracies in the sensory system. We also describe a memoisation approach which helps to mitigate one of the drawbacks of our approach - the cost of model compilation. Our novel approach demonstrates a potentially powerful future tool for efficient trajectory planning for autonomous vehicles.
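The idea of turning an error path into an action plan can be sketched without SPIN: state the claim "the vehicle never reaches the goal", search for a path that violates it, and read the violating path as the plan. The example below is a plain breadth-first search in Python, not Promela or SPIN. The two-lane grid, the action names, and the static blocked cells are simplifying assumptions, since the paper deals with sensed traffic and sequential models.

```python
from collections import deque

def find_error_path(start, goal, blocked, length):
    """Search for a path violating the claim "the goal is never reached".

    States are (lane, position) cells on a two-lane road of `length` cells.
    Returns the list of actions along a violating path, or None if the
    claim holds (no plan exists).
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (lane, pos), plan = queue.popleft()
        if (lane, pos) == goal:
            return plan
        # Each action advances one cell; "switch" also changes lane.
        for action, lane_change in (("keep", 0), ("switch", 1)):
            nxt = (lane ^ lane_change, pos + 1)
            if nxt[1] < length and nxt not in blocked and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, plan + [action]))
    return None

def simulate(start, plan):
    """Replay a plan and return the visited cells, for checking it."""
    lane, pos = start
    cells = [(lane, pos)]
    for action in plan:
        lane = lane ^ 1 if action == "switch" else lane
        pos += 1
        cells.append((lane, pos))
    return cells

# A slower vehicle occupies cell 2 of the ego lane (lane 0).
plan = find_error_path(start=(0, 0), goal=(0, 4), blocked={(0, 2)}, length=5)
```

When the search returns `None`, the claim holds and no overtaking plan exists for that snapshot of the road, which in this toy setting happens when both lanes are blocked at the same position.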
Pub Date: 2025-09-01 | Epub Date: 2025-03-13 | DOI: 10.1016/j.scico.2025.103298
Patrick Rodrigo Da Silva, Érica Ferreira de Souza, Glaúcia Braga e Silva, Giovani Volnei Meinerz, Katia Romero Felizardo
Title: Applying graph-based knowledge representation to capture insights from discussions forum in software engineering | Journal: Science of Computer Programming, vol. 244, Article 103298
Context: In the social web paradigm, discussion forums facilitate knowledge transfer among developers. However, manually finding helpful information in discussions on a particular topic is complex, making it a significant challenge for knowledge management. Objective: The objective of this paper is to explore the representation of knowledge supported by graphs generated from discussion forums in Software Engineering. Method: First, graphs were built from the discussion topics of the Stack Overflow forum. A visual analysis and an analysis of the thematic relevance of the graphs were performed. Next, the generated graphs were evaluated through interviews with software industry professionals to obtain a practical view of the study. Finally, a preliminary practical analysis was conducted to evaluate the use of graphs, visually representing the content of Stack Overflow discussion topics, as a complementary resource for understanding the discussion text. Results: The graphs produced interesting results both in the visual analyses and in the analyses from the professionals' perspective. Conclusion: Using graphs generated from discussion forums can help the software industry identify useful information and new trends. Graphs can be considered a complementary resource for understanding the discussion text. We expect that, with the results achieved in this study, software organizations and researchers in the area can focus their efforts on approaches that, through visual representation of knowledge, aid the understanding of the large textual bases of discussion forums such as Stack Overflow and allow helpful information to be inferred that assists organizations in project decisions.
Pub Date: 2025-09-01 | Epub Date: 2025-02-26 | DOI: 10.1016/j.scico.2025.103287
Yue Wu, Zhentao He, Qingnan Wang, Yihui Wang, Huaxiao Liu
Title: LayoutOptimizer: A layout rendering performance optimizer for Android application | Journal: Science of Computer Programming, vol. 244, Article 103287
The perceived delays experienced by Android application users can have a significant impact on their overall experience. Slow UI rendering is a major cause of perceived delays, and poorly implemented UI layouts can considerably degrade rendering performance. One important way to optimize a layout's rendering performance is to resolve its hierarchy issues. Layout performance analysis tools exist, but they lack effective solutions for fixing hierarchy issues, which limits how much they can help developers resolve them. In this paper, we propose a novel approach called LayoutOptimizer, which can automatically identify and solve two common hierarchy issues in Android layouts. The evaluation based on 31 layouts from real-world apps demonstrates that LayoutOptimizer can effectively fix the two common hierarchy issues while ensuring visual consistency.
Pub Date: 2025-09-01 | DOI: 10.1016/j.scico.2025.103296
Xutong Liu, Yufei Zhou, Yutian Tang, Junyan Qian, Yuming Zhou
Title: Human-in-the-loop online just-in-time software defect prediction: What have we achieved and what do we still miss? | Journal: Science of Computer Programming, vol. 244, Article 103296
Background. Online Just-In-Time Software Defect Prediction (O-JIT-SDP) employs an online model to predict whether a new software change will introduce a bug. Previous studies have neglected the interaction between Software Quality Assurance (SQA) personnel and the model, potentially missing opportunities to refine prediction accuracy through human feedback. Problem. A recent study introduced the first Human-In-The-Loop (HITL) O-JIT-SDP framework, called HumLa, which integrates SQA staff feedback to boost the prediction performance of O-JIT-SDP but does not account for inspection time. However, upon a thorough revisit of HumLa, we find that while certain aspects of the HITL O-JIT-SDP system appear feasible in ideal conditions, they prove impractical in real-world contexts. Objective. We aim to reformulate HITL O-JIT-SDP to incorporate aspects that are crucial for practical application yet currently absent. Method. We propose four crucial enhancements to facilitate practical application of HITL O-JIT-SDP. First, we advocate for the use of observed labels rather than ground-truth labels to evaluate online classifiers in real-world settings. Second, we suggest refraining from utilizing the entire data stream for normalizing features of each new instance, as was done in HumLa. Third, we propose incorporating non-zero SQA inspection time into the formulation of HITL O-JIT-SDP. Fourth, we introduce real-time statistical classifier comparison into the HITL system. Result. Our replication shows that the performance of HumLa evaluated under a practical scenario deviates significantly from the performance originally reported under an ideal experimental scenario, potentially diminishing the promise of HITL O-JIT-SDP. Furthermore, with our enhanced HITL O-JIT-SDP framework, we revisit a fundamental question in O-JIT-SDP: the benefits of HITL integration. Our experimental findings demonstrate that HITL not only enhances O-JIT-SDP when SQA feedback arrives faster than Bug-Fixing Commit (BFC) feedback (by providing training commits with superior label quality in less time) but also improves O-JIT-SDP even when SQA feedback delay equals that of BFC feedback (by consistently delivering training commits with improved label quality). The real-time statistical analysis reveals that HITL approaches generally outperform non-HITL O-JIT-SDP approaches by a statistically significant margin. Conclusion. Our work bolsters model evaluation credibility and holds the potential to substantially enhance the value of HITL O-JIT-SDP for industrial applications.
Pub Date: 2025-09-01 Epub Date: 2025-02-28 DOI: 10.1016/j.scico.2025.103288
Zezhong Chen, Yuxin Deng, Wenjie Du
Assurance cases can be used to argue for the safety of products in safety engineering. In safety-critical areas, the construction of assurance cases is indispensable. We introduce the Trustworthiness Derivation Tree Analyzer (Trusta), a tool designed to enhance the development and evaluation of assurance cases by integrating formal methods and large language models (LLMs). The tool incorporates a Prolog interpreter and solvers like Z3 and MONA to handle various constraint types, enhancing the precision and efficiency of assurance case assessment. Beyond traditional formal methods, Trusta harnesses the power of LLMs including ChatGPT-3.5, ChatGPT-4, and PaLM 2, assisting humans in the development of assurance cases and the writing of formal constraints. Our evaluation, through qualitative and quantitative analyses, shows Trusta's impact on improving assurance case quality and efficiency. Trusta enables junior engineers to reach the skill level of experienced safety experts, narrowing the expertise gap and greatly benefiting those with limited experience. Case studies, including automated guided vehicles (AGVs), demonstrate Trusta's effectiveness in identifying subtle issues and improving the overall trustworthiness of complex systems.
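To make the idea of checking an assurance case against formal constraints more concrete, here is a minimal sketch under stated assumptions. It models an assurance case as a tree of claims whose leaves carry machine-checkable constraints over evidence values, and evaluates it bottom-up. Everything in it is hypothetical: the `Node` type, the `failing_claims` function, the AGV claims, and the numbers are invented for illustration, and plain Python predicates stand in for the Prolog, Z3, and MONA back ends that the abstract mentions. It does not reflect Trusta's actual data model or API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Evidence = Dict[str, float]


@dataclass
class Node:
    """A claim in a hypothetical assurance-case tree."""
    claim: str
    # Optional check over the evidence; None means "no constraint here".
    constraint: Optional[Callable[[Evidence], bool]] = None
    children: List["Node"] = field(default_factory=list)


def failing_claims(node: Node, evidence: Evidence) -> List[str]:
    """Return unsupported claims: failed constraints and all their ancestors."""
    failures: List[str] = []
    for child in node.children:
        failures.extend(failing_claims(child, evidence))
    own_ok = node.constraint is None or node.constraint(evidence)
    if failures or not own_ok:
        # A claim is unsupported if its own check fails or any sub-claim fails.
        failures.append(node.claim)
    return failures


# Made-up example for an automated guided vehicle (AGV).
root = Node(
    claim="AGV is acceptably safe",
    children=[
        Node(
            claim="AGV stops before obstacle",
            constraint=lambda e: e["braking_distance_m"] < e["sensor_range_m"],
        ),
        Node(
            claim="Speed stays within limit",
            constraint=lambda e: e["max_speed_mps"] <= 1.5,
        ),
    ],
)

evidence = {"braking_distance_m": 1.2, "sensor_range_m": 3.0, "max_speed_mps": 2.0}
unsupported = failing_claims(root, evidence)
```

With these invented values the speed constraint fails, so `unsupported` lists that leaf and then the root claim that depends on it. A real tool would hand such constraints to a solver rather than evaluate them on fixed numbers.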
Title: Trusta: Reasoning about assurance cases with formal methods and large language models. Authors: Zezhong Chen, Yuxin Deng, Wenjie Du. Science of Computer Programming, vol. 244, Article 103288, published 2025-09-01. DOI: 10.1016/j.scico.2025.103288