Pub Date: 2026-01-01 | Epub Date: 2025-06-13 | DOI: 10.1016/j.scico.2025.103351
Cecilia Manzino, Gonzalo de Latorre
This paper presents a domain-specific language embedded in Haskell (an EDSL) for enforcing the information-flow property Delimited Release. To build this language we use Haskell extensions that enable a form of dependently-typed programming.
Considering the effort it takes to build a language from scratch, we decided to provide an information-flow security language as an EDSL, using the infrastructure of the host language to support it.
The decision to use Haskell as the implementation language was driven by its powerful type system that makes it possible to encode the security type system of the embedded language at the type level, as well as by its nature as a general-purpose language.
The implementation follows an approach in which the type of the abstract syntax of the embedded language is decorated with security type information. In this way, well-typed programs correspond to secure programs, and verifying the security invariants of a program reduces to type-checking.
The embedded security language is designed to be easy to use. We illustrate its use through three examples: an electronic purchase, secure reading of database information, and a password checker.
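The paper enforces the flow discipline statically, at Haskell's type level. Purely as a dynamic analogue of the underlying idea (all names here are hypothetical, not the paper's API), a two-point security lattice and a runtime flow check can be sketched as:

```python
# Hypothetical sketch: a two-point security lattice (L below H) with a
# runtime flow check. The paper performs this check statically at
# Haskell's type level; this is only a dynamic illustration of the idea.
L, H = "L", "H"

def flows_to(src: str, dst: str) -> bool:
    """Information may flow from src to dst iff src is below dst (L is below H)."""
    return src == L or dst == H

def assign(target_label: str, source_label: str, value):
    """Permit an assignment only when the lattice allows the flow."""
    if not flows_to(source_label, target_label):
        raise ValueError(f"illegal information flow: {source_label} to {target_label}")
    return value
```

In the EDSL this check never runs: a program with an illegal flow simply fails to type-check.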
Title: A Haskell-embedded DSL for secure information-flow. Science of Computer Programming, vol. 247, Article 103351.
Pub Date: 2026-01-01 | Epub Date: 2025-06-09 | DOI: 10.1016/j.scico.2025.103349
Giuseppe De Palma , Saverio Giallorenzo , Jacopo Mauro , Matteo Trentin , Gianluigi Zavattaro
The Function-as-a-Service (FaaS) paradigm offers a serverless approach that abstracts the management of underlying infrastructure, enabling developers to focus on application logic. However, leveraging infrastructure-aware features can further optimize serverless performance.
We present a software prototype that enhances the Apache OpenWhisk serverless platform with a novel architecture incorporating tAPP (topology-aware Allocation Priority Policies), a declarative language designed for specifying topology-aware scheduling policies. Through a case study involving distributed data access across multiple cloud regions, we show that tAPP can significantly reduce latency and minimize performance variability compared to the standard OpenWhisk implementation.
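The abstract does not give tAPP's syntax; solely to illustrate the general idea of an allocation priority policy (all names below are hypothetical), a policy can be read as an ordered list of preferred zones from which a scheduler picks the first available worker:

```python
# Hypothetical sketch of a topology-aware allocation priority policy:
# a policy is an ordered list of preferred zones, and the scheduler picks
# the first available worker in the highest-priority zone. This illustrates
# the general idea only; it is not tAPP's actual syntax or semantics.

def schedule(policy, workers):
    """Return the first available worker matching the highest-priority zone."""
    for zone in policy:
        for worker in workers:
            if worker["zone"] == zone and worker["available"]:
                return worker["name"]
    return None  # no worker satisfies the policy

workers = [
    {"name": "w1", "zone": "eu-west", "available": False},
    {"name": "w2", "zone": "eu-west", "available": True},
    {"name": "w3", "zone": "us-east", "available": True},
]
```

Scheduling functions near their data in this way is what reduces cross-region latency in the case study.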
Title: tAPP OpenWhisk: A serverless platform for topology-aware allocation priority policies. Science of Computer Programming, vol. 247, Article 103349.
Pub Date: 2026-01-01 | Epub Date: 2025-06-30 | DOI: 10.1016/j.scico.2025.103356
Gianluca Aguzzi, Matteo Cerioni, Mirko Viroli
ScaFi-Blocks is a visual, low-code programming environment for designing and implementing swarm algorithms. Built on the ScaFi aggregate computing framework and the Blockly visual programming library, ScaFi-Blocks enables users to visually compose algorithms using intuitive building blocks, abstracting away the complexities of traditional swarm programming frameworks. This approach simplifies the development of collective behaviours for a wide range of swarm systems, including robot swarms, IoT device ensembles, and sensor networks, fostering broader accessibility and innovation within the field. This contribution bridges the gap between visual programming and textual code, lowering the barrier to entry for non-experts while promoting a deeper understanding of aggregate computing principles.
Title: Low-code design of collective systems with ScaFi-Blocks. Science of Computer Programming, vol. 247, Article 103356.
Pub Date: 2026-01-01 | Epub Date: 2025-07-16 | DOI: 10.1016/j.scico.2025.103361
Ruibang Liu, Hongming Liu, Guoqiang Li
Formal verification plays a critical role in contemporary computer science, offering mathematically rigorous methods to ensure the correctness, reliability, and security of programs. Loops, due to their complexity and uncertainty, are a major challenge in program verification. Loop invariants are often employed to abstract the properties of loops within a program, making the automatic generation of such invariants a pivotal challenge. Among the various methods, template-based frameworks grounded in Farkas' Lemma are recognized for their effectiveness in generating tight invariants in the realm of constraint solving. Recent advances have identified the conversion from conjunctive normal form (CNF) to disjunctive normal form (DNF) as a major bottleneck, leading to a combinatorial explosion. In this study, we introduce an optimized algorithm that addresses the combinatorial explosion by trading space for time efficiency. Our approach employs two key strategies, divide-and-conquer and pruning, to boost speed. First, we apply a divide-and-conquer strategy to decompose a complex problem into smaller, more manageable subproblems that can be solved quickly and in parallel. Second, we apply a pruning strategy that guides the depth-first search to avoid unnecessary checks. These improvements maintain accuracy while speeding up the analysis. We constructed a small dataset to showcase the superiority of our tool, which achieved an average speedup of 9.27x on this dataset. The experiments demonstrate that our method provides significant acceleration while maintaining accuracy, and indicate that our approach outperforms state-of-the-art methods.
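The bottleneck and the pruning idea can be made concrete with a toy sketch (this is an illustration of CNF-to-DNF expansion with contradiction pruning, not the paper's algorithm): naive expansion picks one literal per clause, so the DNF size is the product of the clause sizes, while a depth-first expansion can abandon any branch whose partial conjunct is already contradictory.

```python
from itertools import product

# Sketch (not the paper's algorithm): a CNF formula is a list of clauses,
# each clause a list of integer literals (-n denotes the negation of n).
# Naive CNF -> DNF expansion picks one literal per clause, so its size is
# the product of the clause sizes: the combinatorial explosion noted above.

def dnf_naive(cnf):
    return [set(choice) for choice in product(*cnf)]

def dnf_pruned(cnf):
    """Depth-first expansion that abandons a branch as soon as the
    partial conjunct contains complementary literals (x and -x)."""
    out = []
    def dfs(i, partial):
        if i == len(cnf):
            out.append(set(partial))
            return
        for lit in cnf[i]:
            if -lit in partial:        # contradiction: prune this branch
                continue
            dfs(i + 1, partial | {lit})
    dfs(0, frozenset())
    return out
```

On `[[1, -2], [2, 3]]` the naive expansion yields four conjuncts, one of which contains the contradiction 2 and -2; pruning skips it without ever completing that branch.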
Title: Optimization of Farkas' Lemma-based linear invariant generation using divide-and-conquer with pruning. Science of Computer Programming, vol. 247, Article 103361.
Pub Date: 2025-12-01 | Epub Date: 2025-05-27 | DOI: 10.1016/j.scico.2025.103338
Caterina Urban , Pavle Subotić , Filip Drobnjaković
Data leakage is a well-known problem in machine learning which occurs when the training and testing datasets are not independent. This phenomenon leads to unreliable, overly optimistic accuracy estimates at training time, followed by a significant drop in performance when models are deployed in the real world. This can be dangerous, notably when models are used for risk prediction in high-stakes applications. In this paper, we propose an abstract interpretation-based static analysis to prove the absence of data leakage at development time, long before model deployment and even before model training. We implemented it in the NBLyzer framework and we demonstrate its performance and precision on 2111 Jupyter notebooks from the Kaggle competition platform.
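One common concrete form of the dependence described above is simple row overlap between the two splits. A minimal dynamic check for that case can be sketched as follows; note this is only an illustration, whereas NBLyzer's analysis is static and proves the absence of leakage without running the notebook:

```python
# Minimal sketch (not NBLyzer's static analysis): one concrete form of
# data leakage is row overlap between the training and test splits,
# which makes the two datasets statistically dependent.

def overlap_leakage(train_rows, test_rows):
    """Return the rows appearing in both splits (empty set = no overlap)."""
    return set(train_rows) & set(test_rows)

train = [(1.0, "a"), (2.0, "b"), (3.0, "c")]
test = [(3.0, "c"), (4.0, "d")]
```

Here the shared row `(3.0, "c")` would inflate test accuracy, since the model has already seen it during training.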
Title: Static analysis by abstract interpretation against data leakage in machine learning. Science of Computer Programming, vol. 246, Article 103338.
Pub Date: 2025-12-01 | Epub Date: 2025-05-20 | DOI: 10.1016/j.scico.2025.103333
Gabriel Aracena , Kyle Luster , Fabio Santos , Igor Steinmacher , Marco A. Gerosa
Effective prioritization of issue reports in software engineering helps to optimize resource allocation and information recovery. However, manual issue classification is laborious and lacks scalability. As an alternative, many open source software (OSS) projects employ automated processes for this task, yet this method often relies on large datasets for adequate training. Traditionally, machine learning techniques have been used for issue classification. More recently, large language models (LLMs) have emerged as powerful tools for addressing a range of software engineering challenges, including code and test generation, mapping new requirements to legacy software endpoints, and conducting code reviews. This research investigates an automated approach to issue classification based on LLMs. By leveraging the capabilities of such models, we aim to develop a robust system for prioritizing issue reports, reducing the need for extensive training data while maintaining classification reliability. In our research, we developed an LLM-based approach for accurately labeling issues by selecting two of the most prominent large language models. We then compared their performance across multiple datasets. Our findings show that GPT-4o achieved the best results in classifying issues from the NLBSE 2024 competition. Moreover, GPT-4o outperformed DeepSeek R1, achieving an F1 score 20% higher when both models were trained on the same dataset from the NLBSE 2023 competition, which was ten times larger than the NLBSE 2024 dataset. The fine-tuned GPT-4o model attained an average F1 score of 80.7%, while the fine-tuned DeepSeek R1 model achieved 59.33%. Increasing the dataset size did not improve the F1 score, reducing the dependence on massive datasets for building an efficient solution to issue classification. Notably, in individual repositories, some of our models predicted issue labels with a precision greater than 98%, a recall of 97%, and an F1 score of 90%.
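For reference, the F1 score used throughout these comparisons is the harmonic mean of precision and recall:

```python
def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because it is a harmonic mean, F1 is dragged toward the weaker of the two components, e.g. a classifier with precision 0.8 and recall 0.6 scores about 0.69 rather than the arithmetic mean of 0.7.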
Title: Applying large language models to issue classification: Revisiting with extended data and new models. Science of Computer Programming, vol. 246, Article 103333.
Pub Date: 2025-12-01 | Epub Date: 2025-05-13 | DOI: 10.1016/j.scico.2025.103323
Jan Roßbach, Michael Leuschel
Certified control makes it possible to use artificial intelligence for safety-critical systems. It is a runtime monitoring architecture that requires an AI to provide certificates for its decisions; these certificates can then be checked by a separate classical system. In this article, we evaluate the practicality of certified control for providing formal guarantees about an AI-based perception system. In this case study, we implemented a certificate checker that uses classical computer vision algorithms to verify railway signs detected by an AI object detection model. We have integrated this prototype with the popular object detection model YOLO. Performance metrics on generated data are promising for the use case, but further research is needed to generalize certified control to other tasks.
Title: Certified control for train sign classification. Science of Computer Programming, vol. 246, Article 103323.
Pub Date: 2025-12-01 | Epub Date: 2025-05-16 | DOI: 10.1016/j.scico.2025.103332
Jia Lee, Geunyeol Yu, Kyungmin Bae
Signal temporal logic (STL) is a temporal logic used to specify properties of continuous signals. STL has been widely applied in specifying, monitoring, and testing properties of hybrid systems that exhibit both discrete and continuous behavior. However, model checking techniques for hybrid systems have primarily been limited to invariant and reachability properties. This paper introduces bounded model checking algorithms and a tool for general STL properties of hybrid systems. Central to our technique is a novel logical foundation for STL, which includes: (i) syntactic separation, decomposing an STL formula into components, with each component depending exclusively on separate segments of a signal; (ii) signal discretization, ensuring a complete abstraction of a signal through a set of discrete elements; and (iii) ϵ-strengthening, reducing robust STL model checking to Boolean STL model checking. With this new foundation, the robust STL model checking problem can be reduced to the satisfiability of a first-order logic formula. This allows us to develop the first model checking algorithm for STL that can guarantee the correctness of STL properties up to given bound parameters and a robustness threshold, along with a pioneering bounded model checker for hybrid systems, called STLmc. We demonstrate the effectiveness of STLmc on a number of hybrid system case studies.
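The quantitative (robust) semantics that ϵ-strengthening builds on can be illustrated on a sampled signal for two simple formulas; note that STLmc itself handles full STL over continuous dynamics, which this toy discretization does not attempt:

```python
# Illustrative sketch of STL's quantitative (robust) semantics on a
# sampled signal, for two simple formulas. Positive robustness means the
# property holds, and its magnitude is the margin by which it holds.
# This is a toy discretization, not what the STLmc tool implements.

def rob_always_gt(signal, c):
    """Robustness of G(x > c): the worst-case margin over the signal."""
    return min(x - c for x in signal)

def rob_eventually_gt(signal, c):
    """Robustness of F(x > c): the best-case margin over the signal."""
    return max(x - c for x in signal)
```

A robustness threshold ϵ then asks not merely whether the property holds, but whether it holds with margin at least ϵ.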
Title: SMT-based robust model checking for signal temporal logic. Science of Computer Programming, vol. 246, Article 103332.
Pub Date: 2025-12-01 | Epub Date: 2025-05-07 | DOI: 10.1016/j.scico.2025.103320
Szymon Stradowski , Lech Madeyski
Context
“Your AI is impressive, but my code does not contain any bugs.” Such a statement from a software developer is the antithesis of a quality mindset and open communication. What makes it worse is that it is oftentimes true.
Objective
This paper analyses the impact of false positives and the related challenges in machine learning software defect prediction, and describes possible mitigations.
Methods
We propose a broad-picture perspective on dealing with false positive predictions based on what we learned from our industrial implementation study in Nokia 5G.
Results
Accordingly, we chart a new direction for transitioning defect prediction into well-established industry practice, and highlight potential emerging topics in predictive software engineering.
Conclusion
Increasing human buy-in and the business impact of predictions significantly improves the chances that future industry adoptions of software defect prediction will succeed.
Title: “Your AI is impressive, but my code does not have any bugs” managing false positives in industrial contexts. Science of Computer Programming, vol. 246, Article 103320.
Pub Date: 2025-12-01 | Epub Date: 2025-05-28 | DOI: 10.1016/j.scico.2025.103334
Aissam Belghiat
UML sequence diagrams provide a visual notation for modeling the behavior of object interactions in systems. They lack precise formal semantics due to the semi-formal nature of the UML language, which hinders their automated analysis and verification. Process algebras have been widely used in the literature to address such problems. The π-calculus is a well-known process algebra recognized for its rich theoretical foundation and high expressive power. It is also characterized by its capabilities in specifying interleaving and weak sequencing, which the OMG standard considers the default semantics for interaction diagrams. Thus, this paper presents a novel approach to formalizing UML 2 sequence diagrams by translating them into the π-calculus. The translation captures the semantics of their basic elements as well as their combined fragments. A compositional technique is adopted to gradually build the corresponding π-calculus specification, which allows straightforward induction/recursion over elements and their meaning, enabling reasoning about complex dynamic behaviors. This analysis can be performed with different tools, such as the Mobility Workbench (MWB) used in this study. The mapping provides a formal semantics as well as formal analysis and verification for UML 2 sequence diagrams according to the OMG standard. A case study illustrates the usefulness of the translation.
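The interleaving semantics at the heart of the translation can be illustrated with a toy enumeration: parallel composition of two sequential processes permits every shuffle of their action sequences that preserves each sequence's own order. This is a didactic sketch, not the paper's translation of sequence diagrams:

```python
# Toy illustration of interleaving semantics: enumerate all interleavings
# of two action sequences, as pi-calculus parallel composition of two
# sequential processes would permit. Each trace's internal order is kept.
# This is a didactic sketch, not the paper's translation.

def interleavings(p, q):
    """All shuffles of traces p and q preserving each trace's own order."""
    if not p:
        return [list(q)]
    if not q:
        return [list(p)]
    first = [[p[0]] + rest for rest in interleavings(p[1:], q)]
    second = [[q[0]] + rest for rest in interleavings(p, q[1:])]
    return first + second
```

For traces of lengths m and n there are C(m+n, m) interleavings, which is why interleaving semantics grows quickly and benefits from tool support such as the MWB.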
Title: Interleaving semantics and verification of UML 2 dynamic interactions using process algebra. Science of Computer Programming, vol. 246, Article 103334.