Pub Date : 2024-04-29DOI: 10.1007/s10270-024-01173-1
Xiao He, Yi Liu, Huihong He
Similarity-based model matching is the cornerstone of model versioning. It pairs model elements based on a distance metric (e.g., edit distance). However, calculating the distances between elements is computationally expensive. Consequently, a similarity-based matcher typically suffers from performance issues when the model size increases. Based on observation, there are two main causes of the high computation cost: (1) when matching an element p, the matcher calculates the distance between p and every candidate element q, despite the obvious dissimilarity between p and q; (2) the matcher always calculates the distance between p and (q'), even though q and (q') are very similar and the distance between p and q is already known. This paper proposes a dual-hash-based approach, which employs two entirely different hashing techniques—similarity-preserving hashing and integrity-based hashing—to accelerate similarity-based model matching. With similarity-preserving hashing, our approach can quickly filter out the dissimilar candidate elements according to their similarity hashes computed using our similarity-preserving hash function, which maps an element to a 64-bit binary hash. With integrity-based hashing, our approach can cache and reuse computed distance values by associating them with the checksums of model elements. We also propose an index structure to facilitate hash-based model matching. Our approach has been implemented and integrated into EMF Compare. We evaluate our approach using open-source Ecore and UML models. The results show that our hash function is effective in preserving the similarity between model elements and our matching approach reduces time costs by 20–88% while assuring the matching results consistent with EMF Compare.
基于相似性的模型匹配是模型版本化的基石。它根据距离度量(如编辑距离)对模型元素进行配对。然而,计算元素之间的距离需要耗费大量计算资源。因此,当模型规模增大时,基于相似性的匹配器通常会出现性能问题。根据观察,计算成本高的主要原因有两个:(1)当匹配一个元素 p 时,匹配器会计算 p 和每个候选元素 q 之间的距离,尽管 p 和 q 之间有明显的不相似性;(2)匹配器总是计算 p 和 (q')之间的距离,尽管 q 和 (q')非常相似,并且 p 和 q 之间的距离已经已知。本文提出了一种基于双重散列的方法,它采用了两种完全不同的散列技术--保存相似性散列和基于完整性散列--来加速基于相似性的模型匹配。通过相似性保留哈希算法,我们的方法可以根据使用我们的相似性保留哈希函数计算出的相似性哈希值,快速筛选出不相似的候选元素,该哈希函数将元素映射为 64 位二进制哈希值。通过基于完整性的哈希算法,我们的方法可以将计算出的距离值与模型元素的校验和相关联,从而实现缓存和重用。我们还提出了一种索引结构,以促进基于哈希值的模型匹配。我们的方法已经实现并集成到 EMF Compare 中。我们使用开源 Ecore 和 UML 模型对我们的方法进行了评估。结果表明,我们的哈希函数能有效地保持模型元素之间的相似性,我们的匹配方法能减少 20-88% 的时间成本,同时确保匹配结果与 EMF Compare 一致。
{"title":"Accelerating similarity-based model matching using dual hashing","authors":"Xiao He, Yi Liu, Huihong He","doi":"10.1007/s10270-024-01173-1","DOIUrl":"https://doi.org/10.1007/s10270-024-01173-1","url":null,"abstract":"<p>Similarity-based model matching is the cornerstone of model versioning. It pairs model elements based on a distance metric (e.g., edit distance). However, calculating the distances between elements is computationally expensive. Consequently, a similarity-based matcher typically suffers from performance issues when the model size increases. Based on observation, there are two main causes of the high computation cost: (1) when matching an element <i>p</i>, the matcher calculates the distance between <i>p</i> and every candidate element <i>q</i>, despite the obvious dissimilarity between <i>p</i> and <i>q</i>; (2) the matcher always calculates the distance between <i>p</i> and <span>(q')</span>, even though <i>q</i> and <span>(q')</span> are very similar and the distance between <i>p</i> and <i>q</i> is already known. This paper proposes a dual-hash-based approach, which employs two entirely different hashing techniques—similarity-preserving hashing and integrity-based hashing—to accelerate similarity-based model matching. With similarity-preserving hashing, our approach can quickly filter out the dissimilar candidate elements according to their similarity hashes computed using our similarity-preserving hash function, which maps an element to a 64-bit binary hash. With integrity-based hashing, our approach can cache and reuse computed distance values by associating them with the checksums of model elements. We also propose an index structure to facilitate hash-based model matching. Our approach has been implemented and integrated into EMF Compare. We evaluate our approach using open-source Ecore and UML models. The results show that our hash function is effective in preserving the similarity between model elements and our matching approach reduces time costs by 20–88% while assuring the matching results consistent with EMF Compare.</p>","PeriodicalId":49507,"journal":{"name":"Software and Systems Modeling","volume":"54 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140831788","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-04-18DOI: 10.1007/s10270-024-01167-z
Hossain Muhammad Muctadir, David A. Manrique Negrin, Raghavendran Gunasekaran, Loek Cleophas, Mark van den Brand, Boudewijn R. Haverkort
Digital twins (DTs) are often defined as a pairing of a physical entity and a corresponding virtual entity (VE), mimicking certain aspects of the former depending on the use-case. In recent years, this concept has facilitated numerous use-cases ranging from design to validation and predictive maintenance of large and small high-tech systems. Various heterogeneous cross-domain models are essential for such systems, and model-driven engineering plays a pivotal role in the design, development, and maintenance of these models. We believe models and model-driven engineering play a similarly crucial role in the context of a VE of a DT. Due to the rapidly growing popularity of DTs and their use in diverse domains and use-cases, the methodologies, tools, and practices for designing, developing, and maintaining the corresponding VEs differ vastly. To better understand these differences and similarities, we performed a semi-structured interview research with 19 professionals from industry and academia who are closely associated with different lifecycle stages of digital twins. In this paper, we present our analysis and findings from this study, which is based on seven research questions. In general, we identified an overall lack of uniformity in terms of the understanding of digital twins and used tools, techniques, and methodologies for the development and maintenance of the corresponding VEs. Furthermore, considering that digital twins are software intensive systems, we recognize a significant growth potential for adopting more software engineering practices, processes, and expertise in various stages of a digital twin’s lifecycle.
数字孪生(DT)通常被定义为物理实体与相应虚拟实体(VE)的配对,根据使用情况模仿前者的某些方面。近年来,这一概念促进了从大型和小型高科技系统的设计、验证和预测性维护等众多用例。对于这些系统来说,各种异构的跨领域模型是必不可少的,而模型驱动工程在这些模型的设计、开发和维护中发挥着举足轻重的作用。我们认为,模型和模型驱动工程在 DT 的虚拟环境中也发挥着同样重要的作用。由于 DT 及其在不同领域和用例中的应用迅速普及,设计、开发和维护相应 VE 的方法、工具和实践也大相径庭。为了更好地了解这些异同,我们对来自产业界和学术界的 19 位与数字孪生不同生命周期阶段密切相关的专业人士进行了半结构式访谈研究。在本文中,我们将根据七个研究问题,介绍我们的分析和研究结果。总的来说,我们发现在对数字孪生的理解以及开发和维护相应虚拟环境所使用的工具、技术和方法方面,总体上缺乏统一性。此外,考虑到数字孪生是软件密集型系统,我们认为在数字孪生生命周期的各个阶段采用更多的软件工程实践、流程和专业知识具有巨大的发展潜力。
{"title":"Current trends in digital twin development, maintenance, and operation: an interview study","authors":"Hossain Muhammad Muctadir, David A. Manrique Negrin, Raghavendran Gunasekaran, Loek Cleophas, Mark van den Brand, Boudewijn R. Haverkort","doi":"10.1007/s10270-024-01167-z","DOIUrl":"https://doi.org/10.1007/s10270-024-01167-z","url":null,"abstract":"<p>Digital twins (DTs) are often defined as a pairing of a physical entity and a corresponding virtual entity (VE), mimicking certain aspects of the former depending on the use-case. In recent years, this concept has facilitated numerous use-cases ranging from design to validation and predictive maintenance of large and small high-tech systems. Various heterogeneous cross-domain models are essential for such systems, and model-driven engineering plays a pivotal role in the design, development, and maintenance of these models. We believe models and model-driven engineering play a similarly crucial role in the context of a VE of a DT. Due to the rapidly growing popularity of DTs and their use in diverse domains and use-cases, the methodologies, tools, and practices for designing, developing, and maintaining the corresponding VEs differ vastly. To better understand these differences and similarities, we performed a semi-structured interview research with 19 professionals from industry and academia who are closely associated with different lifecycle stages of digital twins. In this paper, we present our analysis and findings from this study, which is based on seven research questions. In general, we identified an overall lack of uniformity in terms of the understanding of digital twins and used tools, techniques, and methodologies for the development and maintenance of the corresponding VEs. Furthermore, considering that digital twins are software intensive systems, we recognize a significant growth potential for adopting more software engineering practices, processes, and expertise in various stages of a digital twin’s lifecycle.\u0000</p>","PeriodicalId":49507,"journal":{"name":"Software and Systems Modeling","volume":"2023 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140628921","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-04-18DOI: 10.1007/s10270-024-01171-3
Giacomo Garaccione, Riccardo Coppola, Luca Ardito, Marco Torchiano
Gamification, the practice of using game elements in non-recreational contexts to increase user participation and interest, has been applied more and more throughout the years in software engineering. Business process modeling is a skill considered fundamental for software engineers, with Business Process Modeling Notation (BPMN) being one of the most commonly used notations for this discipline. BPMN modeling is present in different curricula in specific Master’s Degree courses related to software engineering but is usually seen by students as an unappealing or uninteresting activity. Gamification could potentially solve this issue, though there have been no relevant attempts in research yet. This paper aims at collecting preliminary insights on how gamification affects students’ motivation in performing BPMN modeling tasks and—as a consequence—their productivity and learning outcomes. A web application for modeling BPMN diagrams augmented with gamification mechanics such as feedback, rewards, progression, and penalization has been compared with a non-gamified version that provides more limited feedback in an experiment involving 200 students. The diagrams modeled by the students are collected and analyzed after the experiment. Students’ opinions are gathered using a post-experiment questionnaire. Statistical analysis showed that gamification leads students to check more often for their solutions’ correctness, increasing the semantic correctness of their diagrams, thus showing that it can improve students’ modeling skills. The results, however, are mixed and require additional experiments in the future to fine-tune the tool for actual classroom use.
{"title":"Gamification of business process modeling education: an experimental analysis","authors":"Giacomo Garaccione, Riccardo Coppola, Luca Ardito, Marco Torchiano","doi":"10.1007/s10270-024-01171-3","DOIUrl":"https://doi.org/10.1007/s10270-024-01171-3","url":null,"abstract":"<p>Gamification, the practice of using game elements in non-recreational contexts to increase user participation and interest, has been applied more and more throughout the years in software engineering. Business process modeling is a skill considered fundamental for software engineers, with Business Process Modeling Notation (BPMN) being one of the most commonly used notations for this discipline. BPMN modeling is present in different curricula in specific Master’s Degree courses related to software engineering but is usually seen by students as an unappealing or uninteresting activity. Gamification could potentially solve this issue, though there have been no relevant attempts in research yet. This paper aims at collecting preliminary insights on how gamification affects students’ motivation in performing BPMN modeling tasks and—as a consequence—their productivity and learning outcomes. A web application for modeling BPMN diagrams augmented with gamification mechanics such as feedback, rewards, progression, and penalization has been compared with a non-gamified version that provides more limited feedback in an experiment involving 200 students. The diagrams modeled by the students are collected and analyzed after the experiment. Students’ opinions are gathered using a post-experiment questionnaire. Statistical analysis showed that gamification leads students to check more often for their solutions’ correctness, increasing the semantic correctness of their diagrams, thus showing that it can improve students’ modeling skills. The results, however, are mixed and require additional experiments in the future to fine-tune the tool for actual classroom use.</p>","PeriodicalId":49507,"journal":{"name":"Software and Systems Modeling","volume":"13 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140629103","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Model transformations play an essential role in the model-driven engineering paradigm. However, writing a correct transformation requires the user to understand both what the transformation should do and how to enact that change in the transformation. This easily leads to syntactic and semantic errors in transformations which are time-consuming to locate and fix. In this article, we extend our evolutionary algorithm (EA) approach to automatically repair transformations containing multiple semantic errors. To prevent the fitness plateaus and the single fitness peak limitations from our previous work, we include the notion of social diversity as an objective for our EA to promote repair patches tackling errors that are less covered by the other patches of the population. We evaluate our approach on four ATL transformations, which have been mutated to contain up to five semantic errors simultaneously. Our evaluation shows that integrating social diversity when searching for repair patches improves the quality of those patches and speeds up the convergence even when up to five semantic errors are involved.
{"title":"Improving repair of semantic ATL errors using a social diversity metric","authors":"Zahra VaraminyBahnemiry, Jessie Galasso, Bentley Oakes, Houari Sahraoui","doi":"10.1007/s10270-024-01170-4","DOIUrl":"https://doi.org/10.1007/s10270-024-01170-4","url":null,"abstract":"<p>Model transformations play an essential role in the model-driven engineering paradigm. However, writing a correct transformation requires the user to understand both <i>what</i> the transformation should do and <i>how</i> to enact that change in the transformation. This easily leads to <i>syntactic</i> and <i>semantic</i> errors in transformations which are time-consuming to locate and fix. In this article, we extend our evolutionary algorithm (EA) approach to automatically repair transformations containing <i>multiple semantic errors</i>. To prevent the <i>fitness plateaus</i> and the <i>single fitness peak</i> limitations from our previous work, we include the notion of <i>social diversity</i> as an objective for our EA to promote repair patches tackling errors that are less covered by the other patches of the population. We evaluate our approach on four ATL transformations, which have been mutated to contain up to five semantic errors simultaneously. Our evaluation shows that integrating social diversity when searching for repair patches improves the quality of those patches and speeds up the convergence even when up to five semantic errors are involved.</p>","PeriodicalId":49507,"journal":{"name":"Software and Systems Modeling","volume":"70 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140628836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-04-16DOI: 10.1007/s10270-023-01147-9
Sotirios Liaskos, Saba Zarbaf, John Mylopoulos, Shakil M. Khan
Conceptual modeling plays a central role in planning, designing, developing and maintaining software-intensive systems. One of the goals of conceptual modeling is to enable clear communication among stakeholders involved in said activities. To achieve effective communication, conceptual models must be understood by different people in the same way. To support such shared understanding, conceptual modeling languages are defined, which introduce rules and constraints on how individual models can be built and how they are to be understood. A key component of a modeling language is an ontology, i.e., a set of concepts that modelers must use to describe world phenomena. Once the concepts are chosen, a visual and/or textual vocabulary is adopted for representing the concepts. However, the choices both of the concepts and of the vocabulary used to represent them may affect the quality of the language under consideration: some choices may promote shared understanding better than other choices. To allow evaluation and comparison of alternative choices, we present Peira, a framework for empirically measuring the domain and comprehensibility appropriateness of conceptual modeling language ontologies. Given a language ontology to be evaluated, the framework is based on observing how prospective language users classify domain content under the concepts put forth by said ontology. A set of metrics is then used to analyze the observations and identify and characterize possible issues that the choice of concepts or the way they are represented may have. The metrics are abstract in that they can be operationalized into concrete implementations tailored to specific data collection instruments or study objectives. We evaluate the framework by applying it to compare an existing language against an artificial one that is manufactured to exhibit specific issues. We then test if the metrics indeed detect these issues. We find that the framework does offer the expected indications, but that it also requires good understanding of the metrics prior to committing to interpretations of the observations.
{"title":"Empirically evaluating modeling language ontologies: the Peira framework","authors":"Sotirios Liaskos, Saba Zarbaf, John Mylopoulos, Shakil M. Khan","doi":"10.1007/s10270-023-01147-9","DOIUrl":"https://doi.org/10.1007/s10270-023-01147-9","url":null,"abstract":"<p>Conceptual modeling plays a central role in planning, designing, developing and maintaining software-intensive systems. One of the goals of conceptual modeling is to enable clear communication among stakeholders involved in said activities. To achieve effective communication, conceptual models must be understood by different people in the same way. To support such shared understanding, conceptual modeling languages are defined, which introduce rules and constraints on how individual models can be built and how they are to be understood. A key component of a modeling language is an ontology, i.e., a set of concepts that modelers must use to describe world phenomena. Once the concepts are chosen, a visual and/or textual vocabulary is adopted for representing the concepts. However, the choices both of the concepts and of the vocabulary used to represent them may affect the quality of the language under consideration: some choices may promote shared understanding better than other choices. To allow evaluation and comparison of alternative choices, we present Peira, a framework for empirically measuring the domain and comprehensibility appropriateness of conceptual modeling language ontologies. Given a language ontology to be evaluated, the framework is based on observing how prospective language users classify domain content under the concepts put forth by said ontology. A set of metrics is then used to analyze the observations and identify and characterize possible issues that the choice of concepts or the way they are represented may have. The metrics are abstract in that they can be operationalized into concrete implementations tailored to specific data collection instruments or study objectives. We evaluate the framework by applying it to compare an existing language against an artificial one that is manufactured to exhibit specific issues. We then test if the metrics indeed detect these issues. We find that the framework does offer the expected indications, but that it also requires good understanding of the metrics prior to committing to interpretations of the observations.</p>","PeriodicalId":49507,"journal":{"name":"Software and Systems Modeling","volume":"102 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140597273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-04-08DOI: 10.1007/s10270-024-01156-2
Marien R. Krouwel, Martin Op ’t Land, Henderik A. Proper
Due to hyper-competition, technological advancements, regulatory changes, etc, the conditions under which enterprises need to thrive become increasingly turbulent. Consequently, enterprise agility increasingly determines an enterprise’s chances for success. As software development often is a limiting factor in achieving enterprise agility, enterprise agility and software adaptability become increasingly intertwined. As a consequence, decisions that regard flexibility should not be left to software developers alone. By taking a Model-driven Software Development (MDSD) approach, starting from DEMO ontological enterprise models and explicit (enterprise) implementation design decisions, the aim of this research is to bridge the gap from enterprise agility to software adaptability, in such a way that software development is no longer a limiting factor in achieving enterprise agility. Low-code technology is a growing market trend that builds on MDSD concepts and claims to offer a high degree of software adaptability. Therefore, as a first step to show the potential benefits to use DEMO ontological enterprise models as a base for MDSD, this research shows the design of a mapping from DEMO models to Mendix for the (automated) creation of a low-code application that also intrinsically accommodates run-time implementation design decisions.
{"title":"From enterprise models to low-code applications: mapping DEMO to Mendix; illustrated in the social housing domain","authors":"Marien R. Krouwel, Martin Op ’t Land, Henderik A. Proper","doi":"10.1007/s10270-024-01156-2","DOIUrl":"https://doi.org/10.1007/s10270-024-01156-2","url":null,"abstract":"<p>Due to hyper-competition, technological advancements, regulatory changes, etc, the conditions under which enterprises need to thrive become increasingly turbulent. Consequently, enterprise agility increasingly determines an enterprise’s chances for success. As software development often is a limiting factor in achieving enterprise agility, enterprise agility and software adaptability become increasingly intertwined. As a consequence, decisions that regard flexibility should not be left to software developers alone. By taking a Model-driven Software Development (MDSD) approach, starting from DEMO ontological enterprise models and explicit (enterprise) implementation design decisions, the aim of this research is to bridge the gap from enterprise agility to software adaptability, in such a way that software development is no longer a limiting factor in achieving enterprise agility. Low-code technology is a growing market trend that builds on MDSD concepts and claims to offer a high degree of software adaptability. Therefore, as a first step to show the potential benefits to use DEMO ontological enterprise models as a base for MDSD, this research shows the design of a mapping from DEMO models to Mendix for the (automated) creation of a low-code application that also intrinsically accommodates run-time implementation design decisions.\u0000</p>","PeriodicalId":49507,"journal":{"name":"Software and Systems Modeling","volume":"8 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140596955","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-03-23DOI: 10.1007/s10270-024-01158-0
Edi Muškardin, Martin Tappler, Bernhard K. Aichernig, Ingo Pill
Black-box systems are inherently hard to verify. Many verification techniques, like model checking, require formal models as a basis. However, such models often do not exist, or they might be outdated. Active automata learning helps to address this issue by offering to automatically infer formal models from system interactions. Hence, automata learning has been receiving much attention in the verification community in recent years. This led to various efficiency improvements, paving the way toward industrial applications. Most research, however, has been focusing on deterministic systems. In this article, we present an approach to efficiently learn models of stochastic reactive systems. Our approach adapts (L^*)-based learning for Markov decision processes, which we improve and extend to stochastic Mealy machines. When compared with previous work, our evaluation demonstrates that the proposed optimizations and adaptations to stochastic Mealy machines can reduce learning costs by an order of magnitude while improving the accuracy of learned models.
{"title":"Active model learning of stochastic reactive systems (extended version)","authors":"Edi Muškardin, Martin Tappler, Bernhard K. Aichernig, Ingo Pill","doi":"10.1007/s10270-024-01158-0","DOIUrl":"https://doi.org/10.1007/s10270-024-01158-0","url":null,"abstract":"<p>Black-box systems are inherently hard to verify. Many verification techniques, like model checking, require formal models as a basis. However, such models often do not exist, or they might be outdated. Active automata learning helps to address this issue by offering to automatically infer formal models from system interactions. Hence, automata learning has been receiving much attention in the verification community in recent years. This led to various efficiency improvements, paving the way toward industrial applications. Most research, however, has been focusing on deterministic systems. In this article, we present an approach to efficiently learn models of stochastic reactive systems. Our approach adapts <span>(L^*)</span>-based learning for Markov decision processes, which we improve and extend to stochastic Mealy machines. When compared with previous work, our evaluation demonstrates that the proposed optimizations and adaptations to stochastic Mealy machines can reduce learning costs by an order of magnitude while improving the accuracy of learned models.</p>","PeriodicalId":49507,"journal":{"name":"Software and Systems Modeling","volume":"46 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140200814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-03-21DOI: 10.1007/s10270-024-01160-6
Bernhard K. Aichernig, Sandra König, Cristinel Mateis, Andrea Pferscher, Martin Tappler
In this article, we present a novel approach to learning finite automata with the help of recurrent neural networks. Our goal is not only to train a neural network that predicts the observable behavior of an automaton but also to learn its structure, including the set of states and transitions. In contrast to previous work, we constrain the training with a specific regularization term. We iteratively adapt the architecture to learn the minimal automaton, in the case where the number of states is unknown. We evaluate our approach with standard examples from the automata learning literature, but also include a case study of learning the finite-state models of real Bluetooth Low Energy protocol implementations. The results show that we can find an appropriate architecture to learn the correct minimal automata in all considered cases.
{"title":"Learning minimal automata with recurrent neural networks","authors":"Bernhard K. Aichernig, Sandra König, Cristinel Mateis, Andrea Pferscher, Martin Tappler","doi":"10.1007/s10270-024-01160-6","DOIUrl":"https://doi.org/10.1007/s10270-024-01160-6","url":null,"abstract":"<p>In this article, we present a novel approach to learning finite automata with the help of recurrent neural networks. Our goal is not only to train a neural network that predicts the observable behavior of an automaton but also to learn its structure, including the set of states and transitions. In contrast to previous work, we constrain the training with a specific regularization term. We iteratively adapt the architecture to learn the minimal automaton, in the case where the number of states is unknown. We evaluate our approach with standard examples from the automata learning literature, but also include a case study of learning the finite-state models of real Bluetooth Low Energy protocol implementations. The results show that we can find an appropriate architecture to learn the correct minimal automata in all considered cases.</p>","PeriodicalId":49507,"journal":{"name":"Software and Systems Modeling","volume":"102 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140200667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-03-20DOI: 10.1007/s10270-024-01159-z
Clemens Dubslaff, Patrick Wienhöft, Ansgar Fehnker
Recursive state machines (RSMs) are state-based models for procedural programs with wide-ranging applications in program verification and interprocedural analysis. Model-checking algorithms for RSMs and related formalisms have been intensively studied in the literature. In this article, we devise a new model-checking algorithm for RSMs and requirements in computation tree logic (CTL) that exploits the compositional structure of RSMs by ternary model checking in combination with a lazy evaluation scheme. Specifically, a procedural component is only analyzed in those cases in which it might influence the satisfaction of the CTL requirement. We implemented our model-checking algorithms and evaluate them on randomized scalability benchmarks and on an interprocedural data-flow analysis of Java programs, showing both practical applicability and significant speedups in comparison to state-of-the-art model-checking tools for procedural programs.
{"title":"Lazy model checking for recursive state machines","authors":"Clemens Dubslaff, Patrick Wienhöft, Ansgar Fehnker","doi":"10.1007/s10270-024-01159-z","DOIUrl":"https://doi.org/10.1007/s10270-024-01159-z","url":null,"abstract":"<p><i>Recursive state machines (RSMs)</i> are state-based models for procedural programs with wide-ranging applications in program verification and interprocedural analysis. Model-checking algorithms for RSMs and related formalisms have been intensively studied in the literature. In this article, we devise a new model-checking algorithm for RSMs and requirements in <i>computation tree logic (CTL)</i> that exploits the compositional structure of RSMs by ternary model checking in combination with a lazy evaluation scheme. Specifically, a procedural component is only analyzed in those cases in which it might influence the satisfaction of the CTL requirement. We implemented our model-checking algorithms and evaluate them on randomized scalability benchmarks and on an interprocedural data-flow analysis of <span>Java</span> programs, showing both practical applicability and significant speedups in comparison to state-of-the-art model-checking tools for procedural programs.</p>","PeriodicalId":49507,"journal":{"name":"Software and Systems Modeling","volume":"28 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140200745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-03-19DOI: 10.1007/s10270-024-01155-3
Jan Haltermann, Heike Wehrheim
Cooperative software validation aims at having verification and/or testing tools cooperate on the task of correctness checking. Cooperation involves the exchange of information about currently achieved results in the form of (verification) artifacts. These artifacts are typically specialized to the type of analysis performed by the tool, e.g., bounded model checking, abstract interpretation or symbolic execution, and hence require the definition of a new artifact for every new cooperation to be built. In this article, we introduce a unified artifact (called Generalized Information Exchange Automaton, short GIA) supporting the cooperation of over-approximating with under-approximating analyses. It provides information gathered by an analysis to its partner in a cooperation, independent of the type of analysis and usage context within software validation. We provide a formal definition of this artifact in the form of an automaton together with two operators on GIAs. The first operation reduces a program by excluding these parts, where the information that they are already processed is encoded in the GIA. The second operation combines partial results from two GIAs into a single on. We show that computed analysis results are never lost when connecting tools via these operations. To experimentally demonstrate the feasibility, we have implemented two such cooperation: one for verification and one for testing. The obtained results show the feasibility of our novel artifact in different contexts of cooperative software validation, in particular how the new artifact is able to overcome some drawbacks of existing artifacts.
合作式软件验证旨在让验证和/或测试工具合作完成正确性检查任务。合作包括以(验证)工件的形式交换关于当前所取得结果的信息。这些工件通常针对工具执行的分析类型而专门设计,例如有界模型检查、抽象解释或符号执行,因此需要为每一次新的合作定义新的工件。在本文中,我们介绍了一种统一的工具(称为 "广义信息交换自动机",简称 "GIA"),它支持过逼近分析与欠逼近分析之间的合作。它向合作中的伙伴提供分析所收集的信息,与软件验证中的分析类型和使用环境无关。我们以自动机的形式提供了这一工具的正式定义,同时还提供了 GIA 的两个运算符。第一种操作是通过排除这些部分来减少程序,这些部分已被处理的信息已在 GIA 中编码。第二种操作是将两个 GIA 的部分结果合并为一个单一结果。我们证明,通过这些操作连接工具时,计算出的分析结果绝不会丢失。为了在实验中证明其可行性,我们实施了两个这样的合作:一个用于验证,一个用于测试。所获得的结果表明,我们的新工具在不同的合作软件验证环境中都是可行的,特别是新工具如何能够克服现有工具的一些缺点。
{"title":"Exchanging information in cooperative software validation","authors":"Jan Haltermann, Heike Wehrheim","doi":"10.1007/s10270-024-01155-3","DOIUrl":"https://doi.org/10.1007/s10270-024-01155-3","url":null,"abstract":"<p>Cooperative software validation aims at having verification and/or testing tools <i>cooperate</i> on the task of correctness checking. Cooperation involves the exchange of information about currently achieved results in the form of (verification) artifacts. These artifacts are typically specialized to the type of analysis performed by the tool, e.g., bounded model checking, abstract interpretation or symbolic execution, and hence require the definition of a new artifact for every new cooperation to be built. In this article, we introduce a unified artifact (called Generalized Information Exchange Automaton, short GIA) supporting the cooperation of <i>over-approximating</i> with <i>under-approximating</i> analyses. It provides information gathered by an analysis to its partner in a cooperation, independent of the type of analysis and usage context within software validation. We provide a formal definition of this artifact in the form of an automaton together with two operators on GIAs. The first operation <i>reduces</i> a program by excluding these parts, where the information that they are already processed is encoded in the GIA. The second operation combines partial results from two GIAs into a single on. We show that computed analysis results are never lost when connecting tools via these operations. To experimentally demonstrate the feasibility, we have implemented two such cooperation: one for verification and one for testing. The obtained results show the feasibility of our novel artifact in different contexts of cooperative software validation, in particular how the new artifact is able to overcome some drawbacks of existing artifacts.</p>","PeriodicalId":49507,"journal":{"name":"Software and Systems Modeling","volume":"10 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140166983","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}