Sander J.J. Leemans, Fabrizio Maria Maggi, Marco Montali
Enjoy the silence: Analysis of stochastic Petri nets with silent transitions
Information Systems, Volume 124, Article 102383. Published 2024-04-04. DOI: 10.1016/j.is.2024.102383
Impact factor: 3.0 (JCR Q2, Computer Science, Information Systems)
https://www.sciencedirect.com/science/article/pii/S0306437924000413
Abstract
Capturing stochastic behaviour in business and work processes is essential to quantitatively understand how nondeterminism is resolved when taking decisions within the process. This is of special interest in process mining, where event data tracking the actual execution of the process are related to process models, and can then provide insights on frequencies and probabilities. Variants of stochastic Petri nets provide a natural formal basis to represent stochastic behaviour and support different data-driven and model-driven analysis tasks in this spectrum. However, when capturing business processes, such nets inherently need a labelling that maps transitions to activities. In many state-of-the-art process mining techniques, this labelling is not one-to-one, leading to unlabelled (silent) transitions and to activities represented by multiple transitions. At the same time, such nets have to be analysed in a finite-trace semantics, matching the fact that each process execution consists of finitely many steps. These two aspects impede the direct application of existing techniques for stochastic Petri nets, calling for a novel characterisation that incorporates labels and silent transitions in a finite-trace semantics. In this article, we provide such a characterisation starting from generalised stochastic Petri nets and obtaining the framework of labelled stochastic processes (LSPs). On top of this framework, we introduce different key analysis tasks on the traces of LSPs and their probabilities. We show that all such analysis tasks can be solved analytically, in particular by reducing them to a single method that combines automata-based techniques, used to single out the behaviour of interest within an LSP, with techniques based on absorbing Markov chains, used to reason about its probability. Finally, we demonstrate the significance of our approach in the context of stochastic conformance checking, illustrating practical feasibility through a proof-of-concept implementation and its application to different datasets.
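To make the combination of automata-based filtering and absorbing Markov chains more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy model given directly as a list of probabilistic transitions, where the label `None` stands for a silent transition, plus a per-state termination probability. The names `solve`, `trace_probability`, `transitions` and `termination` are hypothetical. The sketch pairs each model state with the number of trace symbols matched so far, treats "accept" and "reject" as absorbing states, and solves the linear system (I - Q) x = r for the probability of absorption in "accept".

```python
from fractions import Fraction

TAU = None  # assumed marker for a silent (unlabelled) transition


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination over exact fractions.

    Assumes the system is non-singular, i.e. the model has no livelock.
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [row[n] for row in m]


def trace_probability(transitions, termination, initial, trace):
    """Probability that a run from `initial` terminates having produced `trace`."""
    n = len(trace)
    states = sorted({initial}
                    | {s for s, _, _, _ in transitions}
                    | {t for _, _, _, t in transitions}
                    | set(termination))
    # Transient product states: (model state, number of trace symbols matched).
    pairs = [(s, i) for s in states for i in range(n + 1)]
    index = {pair: k for k, pair in enumerate(pairs)}
    size = len(pairs)
    # a = I - Q, where Q holds moves between transient product states;
    # r holds the one-step probabilities of reaching the "accept" state.
    a = [[Fraction(int(row == col)) for col in range(size)] for row in range(size)]
    r = [Fraction(0)] * size
    for (s, i), row in index.items():
        if i == n:  # terminating counts only once the whole trace is matched
            r[row] += termination.get(s, Fraction(0))
    for s, label, p, t in transitions:
        for i in range(n + 1):
            row = index[(s, i)]
            if label is TAU:  # silent step: the trace position is unchanged
                a[row][index[(t, i)]] -= p
            elif i < n and label == trace[i]:  # visible step matching the trace
                a[row][index[(t, i + 1)]] -= p
            # any other visible step leads to "reject" and contributes nothing
    return solve(a, r)[index[(initial, 0)]]


# Toy model: state 0 has a silent self-loop (1/2) or emits "a" (1/2) to state 1;
# state 1 emits "b" and stays (1/3) or terminates (2/3).
transitions = [
    (0, TAU, Fraction(1, 2), 0),
    (0, "a", Fraction(1, 2), 1),
    (1, "b", Fraction(1, 3), 1),
]
termination = {1: Fraction(2, 3)}

print(trace_probability(transitions, termination, 0, ("a",)))      # 2/3
print(trace_probability(transitions, termination, 0, ("a", "b")))  # 2/9
```

The silent self-loop in state 0 can be taken arbitrarily often, so the probability of a trace sums over infinitely many runs. Solving the linear system accounts for all of them at once, which is why an absorbing-chain formulation is convenient here. Exact fractions are used only to keep the toy results readable.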
About the journal:
Information systems are the software and hardware systems that support data-intensive applications. The journal Information Systems publishes articles concerning the design and implementation of languages, data models, process models, algorithms, software and hardware for information systems.
Subject areas include data management issues as presented in the principal international database conferences (e.g., ACM SIGMOD/PODS, VLDB, ICDE and ICDT/EDBT) as well as data-related issues from the fields of data mining/machine learning, information retrieval coordinated with structured data, internet and cloud data management, business process management, web semantics, visual and audio information systems, scientific computing, and data science. Implementation papers having to do with massively parallel data management, fault tolerance in practice, and special purpose hardware for data-intensive systems are also welcome. Manuscripts from application domains, such as urban informatics, social and natural science, and Internet of Things, are also welcome. All papers should highlight innovative solutions to data management problems, such as new data models and performance enhancements, and show how those innovations contribute to the goals of the application.