An empirical evaluation of unsupervised event log abstraction techniques in process mining

IF 3 | Zone 2, Computer Science | Q2, COMPUTER SCIENCE, INFORMATION SYSTEMS | Information Systems | Pub Date: 2023-11-25 | DOI: 10.1016/j.is.2023.102320
Greg Van Houdt, Massimiliano de Leoni, Niels Martin, Benoît Depaire
Information Systems, Volume 121, Article 102320 (Journal Article)
https://www.sciencedirect.com/science/article/pii/S0306437923001564
Citations: 0

Abstract

These days, businesses keep track of more and more data in their information systems. Moreover, this data is becoming more fine-grained than ever, tracking clicks and mutations in databases at the lowest level possible. Faced with such data, process discovery often struggles to produce comprehensible models and instead returns spaghetti-like models. Such fine-grained models do not fit the business user’s mental model of the process under investigation. To tackle this, event log abstraction (ELA) techniques can transform the underlying event log to a higher granularity level. However, insights into the performance of these techniques are lacking in the literature, as results are based only on small-scale experiments and are often inconclusive. Against this background, this paper evaluates state-of-the-art abstraction techniques on 400 event logs. Results show that ELA sacrifices fitness for precision, but complexity reductions depend heavily on the ELA technique used. This study also illustrates the importance of a larger-scale experiment, as sub-sampling of results leads to contradictory conclusions.
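
To make the idea of lifting an event log to a higher granularity level concrete, here is a minimal Python sketch. It is not one of the techniques evaluated in the paper: the event labels, the `ABSTRACTION` mapping, and the helper names are all hypothetical, and the mapping is hard-coded, whereas an unsupervised ELA technique would have to derive such a grouping from the log itself.

```python
from itertools import groupby

# Hypothetical mapping from low-level event labels to high-level activities.
# Assumption for illustration only: an unsupervised technique would infer
# this grouping from the event log instead of receiving it by hand.
ABSTRACTION = {
    "open_form": "Register order",
    "type_name": "Register order",
    "click_submit": "Register order",
    "open_invoice": "Handle payment",
    "click_pay": "Handle payment",
}


def abstract_trace(trace, mapping):
    """Relabel each event, then merge consecutive events with the same label."""
    relabeled = [mapping.get(event, event) for event in trace]
    return [label for label, _ in groupby(relabeled)]


def abstract_log(log, mapping):
    """Apply the abstraction to every trace (case) in the log."""
    return [abstract_trace(trace, mapping) for trace in log]


# A toy low-level log: each inner list is the event sequence of one case.
log = [
    ["open_form", "type_name", "click_submit", "open_invoice", "click_pay"],
    ["open_form", "click_submit", "open_invoice", "click_pay", "open_form", "type_name"],
]

abstracted = abstract_log(log, ABSTRACTION)
distinct_before = {event for trace in log for event in trace}
distinct_after = {event for trace in abstracted for event in trace}

print(abstracted)
print(len(distinct_before), "distinct labels before,", len(distinct_after), "after")
```

In this toy log, five distinct low-level labels collapse into two high-level activities, which is the kind of reduction that can make a discovered model less spaghetti-like. Whether fitness, precision, and complexity actually improve depends on the technique and the log, which is what the paper sets out to measure.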

Source journal
Information Systems
Category: Engineering & Technology, Computer Science: Information Systems
CiteScore: 9.40
Self-citation rate: 2.70%
Articles published: 112
Review time: 53 days
Journal description: Information systems are the software and hardware systems that support data-intensive applications. The journal Information Systems publishes articles concerning the design and implementation of languages, data models, process models, algorithms, software and hardware for information systems. Subject areas include data management issues as presented in the principal international database conferences (e.g., ACM SIGMOD/PODS, VLDB, ICDE and ICDT/EDBT) as well as data-related issues from the fields of data mining/machine learning, information retrieval coordinated with structured data, internet and cloud data management, business process management, web semantics, visual and audio information systems, scientific computing, and data science. Implementation papers having to do with massively parallel data management, fault tolerance in practice, and special purpose hardware for data-intensive systems are also welcome. Manuscripts from application domains, such as urban informatics, social and natural science, and Internet of Things, are also welcome. All papers should highlight innovative solutions to data management problems, such as new data models and performance enhancements, and show how those innovations contribute to the goals of the application.
Latest articles in this journal
STracker: A framework for identifying sentiment changes in customer feedbacks
Two-level massive string dictionaries
A generative and discriminative model for diversity-promoting recommendation
Soundness unknotted: An efficient soundness checking algorithm for arbitrary cyclic process models by loosening loops
The composition diagram of a complex process: Enhancing understanding of hierarchical business processes