Using Event Log Timing Information to Assist Process Scenario Discoveries

2020 IEEE Third International Conference on Artificial Intelligence and Knowledge Engineering (AIKE). Pub Date: 2020-12-01. DOI: 10.1109/AIKE48582.2020.00017

Zhenyu Zhang, Chunhui Guo, Wenyu Peng, Shangping Ren


Citations: 0

Abstract




Event logs contain abundant information, such as activity names, timestamps, and activity executors. However, much of the existing trace clustering research has focused on using activity names to assist process scenario discovery. In addition, many trace clustering algorithms commonly used in the literature, such as k-means clustering, require prior knowledge of the number of process scenarios in the log, which is sometimes not known a priori. This paper presents a two-phase approach that obtains timing information from event logs and uses it to assist process scenario discovery without requiring any prior knowledge about the process scenarios. We use five real-life event logs to compare the proposed two-phase approach with the commonly used k-means clustering approach in terms of the F1 score, i.e., the harmonic mean of the models' weighted average fitness and precision. The experimental data show that (1) the process scenario models obtained with the additional timing information have higher fitness and precision scores than the models obtained without it; and (2) the two-phase approach not only removes the need for prior information about k, but also achieves an F1 score comparable to that of k-means with the optimal k obtained through exhaustive search.
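The evaluation metric named in the abstract, the harmonic mean of weighted average fitness and precision, can be illustrated with a minimal sketch. The abstract does not say how the average is weighted, so weighting each scenario model by the number of traces in its cluster is an assumption here, and all names and numbers below are hypothetical rather than taken from the paper.

```python
from typing import Sequence


def weighted_average(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Average of per-model scores, weighted by the given weights.

    Assumption: the weights are the trace counts of each cluster.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(scores, weights)) / total


def f1_score(fitness: float, precision: float) -> float:
    """Harmonic mean of fitness and precision; 0.0 when both are zero."""
    if fitness + precision == 0:
        return 0.0
    return 2 * fitness * precision / (fitness + precision)


# Hypothetical example: three discovered process scenarios (trace clusters).
trace_counts = [50, 30, 20]
fitness_per_model = [0.9, 0.8, 1.0]
precision_per_model = [0.6, 0.9, 0.5]

avg_fitness = weighted_average(fitness_per_model, trace_counts)
avg_precision = weighted_average(precision_per_model, trace_counts)
print(f1_score(avg_fitness, avg_precision))
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a set of models scores well only when fitness and precision are both high, which is presumably why a single F1 value is used to compare clustering approaches.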
