
Information Systems: Latest Articles

Temporal graph processing in modern memory hierarchies
IF 3 Zone 2 Computer Science Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-09-21 DOI: 10.1016/j.is.2024.102462
Alexander Baumstark, Muhammad Attahir Jibril, Kai-Uwe Sattler
Updates in a graph DBMS lead to structural changes in the graph over time, with different intermediate states. Capturing these changes and their times is one of the main purposes of a temporal DBMS. Most DBMSs build their temporal features on top of their non-temporal processing and storage without considering the memory hierarchy of the underlying system. This leads to slower temporal processing and poor storage utilization. In this paper, we propose a storage and processing strategy for (bi-)temporal graphs using temporal materialized views (TMV) while exploiting the memory hierarchy of a modern system. Further, we show a solution to the query containment problem for certain types of temporal graph queries. Finally, we evaluate the overhead and performance of the presented approach. The results show that using TMV reduces the runtime of temporal graph queries while using less memory.
Information Systems, vol. 127, Article 102462
Citations: 0
Bridging reading and mapping: The role of reading annotations in facilitating feedback while concept mapping
IF 3 Zone 2 Computer Science Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-09-06 DOI: 10.1016/j.is.2024.102458
Oscar Díaz, Xabier Garmendia

Concept maps are visual tools for organizing knowledge, commonly used in education and design. The process often involves reading and developing conceptual models, where feedback is crucial. Learners (e.g., students, designers) often refer to reading materials, and receive feedback from instructors (e.g., teachers, stakeholders) based on the maps they create. However, annotations made by learners, like highlights, are usually not visible to instructors, limiting tailored feedback. We propose incorporating annotation practices into concept mapping. Learners could highlight text and link these highlights to existing or newly created concepts in their concept map. This way, instructors can access both the concept map and the relevant readings for better feedback. This vision is realized through Concept&Go, a plug-in for the editor CmapCloud. This extension aims at the interplay between mapping, reading, and feedback during concept mapping. The effectiveness of this approach is demonstrated through a focus group (n=5) and a UTAUT evaluation (n=12). Concept&Go is publicly available.

Information Systems, vol. 127, Article 102458
Citations: 0
A universal approach for simplified redundancy-aware cross-model querying
IF 3 Zone 2 Computer Science Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-09-04 DOI: 10.1016/j.is.2024.102456
Pavel Koupil, Daniel Crha, Irena Holubová

Numerous challenges and open problems have appeared with the dawn of multi-model data. In most cases, single-model solutions cannot be straightforwardly extended, and new, efficient approaches must be found. In addition, since there are no standards related to combining and managing multiple models, the situation is even more complicated and confusing for users.

This paper deals with the most important aspect of data management — querying. To enable the user to grasp all the popular models, we base our solution on the abstract categorical representation of multi-model data, which can be viewed as a graph. To unify the querying of multi-model data, we enable the user to query the categorical graph using a SPARQL-based model-agnostic query language called MMQL. The query is then decomposed and translated into languages of the underlying systems. The intermediate results are then combined into the final categorical result that can be expressed in any selected format. The support for cross-model redundancy enables one to create distinct query plans and choose the optimal one. We also introduce a proof-of-concept implementation of our solution called MM-quecat.

Information Systems, vol. 127, Article 102456
Citations: 0
Tri-AL: An open source platform for visualization and analysis of clinical trials
IF 3 Zone 2 Computer Science Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-09-04 DOI: 10.1016/j.is.2024.102459
Pouyan Nahed, Mina Esmail Zadeh Nojoo Kambar, Kazem Taghva, Lukasz Golab

ClinicalTrials.gov hosts an online database with over 440,000 medical studies (as of 2023) evaluating drugs, supplements, medical devices, and behavioral treatments. Target users include scientists, medical researchers, pharmaceutical companies, and other public and private institutions. Although ClinicalTrials has some filtering ability, it does not provide visualization tools, reporting tools or historical data; only the most recent state of each trial is visible to users. To fill these functionality gaps, we present Tri-AL: an open-source data platform for clinical trial visualization, information extraction, historical analysis, and reporting. This paper describes the design and functionality of Tri-AL, including a programmable module to incorporate machine learning models and extract disease-specific data from unstructured trial reports, which we demonstrate using Alzheimer’s disease reporting as a case study. We also highlight the use of Tri-AL for trial participation analysis in terms of sex, gender, race and ethnicity. The source code is publicly available at https://github.com/pouyan9675/Tri-AL.

Information Systems, vol. 127, Article 102459
Citations: 0
Electricity behaviors anomaly detection based on multi-feature fusion and contrastive learning
IF 3 Zone 2 Computer Science Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-09-02 DOI: 10.1016/j.is.2024.102457
Yongming Guan, Yuliang Shi, Gang Wang, Jian Zhang, Xinjun Wang, Zhiyong Chen, Hui Li

Abnormal electricity usage detection is the process of discovering and diagnosing abnormal electricity usage behavior by monitoring and analyzing electricity usage in the power system. How to improve the accuracy of anomaly detection is a popular research topic. Most studies use neural networks for anomaly detection but ignore the effect of missing electricity data on detection performance. Missing value completion is an important way to improve the quality of electricity data and to optimize anomaly detection performance. Moreover, most studies model only the temporal features of electricity data and thereby ignore the potential correlations between spatial features. Therefore, this paper proposes an electricity anomaly detection model based on multi-feature fusion and contrastive learning. The model integrates temporal and spatial features to jointly accomplish electricity anomaly detection. For temporal feature representation learning, an improved bi-directional LSTM is designed to complete missing values in the electricity data, and it is combined with a CNN to capture electricity consumption behavior patterns in the temporal data. For spatial feature representation learning, a GCN and a Transformer are used to fully explore the complex correlations among the data. In addition, to improve anomaly detection performance, this paper also designs a gated fusion module and incorporates the idea of contrastive learning to strengthen the representation ability of electricity data. Finally, we demonstrate through experiments that the proposed method can effectively improve the performance of electricity behavior anomaly detection.

Information Systems, vol. 127, Article 102457
Citations: 0
A framework for measuring the quality of business process simulation models
IF 3 Zone 2 Computer Science Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-08-22 DOI: 10.1016/j.is.2024.102447
David Chapela-Campa, Ismail Benchekroun, Opher Baron, Marlon Dumas, Dmitry Krass, Arik Senderovich

Business Process Simulation (BPS) is an approach to analyze the performance of business processes under different scenarios. For example, BPS allows us to estimate the impact of adding one or more resources on the cycle time of a process. The starting point of BPS is a process model annotated with simulation parameters (a BPS model). BPS models may be manually designed, based on information collected from stakeholders and from empirical observations, or automatically discovered from historical execution data. Regardless of its provenance, a key question when using a BPS model is how to assess its quality. In particular, in a setting where we are able to produce multiple alternative BPS models of the same process, this question becomes: How to determine which model is better, to what extent, and in what respect? In this context, this article studies the question of how to measure the quality of a BPS model with respect to its ability to accurately replicate the observed behavior of a process. Rather than pursuing a one-size-fits-all approach, the article recognizes that a process covers multiple perspectives. Accordingly, the article outlines a framework that can be instantiated in different ways to yield quality measures that tackle different process perspectives. The article defines a number of concrete quality measures and evaluates these measures with respect to their ability to discern the impact of controlled perturbations on a BPS model, and their ability to uncover the relative strengths and weaknesses of two approaches for automated discovery of BPS models. The evaluation shows that the proposed measures not only capture how close a BPS model is to the observed behavior, but they also help us to identify the sources of discrepancies.

Information Systems, vol. 127, Article 102447
Citations: 0
PathEL: A novel collective entity linking method based on relationship paths in heterogeneous information networks
IF 3 Zone 2 Computer Science Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-08-13 DOI: 10.1016/j.is.2024.102433
Lizheng Zu, Lin Lin, Song Fu, Jie Liu, Shiwei Suo, Wenhui He, Jinlei Wu, Yancheng Lv

Collective entity linking always outperforms independent entity linking because it considers the interdependencies among entities. However, existing collective entity linking methods often have high time complexity, do not fully utilize the relationship information in heterogeneous information networks (HIN), and mostly depend on special features associated with Wikipedia. To address these problems, this paper proposes a novel collective entity linking method based on relationship paths in heterogeneous information networks (PathEL). PathEL classifies complex relationships in a HIN into 1-hop paths and 3 types of 2-hop paths, measures entity correlation using the path information among entities, and finally combines textual semantic information to realize collective entity linking. In addition, to cope with the high complexity of collective entity linking, this paper combines a variable sliding window data processing method with a two-step pruning strategy. The variable sliding window limits the number of entity mentions in each window, and the pruning strategy reduces the number of candidate entities. Finally, experimental results on three benchmark datasets verify that the proposed model performs better in entity linking than the baseline models. On the AIDA CoNLL dataset, compared to the second-ranked model, our model improves the P, R, and F1 scores by 1.61%, 1.54%, and 1.57%, respectively.
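
The two complexity-reduction ideas in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not say how the window size varies or what the two pruning steps are, so the fixed cap `max_mentions`, the single top-k pruning step, the prior scores, and the function names `split_into_windows` and `prune_candidates` are all hypothetical.

```python
def split_into_windows(mentions, max_mentions):
    """Group consecutive mentions into windows of at most max_mentions items."""
    if max_mentions < 1:
        raise ValueError("max_mentions must be at least 1")
    return [mentions[i:i + max_mentions]
            for i in range(0, len(mentions), max_mentions)]


def prune_candidates(candidates, top_k):
    """Keep the top_k candidate entities ranked by a (hypothetical) prior score."""
    ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
    return [entity for entity, _ in ranked[:top_k]]


# Illustrative data only: mention strings and scores are made up.
mentions = ["Paris", "France", "Seine", "Louvre", "Eiffel"]
windows = split_into_windows(mentions, max_mentions=2)

candidates = {"Paris_(France)": 0.9, "Paris_(Texas)": 0.3, "Paris_Hilton": 0.5}
kept = prune_candidates(candidates, top_k=2)
```

Capping the mentions per window bounds how many entities must be considered jointly, and pruning bounds the candidates per mention, so the joint search space per window stays small.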

Information Systems, vol. 126, Article 102433
Citations: 0
An incremental algorithm for repairing denial constraint violations
IF 3 Zone 2 Computer Science Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-08-05 DOI: 10.1016/j.is.2024.102435
Lingfeng Bian, Weidong Yang, Ting Xu, Zijing Tan

Data repairing algorithms are extensively studied for improving data quality. Denial constraints (DCs) are commonly employed to state quality specifications that data should satisfy and hence facilitate data repairing since DCs are general enough to subsume many other dependencies. Data in practice are usually frequently updated, which motivates the quest for efficient incremental repairing techniques in response to data updates. In this paper, we present the first incremental algorithm for repairing DC violations. Specifically, given a relational instance I consistent with a set Σ of DCs, and a set △I of tuple insertions to I, our aim is to find a set △I′ of tuple insertions such that Σ is satisfied on I + △I′. We first formalize and prove the complexity of the problem of incremental data repairing with DCs. We then present techniques that combine auxiliary indexing structures to efficiently identify DC violations incurred by △I w.r.t. Σ, and further develop an efficient repairing algorithm to compute △I′ by resolving DC violations. Finally, using both real-life and synthetic datasets, we conduct extensive experiments to demonstrate the effectiveness and efficiency of our approach.
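
To make the incremental setting concrete, here is a minimal sketch of violation detection only, not the paper's algorithm. It assumes a single hypothetical DC over made-up attributes `salary` and `tax` (no tuple may have a higher salary but a lower tax than another), and it uses a naive pairwise scan where the paper relies on auxiliary indexing structures. Because the existing instance is assumed consistent, only pairs involving at least one inserted tuple are checked.

```python
from itertools import product


def violates(t1, t2):
    """Hypothetical DC: forbid t1.salary > t2.salary together with t1.tax < t2.tax."""
    return t1["salary"] > t2["salary"] and t1["tax"] < t2["tax"]


def incremental_violations(instance, delta):
    """Return violating ordered pairs that involve at least one inserted tuple."""
    found = []
    # Inserted tuples against existing tuples, in both orders.
    for new, old in product(delta, instance):
        if violates(new, old):
            found.append((new, old))
        if violates(old, new):
            found.append((old, new))
    # Inserted tuples against each other.
    for a, b in product(delta, delta):
        if a is not b and violates(a, b):
            found.append((a, b))
    return found


# Illustrative data only: the existing instance satisfies the DC.
instance = [
    {"id": 1, "salary": 3000, "tax": 300},
    {"id": 2, "salary": 5000, "tax": 600},
]
delta = [
    {"id": 3, "salary": 4000, "tax": 200},
    {"id": 4, "salary": 6000, "tax": 700},
]
violations = incremental_violations(instance, delta)
violation_ids = sorted((a["id"], b["id"]) for a, b in violations)
```

Here tuple 3 earns more than tuple 1 but pays less tax, so the pair (3, 1) is the only violation; a repair step would then have to modify the inserted tuples to resolve it.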

为提高数据质量,人们对数据修复算法进行了广泛研究。通常采用拒绝约束(DC)来说明数据应满足的质量规范,从而促进数据修复,因为拒绝约束的通用性足以包含许多其他依赖关系。在实践中,数据通常会频繁更新,这就促使人们寻求高效的增量修复技术来应对数据更新。在本文中,我们提出了第一种用于修复违反 DC 的增量算法。具体来说,给定一个与一组 DC Σ 一致的关系实例 I 和一组插入到 I 中的元组 △ I,我们的目标是找到一组插入元组 △ I′,从而在 I+△ I′ 上满足 Σ。我们首先形式化并证明了使用 DC 进行增量数据修复问题的复杂性。然后,我们提出了结合辅助索引结构的技术,以有效识别△ I 对Σ的DC违反,并进一步开发了一种有效的修复算法,通过解决DC违反来计算△ I′。最后,我们使用真实数据集和合成数据集进行了大量实验,以证明我们的方法的有效性和效率。
Information Systems, Volume 126, Article 102435.
Citations: 0
Unveiling the causes of waiting time in business processes from event logs
IF 3 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-08-02 | DOI: 10.1016/j.is.2024.102434
Katsiaryna Lashkevich, Fredrik Milani, David Chapela-Campa, Ihar Suvorau, Marlon Dumas

Waiting times in a business process often arise when a case transitions from one activity to another. Accordingly, analyzing the causes of waiting times in activity transitions can help analysts identify opportunities for reducing the cycle time of a process. This paper proposes a process mining approach to decompose observed waiting times in each activity transition into multiple direct causes and to analyze the impact of each identified cause on the process cycle time efficiency. The approach is implemented as a software tool called Kronos that process analysts can use to upload event logs and obtain analysis results of waiting time causes. The proposed approach was empirically evaluated using synthetic event logs to verify its ability to discover different direct causes of waiting times. The applicability of the approach is demonstrated in a real-life process. Interviews with process mining experts confirm that Kronos is useful and easy to use for identifying improvement opportunities related to waiting times.
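As an illustration of the quantity being analyzed, the sketch below computes the total waiting time per activity transition from an event log. It is not the Kronos implementation and does not decompose waiting time into causes. It assumes a hypothetical log of `(case_id, activity, start, end)` records, and it assumes the activities within a case run sequentially.

```python
from collections import defaultdict
from datetime import datetime


def transition_waiting_times(events):
    """Sum waiting time in seconds for each (from_activity, to_activity) pair.

    events: iterable of (case_id, activity, start, end) with datetime values.
    Waiting time of a transition is the gap between the end of one activity
    and the start of the next activity in the same case.
    """
    by_case = defaultdict(list)
    for case_id, activity, start, end in events:
        by_case[case_id].append((start, end, activity))

    totals = defaultdict(float)
    for trace in by_case.values():
        trace.sort()  # order each case's activity instances by start time
        for (_, prev_end, prev_act), (next_start, _, next_act) in zip(trace, trace[1:]):
            gap = (next_start - prev_end).total_seconds()
            totals[(prev_act, next_act)] += max(gap, 0.0)  # overlaps count as no wait
    return dict(totals)
```

Attributing each gap to a specific cause would need additional information, such as resource assignments and calendars, that this sketch does not model.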

Information Systems, Volume 126, Article 102434. Open access PDF: https://www.sciencedirect.com/science/article/pii/S0306437924000929/pdfft?md5=b33e9c78bfb4c612b6425be5538b1251&pid=1-s2.0-S0306437924000929-main.pdf
Citations: 0
Contrastive learning enhanced by graph neural networks for Universal Multivariate Time Series Representation
IF 3 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-07-25 | DOI: 10.1016/j.is.2024.102429
Xinghao Wang, Qiang Xing, Huimin Xiao, Ming Ye

Analyzing multivariate time series data is crucial for many real-world issues, such as power forecasting, traffic flow forecasting, and industrial anomaly detection. Recently, universal frameworks for time series representation based on representation learning have received widespread attention due to their ability to capture changes in the distribution of time series data. However, existing time series representation learning models, when confronting multivariate time series data, merely apply contrastive learning methods to construct positive and negative samples for each variable at the timestamp level, and then employ a contrastive loss function to encourage the model to learn the similarities among the positive samples and the dissimilarities among the negative samples for each variable. As a result, they fail to fully exploit the latent space dependencies between pairs of variables. To address this problem, we propose the Contrastive Learning Enhanced by Graph Neural Networks for Universal Multivariate Time Series Representation (COGNet), which has three distinctive features. (1) COGNet is a comprehensive self-supervised learning model that combines autoencoders and contrastive learning methods. (2) We introduce graph feature representation blocks on top of the backbone encoder, which extract adjacency features of each variable with other variables. (3) COGNet uses graph contrastive loss to learn graph feature representations. Experimental results across multiple public datasets indicate that COGNet outperforms existing methods in time series prediction and anomaly detection tasks.
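For readers unfamiliar with contrastive losses, the sketch below shows a generic InfoNCE-style loss for one anchor embedding. This is a standard formulation swapped in for illustration. It is not COGNet's graph contrastive loss, whose definition the abstract does not give. The function names and the `temperature` default are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: low when the anchor is closer to the positive
    sample than to every negative sample."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))
```

Minimizing this loss pulls the anchor toward its positive sample and pushes it away from the negatives. A model that learns dependencies between variables would additionally need embeddings that encode those relations, which is outside the scope of this sketch.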

Information Systems, Volume 125, Article 102429.
Citations: 0