Information Systems: Latest Articles, Page 5

Special Issue of CAiSE 2023 Best Papers

IF 3 Zone 2 (Computer Science) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Information Systems

Pub Date: 2024-10-01 DOI: 10.1016/j.is.2024.102469

Iris Reinhartz-Berger, Marta Indulska

Citations: 0

Finding meaningful paths in heterogeneous graphs with PathWays

Pub Date: 2024-09-30 DOI: 10.1016/j.is.2024.102463

Nelly Barret, Antoine Gauquier, Jia-Jean Law, Ioana Manolescu

Graphs, and notably RDF graphs, are a prominent way of sharing data. As data usage democratizes, users need help figuring out the useful content of a graph dataset. In particular, journalists with whom we collaborate are interested in identifying, in a graph, the connections between entities, e.g., people, organizations, emails, etc. We present a novel method for exploring data graphs through their data paths connecting Named Entities (NEs, in short); each data path leads to a tabular-looking set of results. NEs are extracted from the data through dedicated Information Extraction modules. Our method builds upon the pre-existing ConnectionLens platform and follow-up work in the Abstra project, which builds simple, visual ER-style summaries of semi-structured data. The contribution of the present work, and its novelty, is twofold. First, we propose a novel analysis of entity-to-entity paths contained in datasets of any nature, and propose a new method for ranking paths, leveraging a novel Information Extraction (IE) module we built on top of ChatGPT. Second, we present an efficient approach to enumerate and compute NE paths, based on an algorithm which automatically recommends sub-paths to materialize, and rewrites the path queries using these subpaths. Our experiments demonstrate the interest of NE paths and the efficiency of our method for computing and ranking them.
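
The idea of entity-to-entity data paths can be illustrated with a small sketch. This is not the PathWays algorithm (it does no ranking and no sub-path materialization); it only enumerates bounded-length simple paths whose endpoints are Named Entities in a toy graph. The function name `entity_paths`, the `max_len` bound, and the sample triples are all hypothetical.

```python
from collections import defaultdict

def entity_paths(edges, entities, max_len=3):
    """Enumerate simple paths of at most max_len edges linking two entity nodes."""
    adj = defaultdict(list)
    for src, label, dst in edges:
        adj[src].append((label, dst))
        adj[dst].append((label, src))  # edges may be traversed in both directions

    found = []

    def walk(nodes, labels):
        node = nodes[-1]
        if labels and node in entities:
            found.append((tuple(nodes), tuple(labels)))
            return  # stop at the first entity reached
        if len(labels) == max_len:
            return
        for label, nxt in adj[node]:
            if nxt not in nodes:  # keep the path simple (no repeated node)
                walk(nodes + [nxt], labels + [label])

    for start in sorted(entities):
        walk([start], [])
    # every path is discovered from both ends; keep a single orientation
    return sorted(p for p in found if p[0][0] < p[0][-1])

# Hypothetical toy data: (subject, edge label, object) triples
edges = [
    ("alice", "author", "doc1"),
    ("doc1", "mentions", "acme"),
    ("bob", "worksFor", "acme"),
]
entities = {"alice", "acme", "bob"}
paths = entity_paths(edges, entities)
```

Each returned pair holds the node sequence and the edge-label sequence; grouping results by label sequence would give one table-like result set per path shape.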

Citations: 0

Using AI explainable models and handwriting/drawing tasks for psychological well-being

Pub Date: 2024-09-28 DOI: 10.1016/j.is.2024.102465

Francesco Prinzi, Pietro Barbiero, Claudia Greco, Terry Amorese, Gennaro Cordasco, Pietro Liò, Salvatore Vitabile, Anna Esposito

This study addresses the increasing threat to Psychological Well-Being (PWB) posed by Depression, Anxiety, and Stress conditions. Machine learning methods have shown promising results for several psychological conditions. However, the lack of transparency in existing models impedes practical application. The study aims to develop explainable machine learning models for depression, anxiety and stress prediction, focusing on features extracted from tasks involving handwriting and drawing.

Two hundred patients completed the Depression, Anxiety, and Stress Scale (DASS-21) and performed seven tasks related to handwriting and drawing. Extracted features, encompassing pressure, stroke pattern, time, space, and pen inclination, were used to train the explainable-by-design Entropy-based Logic Explained Network (e-LEN) model, employing first-order logic rules for explanation. Performance comparison was performed with XGBoost, enhanced by the SHAP explanation method.

The trained models achieved notable accuracy in predicting depression (0.749 ±0.089), anxiety (0.721 ±0.088), and stress (0.761 ±0.086) through 10-fold cross-validation (repeated 20 times). The e-LEN model’s logic rules facilitated clinical validation, uncovering correlations with existing clinical literature. While performance remained consistent for depression and anxiety on an independent test dataset, a slight degradation was observed for stress prediction in the test task.
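
The evaluation protocol mentioned above (10-fold cross-validation repeated 20 times, reported as mean ± standard deviation) can be sketched in plain Python. This is only an illustration of the protocol, not the study's pipeline: the labels are made up, and a majority-class baseline stands in for the e-LEN and XGBoost models.

```python
import random
import statistics

def repeated_kfold(n_samples, k=10, repeats=20, seed=0):
    """Yield (train_idx, test_idx) pairs for `repeats` independently shuffled k-fold splits."""
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        folds = [order[i::k] for i in range(k)]
        for i, test in enumerate(folds):
            train = [j for f, fold in enumerate(folds) if f != i for j in fold]
            yield train, test

def majority_accuracy(labels, train, test):
    """Accuracy of a baseline that always predicts the majority training label."""
    ones = sum(labels[j] for j in train)
    guess = 1 if 2 * ones >= len(train) else 0
    return sum(labels[j] == guess for j in test) / len(test)

# Hypothetical binary labels for 200 participants (not the study's data)
labels = [1] * 120 + [0] * 80
scores = [majority_accuracy(labels, tr, te) for tr, te in repeated_kfold(len(labels))]
mean_acc = statistics.mean(scores)
std_acc = statistics.stdev(scores)
```

With 10 folds and 20 repeats, `scores` holds 200 fold-level accuracies, so the reported spread reflects variation across both folds and reshuffles.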

Citations: 0

Effective data exploration through clustering of local attributive explanations

Pub Date: 2024-09-28 DOI: 10.1016/j.is.2024.102464

Elodie Escriva, Tom Lefrere, Manon Martin, Julien Aligon, Alexandre Chanson, Jean-Baptiste Excoffier, Nicolas Labroche, Chantal Soulé-Dupuy, Paul Monsarrat

Machine Learning (ML) has become an essential tool for modeling complex phenomena, offering robust predictions and comprehensive data analysis. Nevertheless, the lack of interpretability in these predictions often results in a closed-box effect, which the field of eXplainable Machine Learning (XML) aims to address. Local attributive XML methods, in particular, provide explanations by quantifying the contribution of each attribute to individual predictions, referred to as influences. This type of explanation is the most acute as it focuses on each instance of the dataset and allows the detection of individual differences. Additionally, aggregating local explanations allows for a deeper analysis of the underlying data. In this context, influences can be considered as a new data space to reveal and understand complex data patterns. We hypothesize that these influences, derived from ML explanations, are more informative than the original raw data, especially for identifying homogeneous groups within the data. To identify such groups effectively, we utilize a clustering approach. We compare clusters formed using raw data against those formed using influences computed by various local attributive XML methods. Our findings reveal that clusters based on influences consistently outperform those based on raw data, even when using models with low accuracy.
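
To make the notion of influences as a data space concrete, here is a minimal sketch under a strong assumption: the model is linear, so the contribution of attribute j to an instance can be taken as the weight times the attribute's deviation from the dataset mean. The abstract does not name the explanation methods compared; the weights, data, and function names below are hypothetical.

```python
def linear_influences(weights, rows):
    """Per-instance, per-attribute influences of a linear model, relative to the dataset mean."""
    n = len(rows)
    means = [sum(r[j] for r in rows) / n for j in range(len(weights))]
    return [[w * (x - m) for w, x, m in zip(weights, r, means)] for r in rows]

def predict(weights, bias, row):
    """Prediction of the linear model for one instance."""
    return bias + sum(w * x for w, x in zip(weights, row))

# Hypothetical model that ignores the second, high-variance attribute
weights, bias = [2.0, 0.0], 1.0
rows = [[0.0, 50.0], [1.0, -30.0], [4.0, 10.0], [5.0, -20.0]]
influences = linear_influences(weights, rows)
```

In the raw data the second attribute dominates distances between instances, while in the influence space it contributes nothing, so the two groups along the first attribute become the only visible structure. Any clustering algorithm could then be run on `influences` instead of `rows`.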

Citations: 0

Data Lakehouse: A survey and experimental study

Pub Date: 2024-09-26 DOI: 10.1016/j.is.2024.102460

Ahmed A. Harby, Farhana Zulkernine

Efficient big data management is a dire necessity to manage the exponential growth in data generated by digital information systems to produce usable knowledge. Structured databases, data lakes, and warehouses have each provided a solution with varying degrees of success. However, a new and superior solution, the data Lakehouse, has emerged to extract actionable insights from unstructured data ingested from distributed sources. By combining the strengths of data warehouses and data lakes, the data Lakehouse can process and merge data quickly while ingesting and storing high-speed unstructured data with post-storage transformation and analytics capabilities. The Lakehouse architecture offers the necessary features for optimal functionality and has gained significant attention in the big data management research community. In this paper, we compare data lake, warehouse, and lakehouse systems, highlight their strengths and shortcomings, identify the desired features to handle the evolving challenges in big data management and analysis and propose an advanced data Lakehouse architecture. We also demonstrate the performance of three state-of-the-art data management systems namely HDFS data lake, Hive data warehouse, and Delta lakehouse in managing data for analytical query responses through an experimental study.

Citations: 0

Proactive conformance checking: An approach for predicting deviations in business processes

Pub Date: 2024-09-23 DOI: 10.1016/j.is.2024.102461

Michael Grohs, Peter Pfeiffer, Jana-Rebecca Rehse

Modern business processes are subject to an increasing number of external and internal regulations. Compliance with these regulations is crucial for the success of organizations. To ensure this compliance, process managers can identify and mitigate deviations between the predefined process behavior and the executed process instances by means of conformance checking techniques. However, these techniques are inherently reactive, meaning that they can only detect deviations after they have occurred. It would be desirable to detect and mitigate deviations before they occur, enabling managers to proactively ensure compliance of running process instances. In this paper, we propose Business Process Deviation Prediction (BPDP), a novel predictive approach that relies on a supervised machine learning model to predict which deviations can be expected in the future of running process instances. BPDP is able to predict individual deviations as well as deviation patterns. Further, it provides the user with a list of potential reasons for predicted deviations. Our evaluation shows that BPDP outperforms existing methods for deviation prediction. Following the idea of action-oriented process mining, BPDP thus enables process managers to prevent deviations in early stages of running process instances.
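
The reactive setting that BPDP aims to improve on can be sketched as follows. This is a deliberately simplified stand-in: real conformance checking typically compares traces against a full process model, whereas this sketch only checks a set of allowed directly-follows pairs on an already completed trace. It does not implement BPDP's prediction step, and the activity names and function name are hypothetical.

```python
def find_deviations(trace, allowed_follows, start, ends):
    """Return (position, message) pairs where a finished trace violates the model."""
    if not trace:
        return [(0, "empty trace")]
    found = []
    if trace[0] != start:
        found.append((0, f"unexpected start {trace[0]}"))
    for i in range(len(trace) - 1):
        step = (trace[i], trace[i + 1])
        if step not in allowed_follows:
            found.append((i + 1, f"{step[0]} -> {step[1]} not allowed"))
    if trace[-1] not in ends:
        found.append((len(trace) - 1, f"unexpected end {trace[-1]}"))
    return found

# Hypothetical process: receive, check, then approve and pay, or reject
allowed = {
    ("receive", "check"),
    ("check", "approve"),
    ("check", "reject"),
    ("approve", "pay"),
}
conforming = find_deviations(["receive", "check", "approve", "pay"], allowed, "receive", {"pay", "reject"})
skipped_check = find_deviations(["receive", "approve", "pay"], allowed, "receive", {"pay", "reject"})
```

Because the check runs on a finished trace, the skipped activity is only reported after the fact, which is the limitation a predictive approach targets.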

Citations: 0

Temporal graph processing in modern memory hierarchies

Pub Date: 2024-09-21 DOI: 10.1016/j.is.2024.102462

Alexander Baumstark, Muhammad Attahir Jibril, Kai-Uwe Sattler

Updates in graph DBMS lead to structural changes in the graph over time with different intermediate states. Capturing these changes and their time is one of the main purposes of temporal DBMS. Most DBMSs built their temporal features based on their non-temporal processing and storage without considering the memory hierarchy of the underlying system. This leads to slower temporal processing and poor storage utilization. In this paper, we propose a storage and processing strategy for (bi-) temporal graphs using temporal materialized views (TMV) while exploiting the memory hierarchy of a modern system. Further, we show a solution to the query containment problem for certain types of temporal graph queries. Finally, we evaluate the overhead and performance of the presented approach. The results show that using TMV reduces the runtime of temporal graph queries while using less memory.
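
A toy illustration of the two ideas named above, temporal materialized views and query containment, is given below. It is a sketch under simplifying assumptions, not the paper's design: only one time dimension (valid time) is modeled, everything lives in ordinary Python lists with no memory-hierarchy awareness, and the class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TemporalEdge:
    src: str
    dst: str
    valid_from: int  # inclusive
    valid_to: int    # exclusive

def snapshot(edges, t):
    """Edges that are valid at time point t."""
    return [e for e in edges if e.valid_from <= t < e.valid_to]

class IntervalView:
    """Materialized subset of edges whose validity overlaps [lo, hi)."""

    def __init__(self, edges, lo, hi):
        self.lo, self.hi = lo, hi
        self.edges = [e for e in edges if e.valid_from < hi and e.valid_to > lo]

    def contains(self, t):
        """A time-point query is contained in the view if t lies in [lo, hi)."""
        return self.lo <= t < self.hi

    def snapshot(self, t):
        if not self.contains(t):
            raise ValueError("time point not covered by this view")
        return snapshot(self.edges, t)

# Hypothetical edge history
history = [
    TemporalEdge("a", "b", 0, 5),
    TemporalEdge("b", "c", 3, 10),
    TemporalEdge("c", "d", 20, 30),
]
view = IntervalView(history, 2, 8)
from_view = view.snapshot(4)
from_base = snapshot(history, 4)
```

Any time-point query inside the view's interval gets the same answer from the smaller materialized edge set as from the full history, which is the property a containment check has to guarantee before a query is redirected to a view.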

Citations: 0

Bridging reading and mapping: The role of reading annotations in facilitating feedback while concept mapping

Pub Date: 2024-09-06 DOI: 10.1016/j.is.2024.102458

Oscar Díaz, Xabier Garmendia

Concept maps are visual tools for organizing knowledge, commonly used in education and design. The process often involves reading and developing conceptual models, where feedback is crucial. Learners (e.g., students, designers) often refer to reading materials, and receive feedback from instructors (e.g., teachers, stakeholders) based on the maps they create. However, annotations made by learners, like highlights, are usually not visible to instructors, limiting tailored feedback. We propose incorporating annotation practices into concept mapping. Learners could highlight text and link these highlights to existing or newly created concepts in their concept map. This way, instructors can access both the concept map and the relevant readings for better feedback. This vision is realized through Concept&Go, a plug-in for the editor CmapCloud. This extension aims at the interplay between mapping, reading, and feedback during concept mapping. The effectiveness of this approach is demonstrated through a focus group (n=5) and a UTAUT evaluation (n=12). Concept&Go is publicly available.

Citations: 0

A universal approach for simplified redundancy-aware cross-model querying

Pub Date: 2024-09-04 DOI: 10.1016/j.is.2024.102456

Pavel Koupil, Daniel Crha, Irena Holubová

Numerous challenges and open problems have appeared with the dawn of multi-model data. In most cases, single-model solutions cannot be straightforwardly extended, and new, efficient approaches must be found. In addition, since there are no standards related to combining and managing multiple models, the situation is even more complicated and confusing for users.

This paper deals with the most important aspect of data management — querying. To enable the user to grasp all the popular models, we base our solution on the abstract categorical representation of multi-model data, which can be viewed as a graph. To unify the querying of multi-model data, we enable the user to query the categorical graph using a SPARQL-based model-agnostic query language called MMQL. The query is then decomposed and translated into languages of the underlying systems. The intermediate results are then combined into the final categorical result that can be expressed in any selected format. The support for cross-model redundancy enables one to create distinct query plans and choose the optimal one. We also introduce a proof-of-concept implementation of our solution called MM-quecat.

Citations: 0

Tri-AL: An open source platform for visualization and analysis of clinical trials

Pub Date: 2024-09-04 DOI: 10.1016/j.is.2024.102459

Pouyan Nahed, Mina Esmail Zadeh Nojoo Kambar, Kazem Taghva, Lukasz Golab

ClinicalTrials.gov hosts an online database with over 440,000 medical studies (as of 2023) evaluating drugs, supplements, medical devices, and behavioral treatments. Target users include scientists, medical researchers, pharmaceutical companies, and other public and private institutions. Although ClinicalTrials has some filtering ability, it does not provide visualization tools, reporting tools or historical data; only the most recent state of each trial is visible to users. To fill these functionality gaps, we present Tri-AL: an open-source data platform for clinical trial visualization, information extraction, historical analysis, and reporting. This paper describes the design and functionality of Tri-AL, including a programmable module to incorporate machine learning models and extract disease-specific data from unstructured trial reports, which we demonstrate using Alzheimer’s disease reporting as a case study. We also highlight the use of Tri-AL for trial participation analysis in terms of sex, gender, race and ethnicity. The source code is publicly available at https://github.com/pouyan9675/Tri-AL.

Citations: 0