
Information Systems: Latest publications

Blockchain technology for requirement traceability in systems engineering
IF 3.7 | Zone 2, Computer Science | Q2 Computer Science, Information Systems | Pub Date: 2024-04-05 | DOI: 10.1016/j.is.2024.102384
Mohan S.R. Elapolu, Rahul Rai, David J. Gorsich, Denise Rizzo, Stephen Rapp, Matthew P. Castanier

Requirement engineering (RE), a systematic process of eliciting, defining, analyzing, and managing requirements, is a vital phase in systems engineering. In RE, requirement traceability establishes the relationship between the artifacts and supports requirement validation, change management, and impact analysis. Establishing requirement traceability is challenging, especially in the early stages of a complex system design, as requirements constantly evolve and change. Moreover, the involvement of distributed stakeholders in system development introduces collaboration and trust issues. This paper outlines a novel blockchain-based requirement traceability framework that includes a data acquisition template and graph-based visualization. The template enables dual-level traceability (artifact and object) in the RE processes. The traceability information acquired through the templates is stored in the blockchain, where traces are embedded in blocks’ metadata and data. Furthermore, the blockchain is represented as a Neo4J property graph where traces can be retrieved using Cypher queries, thus enabling a mechanism to query and examine the history of requirements. The framework’s efficacy is showcased by documenting the RE process of an autonomous automotive system. Our results indicated that the framework can record the history of artifacts with constantly changing requirements and can yield secure decentralized ledgers of requirement artifacts. The proposed distributed traceability framework has shown promise to enhance stakeholder collaboration and trust. However, additional user studies should be conducted to bolster our results.
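The abstract mentions retrieving traces from the blockchain, represented as a Neo4J property graph, via Cypher queries. A minimal sketch of that retrieval step using the official neo4j Python driver is shown below; the node labels, relationship type, property names and connection details are invented for illustration and are not the schema used in the paper.

```python
# Illustrative sketch only: query a hypothetical property-graph view of a
# requirement-traceability blockchain. Labels, relationships and properties
# are assumptions, not the paper's schema.
from neo4j import GraphDatabase  # pip install neo4j

URI = "bolt://localhost:7687"      # placeholder Neo4j instance
AUTH = ("neo4j", "password")       # placeholder credentials

# Cypher: collect the blocks that recorded changes to one requirement artifact,
# in chronological (block-height) order.
HISTORY_QUERY = """
MATCH (r:Requirement {id: $req_id})<-[:TRACES]-(b:Block)
RETURN b.height AS height, b.timestamp AS ts, b.payload AS change
ORDER BY b.height
"""

def requirement_history(req_id):
    driver = GraphDatabase.driver(URI, auth=AUTH)
    try:
        with driver.session() as session:
            result = session.run(HISTORY_QUERY, req_id=req_id)
            return [(rec["height"], rec["ts"], rec["change"]) for rec in result]
    finally:
        driver.close()

if __name__ == "__main__":
    for height, ts, change in requirement_history("REQ-042"):
        print(height, ts, change)
```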

Citations: 0
Enjoy the silence: Analysis of stochastic Petri nets with silent transitions
IF 3.7 | Zone 2, Computer Science | Q2 Computer Science, Information Systems | Pub Date: 2024-04-04 | DOI: 10.1016/j.is.2024.102383
Sander J.J. Leemans, Fabrizio Maria Maggi, Marco Montali

Capturing stochastic behaviour in business and work processes is essential to quantitatively understand how nondeterminism is resolved when taking decisions within the process. This is of special interest in process mining, where event data tracking the actual execution of the process are related to process models, and can then provide insights on frequencies and probabilities. Variants of stochastic Petri nets provide a natural formal basis to represent stochastic behaviour and support different data-driven and model-driven analysis tasks in this spectrum. However, when capturing business processes, such nets inherently need a labelling that maps between transitions and activities. In many state-of-the-art process mining techniques, this labelling is not 1-on-1, leading to unlabelled transitions and activities represented by multiple transitions. At the same time, they have to be analysed in a finite-trace semantics, matching the fact that each process execution consists of finitely many steps. These two aspects impede the direct application of existing techniques for stochastic Petri nets, calling for a novel characterisation that incorporates labels and silent transitions in a finite-trace semantics. In this article, we provide such a characterisation starting from generalised stochastic Petri nets and obtaining the framework of labelled stochastic processes (LSPs). On top of this framework, we introduce different key analysis tasks on the traces of LSPs and their probabilities. We show that all such analysis tasks can be solved analytically, in particular reducing them to a single method that combines automata-based techniques to single out the behaviour of interest within an LSP, with techniques based on absorbing Markov chains to reason on their probabilities. Finally, we demonstrate the significance of our approach in the context of stochastic conformance checking, illustrating practical feasibility through a proof-of-concept implementation and its application to different datasets.
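The reduction to absorbing Markov chains is the quantitative core of the analysis. As a minimal, self-contained illustration of that building block (not the authors' implementation, and with made-up transition probabilities), the standard absorption probabilities B = (I - Q)^{-1} R can be computed directly with NumPy:

```python
# Absorbing-Markov-chain sketch: Q holds transient-to-transient transition
# probabilities, R transient-to-absorbing ones (canonical form). Values are
# made up; in the paper's setting the states would come from an LSP.
import numpy as np

Q = np.array([[0.0, 0.5, 0.2],
              [0.1, 0.0, 0.4],
              [0.0, 0.3, 0.0]])
R = np.array([[0.3, 0.0],
              [0.2, 0.3],
              [0.5, 0.2]])

# Fundamental matrix N = (I - Q)^-1 gives expected visits to transient states.
N = np.linalg.inv(np.eye(Q.shape[0]) - Q)
# B[i, j]: probability of eventually being absorbed in state j starting from i.
B = N @ R

print(B)
assert np.allclose(B.sum(axis=1), 1.0)  # each row is a probability distribution
```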

Citations: 0
A chance for models to show their quality: Stochastic process model-log dimensions
IF 3.7 | Zone 2, Computer Science | Q2 Computer Science, Information Systems | Pub Date: 2024-04-02 | DOI: 10.1016/j.is.2024.102382
Adam T. Burke, Sander J.J. Leemans, Moe T. Wynn, Wil M.P. van der Aalst, Arthur H.M. ter Hofstede

Process models describe the desired or observed behaviour of organisations. In stochastic process mining, computational analysis of trace data yields process models which describe process paths and their probability of execution. To understand the quality of these models, and to compare them, quantitative quality measures are used.

This research investigates model comparison empirically, using stochastic process models built from real-life logs. The experimental design collects a large number of models generated randomly and using process discovery techniques. Twenty-five different metrics are taken on these models, using both existing process model metrics and new, exploratory ones. The results are analysed quantitatively, making particular use of principal component analysis.

Based on this analysis, we suggest three stochastic process model dimensions: adhesion, relevance and simplicity. We also suggest possible metrics for these dimensions, and demonstrate their use on example models.
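To make the analysis step concrete, the sketch below runs principal component analysis over a (models × metrics) matrix with scikit-learn; the data are synthetic stand-ins, not the twenty-five metrics or the models used in the study.

```python
# Illustrative PCA over a models-by-metrics matrix. Synthetic values only;
# in the study the columns would be the 25 stochastic process model metrics.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_models, n_metrics = 200, 25
X = rng.normal(size=(n_models, n_metrics))

X_std = StandardScaler().fit_transform(X)   # metrics live on different scales
pca = PCA(n_components=3)
scores = pca.fit_transform(X_std)           # per-model coordinates

print("explained variance ratio:", pca.explained_variance_ratio_)
print("loadings shape:", pca.components_.shape)  # (3, 25): metric weights per component
```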

Citations: 0
The rise of nonnegative matrix factorization: Algorithms and applications
IF 3.7 | Zone 2, Computer Science | Q2 Computer Science, Information Systems | Pub Date: 2024-03-21 | DOI: 10.1016/j.is.2024.102379
Yi-Ting Guo, Qin-Qin Li, Chun-Sheng Liang

Although nonnegative matrix factorization (NMF) is widely used, some matrix factorization methods result in misleading results and waste of computing resources due to a lack of timely optimization and case-by-case consideration. Therefore, an up-to-date and comprehensive review of its algorithms and applications is needed to promote improvement and applications of NMF. Here, we start by introducing the background and gathering the principles and formulae of NMF algorithms. There have been dozens of new algorithms since its birth in the 1990s. Generally, several or more algorithms are adopted in a single software package written in R, Python, C/C++, etc. Besides, the applications of NMF are analyzed. NMF is not only widely used in modern subjects or techniques such as computer science, telecommunications, imaging science, and remote sensing but also increasingly used in traditional subjects such as physics, chemistry, biology, medicine, and psychology, having been accepted by around 130 fields (disciplines) in about 20 years. Finally, the features and performance of different categories of NMF are summarized and evaluated. The summarized advantages and disadvantages, together with the proposed suggestions for improvement, are expected to guide future efforts to refine the mathematical principles and procedures of NMF and realize higher accuracy and productivity in practical use.
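For readers who want a concrete reference point for the kind of algorithm the survey catalogues, the sketch below implements the classic Lee-Seung multiplicative updates for V ≈ WH in NumPy; it is a textbook baseline under the squared Frobenius objective, not any particular method reviewed in the paper.

```python
# Baseline NMF via Lee-Seung multiplicative updates: factorise a nonnegative
# matrix V (m x n) into nonnegative W (m x k) and H (k x n).
import numpy as np

def nmf(V, k, n_iter=500, eps=1e-10, seed=0):
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, k))
    H = rng.random((k, n))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update H, W fixed
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update W, H fixed
    return W, H

V = np.abs(np.random.default_rng(1).normal(size=(30, 20)))
W, H = nmf(V, k=5)
print("reconstruction error:", np.linalg.norm(V - W @ H))
```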

Citations: 0
Cube query interestingness: Novelty, relevance, peculiarity and surprise
IF 3.7 | Zone 2, Computer Science | Q2 Computer Science, Information Systems | Pub Date: 2024-03-21 | DOI: 10.1016/j.is.2024.102381
Dimos Gkitsakis, Spyridon Kaloudis, Eirini Mouselli, Veronika Peralta, Patrick Marcel, Panos Vassiliadis

In this paper, we discuss methods to assess the interestingness of a query in an environment of data cubes. We assume a hierarchical multidimensional database, storing data cubes and level hierarchies. We start with a comprehensive review of related work in the fields of human behavior studies and computer science. We define the interestingness of a query as a vector of scores along different aspects, like novelty, relevance, surprise and peculiarity, and complement this definition with a taxonomy of the information that can be used to assess each of these aspects of interestingness. We provide both syntactic (result-independent) and extensional (result-dependent) checks, measures and algorithms for assessing the different aspects of interestingness in a quantitative fashion. We also report our findings from a user study that we conducted, analyzing the significance of each aspect, its evolution over time and the behavior of the study’s participants.
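As a toy example of one extensional (result-dependent) check, novelty could be scored as the fraction of a query's result cells that no earlier query in the session returned; the definition below is invented for illustration and is not the paper's exact measure.

```python
# Toy extensional novelty score for cube queries: share of result cells
# (coordinate tuples) never returned by earlier queries in the session.
def novelty(current_result, history):
    """current_result: set of result cells; history: list of earlier cell sets."""
    if not current_result:
        return 0.0
    seen = set().union(*history) if history else set()
    return len(current_result - seen) / len(current_result)

q1 = {("2023", "EU", "laptops"), ("2023", "EU", "phones")}
q2 = {("2023", "EU", "phones"), ("2024", "EU", "phones")}
print(novelty(q1, []))    # 1.0 -> everything is new
print(novelty(q2, [q1]))  # 0.5 -> one of two cells was already seen
```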

Citations: 0
A graph neural network with topic relation heterogeneous multi-level cross-item information for session-based recommendation
IF 3.7 | Zone 2, Computer Science | Q2 Computer Science, Information Systems | Pub Date: 2024-03-20 | DOI: 10.1016/j.is.2024.102380
Fan Yang, Dunlu Peng

Session-based recommendation (SBR) mainly analyzes an anonymous user’s historical behavior records to predict the next item the user is likely to interact with and recommends the result to the user. However, due to the anonymity of users and the sparsity of behavior records, recommendation results are often inaccurate. Existing SBR models mainly consider the order of items within a session and rarely analyze the complex transition relationships between items; in addition, they are inadequate at mining higher-order hidden relationships between different sessions. To address these issues, we propose a topic relation heterogeneous multi-level cross-item information graph neural network (TRHMCI-GNN) to improve the performance of recommendation. The model attempts to capture hidden relationships between items through topic classification and builds a topic relation heterogeneous cross-item global graph. The graph contains inter-session cross-item information as well as hidden topic relations among sessions. In addition, a self-loop star graph is established to learn the intra-session cross-item information, and self-connection attributes are added to fuse the information of each item itself. Using a channel-hybrid attention mechanism, the item information of different levels is pooled through two channels, max-pooling and mean-pooling, which effectively fuses the item information of the cross-item global graph and the self-loop star graph. In this way, the model captures the global information of the target item and its individual features, and a label smoothing operation is added for recommendation. Extensive experimental results demonstrate that the recommendation performance of the TRHMCI-GNN model is superior to that of comparable baseline models on the three real datasets Diginetica, Yoochoose1/64 and Tmall. The code is available now.
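The channel-hybrid pooling idea, combining max-pooling and mean-pooling views of the item representations, can be sketched in a few lines of PyTorch; the gating form and dimensions below are illustrative simplifications, and the graph layers and attention of the full TRHMCI-GNN model are omitted.

```python
# Stripped-down sketch of channel-hybrid pooling: fuse max-pooled and
# mean-pooled item embeddings with a learned gate. Illustrative only.
import torch
import torch.nn as nn

class ChannelHybridPooling(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, item_emb):            # item_emb: (batch, n_items, dim)
        max_pool, _ = item_emb.max(dim=1)   # (batch, dim)
        mean_pool = item_emb.mean(dim=1)    # (batch, dim)
        g = torch.sigmoid(self.gate(torch.cat([max_pool, mean_pool], dim=-1)))
        return g * max_pool + (1 - g) * mean_pool

pool = ChannelHybridPooling(dim=64)
sessions = torch.randn(8, 10, 64)           # 8 sessions, 10 items each
print(pool(sessions).shape)                 # torch.Size([8, 64])
```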

Citations: 0
An inter-modal attention-based deep learning framework using unified modality for multimodal fake news, hate speech and offensive language detection
IF 3.7 | Zone 2, Computer Science | Q2 Computer Science, Information Systems | Pub Date: 2024-03-16 | DOI: 10.1016/j.is.2024.102378
Eniafe Festus Ayetiran, Özlem Özgöbek

Fake news, hate speech and offensive language form a related evil triplet currently affecting modern societies. The text modality has been widely used for the computational detection of these phenomena. In recent times, multimodal studies in this direction are attracting a lot of interest because of the potential offered by other modalities in contributing to the detection of these menaces. However, a major problem in multimodal content understanding is how to effectively model the complementarity of the different modalities due to their diverse characteristics and features. From a multimodal point of view, the three tasks have been studied mainly using image and text modalities. Improving the effectiveness of the diverse multimodal approaches is still an open research topic. In addition to the traditional text and image modalities, we consider image–texts, which are rarely used in previous studies but contain useful information for enhancing the effectiveness of a prediction model. In order to ease multimodal content understanding and enhance prediction, we leverage recent advances in computer vision and deep learning for these tasks. First, we unify the modalities by creating a text representation of the images and image–texts, in addition to the main text. Secondly, we propose a multi-layer deep neural network with an inter-modal attention mechanism to model the complementarity among these modalities. We conduct extensive experiments involving three standard datasets covering the three tasks. Experimental results show that detection of fake news, hate speech and offensive language can benefit from this approach. Furthermore, we conduct robust ablation experiments to show the effectiveness of our approach. Our model predominantly outperforms prior works across the datasets.
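The inter-modal attention component, in which one modality's representation attends over another's, can be illustrated with plain single-head scaled dot-product attention; the shapes and the single-head form are simplifications and do not reproduce the paper's architecture.

```python
# Single-head scaled dot-product attention between two modalities, e.g. the
# main text attending over the text rendering of an image. Illustrative only.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def inter_modal_attention(queries, keys, values):
    """queries: (n_q, d) from one modality; keys/values: (n_k, d) from another."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)     # (n_q, n_k) cross-modal affinities
    return softmax(scores, axis=-1) @ values   # (n_q, d) fused representation

text = np.random.default_rng(0).normal(size=(12, 32))        # main-text tokens
image_text = np.random.default_rng(1).normal(size=(6, 32))   # image-text tokens
print(inter_modal_attention(text, image_text, image_text).shape)  # (12, 32)
```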

Citations: 0
SuperGuardian: Superspreader removal for cardinality estimation in data streaming
IF 3.7 | Zone 2, Computer Science | Q2 Computer Science, Information Systems | Pub Date: 2024-02-17 | DOI: 10.1016/j.is.2024.102351
Jie Lu, Hongchang Chen, Penghao Sun, Tao Hu, Zhen Zhang, Quan Ren

Measuring flow cardinality is one of the fundamental problems in data stream mining, where a data stream is modeled as a sequence of items from different flows and the cardinality of a flow is the number of distinct items in the flow. Many existing sketches based on estimator sharing have been proposed to deal with huge flows in data streams. However, these sketches suffer from inefficient memory usage due to allocating the same memory size for each estimator without considering the skewed cardinality distribution. To address this issue, we propose SuperGuardian to improve the memory efficiency of existing sketches. SuperGuardian intelligently separates flows with high-cardinality from the data stream, and keeps the information of these flows with the large estimator, while using existing sketches with small estimators to record low-cardinality flows. We carry out a mathematical analysis for the cardinality estimation error of SuperGuardian. To validate our proposal, we have implemented SuperGuardian and conducted experimental evaluations using real traffic traces. The experimental results show that existing sketches using SuperGuardian reduce error by 79%–96% and increase the throughput by 0.3–2.3 times.
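The core idea, promoting high-cardinality flows (superspreaders) to their own large estimators while the rest share a small structure, can be caricatured as follows; the threshold, the promotion rule and the use of exact sets in place of real estimators are placeholders, not SuperGuardian's actual design.

```python
# Caricature of superspreader separation for cardinality estimation.
# Plain sets stand in for the per-flow estimators; THRESHOLD is arbitrary.
THRESHOLD = 100

small_table = {}   # flow_id -> small shared-structure estimator (here: a set)
guardian = {}      # flow_id -> dedicated large estimator for superspreaders

def observe(flow_id, item):
    if flow_id in guardian:
        guardian[flow_id].add(item)
        return
    small_table.setdefault(flow_id, set()).add(item)
    if len(small_table[flow_id]) > THRESHOLD:      # promote a superspreader
        guardian[flow_id] = small_table.pop(flow_id)

def cardinality(flow_id):
    if flow_id in guardian:
        return len(guardian[flow_id])
    return len(small_table.get(flow_id, set()))

for i in range(500):
    observe("flow-A", f"dst-{i}")   # heavy, high-cardinality flow
observe("flow-B", "dst-1")          # small flow
print(cardinality("flow-A"), cardinality("flow-B"))   # 500 1
```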

Citations: 0
A survey for managing temporal data in RDF
IF 3.7 | Zone 2, Computer Science | Q2 Computer Science, Information Systems | Pub Date: 2024-02-17 | DOI: 10.1016/j.is.2024.102368
Di Wu, Hsien-Tseng Wang, Abdullah Uz Tansel

The Internet serves not only as a platform for communication, transactions, and cloud storage, but also as a vast knowledge store where both people and machines can create, manipulate, infer, and utilize data and knowledge. The Semantic Web was developed to facilitate this purpose, enabling machines to understand the meaning of data and knowledge for use in decision-making. The Resource Description Framework (RDF) forms the foundation of the Semantic Web, which is organized into layers known as the Semantic Web Layer Cake. However, RDF’s basic construct is a binary relationship in the format of <subject, predicate, object>. Representing higher-order relationships with RDF requires reification, which can be cumbersome. Time-varying data is prevalent, but cannot be adequately represented using only binary relationships. We conducted a detailed review of the literature on extending RDF with temporal data, comparing approaches for representation, querying, storage, implementation, and evaluation. In addition, we briefly reviewed approaches for extending RDF with spatial, probability, and other dimensions in conjunction with temporal data.
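To make the reification point concrete, the rdflib sketch below attaches a validity interval to a single triple via standard RDF reification; the ex:validFrom and ex:validTo predicates are invented for illustration and do not correspond to any specific temporal-RDF proposal surveyed in the paper.

```python
# Minimal rdflib sketch: one base triple plus a reified statement that carries
# an (invented) validity interval. Illustrative only.
from rdflib import Graph, Literal, Namespace, BNode
from rdflib.namespace import RDF, XSD

EX = Namespace("http://example.org/")
g = Graph()
g.bind("ex", EX)

# Base (atemporal) binary relationship: <alice, worksFor, acme>
g.add((EX.alice, EX.worksFor, EX.acme))

# Reified statement carrying the time dimension.
stmt = BNode()
g.add((stmt, RDF.type, RDF.Statement))
g.add((stmt, RDF.subject, EX.alice))
g.add((stmt, RDF.predicate, EX.worksFor))
g.add((stmt, RDF.object, EX.acme))
g.add((stmt, EX.validFrom, Literal("2020-01-01", datatype=XSD.date)))
g.add((stmt, EX.validTo, Literal("2023-06-30", datatype=XSD.date)))

print(g.serialize(format="turtle"))
```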

Citations: 0
Temporal representation and reasoning in data-intensive systems
IF 3.7 | Zone 2, Computer Science | Q2 Computer Science, Information Systems | Pub Date: 2024-02-06 | DOI: 10.1016/j.is.2024.102350
Alexander Artikis, Roberto Posenato, Stefano Tonetta
{"title":"Temporal representation and reasoning in data-intensive systems","authors":"Alexander Artikis,&nbsp;Roberto Posenato,&nbsp;Stefano Tonetta","doi":"10.1016/j.is.2024.102350","DOIUrl":"https://doi.org/10.1016/j.is.2024.102350","url":null,"abstract":"","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"122 ","pages":"Article 102350"},"PeriodicalIF":3.7,"publicationDate":"2024-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139719596","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0