
Latest publications in Information Systems

Explaining cube measures through Intentional Analytics
IF 3.7 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2023-12-18 | DOI: 10.1016/j.is.2023.102338
Matteo Francia, Stefano Rizzi, Patrick Marcel

The Intentional Analytics Model (IAM) has been devised to couple OLAP and analytics by (i) letting users express their analysis intentions on multidimensional data cubes and (ii) returning enhanced cubes, i.e., multidimensional data annotated with knowledge insights in the form of models (e.g., correlations). Five intention operators were proposed to this end; of these, describe and assess have been investigated in previous papers. In this work we enrich the IAM picture by focusing on the explain operator, whose goal is to provide an answer to the user asking “why does measure m show these values?”; specifically, we consider models that explain m in terms of one or more other measures. We propose a syntax for the operator and discuss how enhanced cubes are built by (i) finding the relationship between m and the other cube measures via regression analysis and cross-correlation, and (ii) highlighting the most interesting one. Finally, we test the operator implementation in terms of efficiency and effectiveness.
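
The relationship mining described above can be illustrated with a small sketch: for a target measure m, fit a per-measure linear regression and compute a Pearson correlation, then rank candidate explanatory measures by fit quality. This is only a plausible reading of the operator's building blocks; the lag-based cross-correlation and the interestingness criterion used in the paper are not reproduced, and all names (explain_measure, the sample cube) are illustrative.

```python
import numpy as np

def explain_measure(cube, target):
    """Rank the other measures of a cube slice by how well they explain `target`.

    `cube` maps measure names to 1-D arrays of equal length (one value per cell
    of the queried cube); `target` is the measure m to explain. Returns a list
    of (measure, r2, slope, intercept, pearson_r) sorted by r2, best first.
    """
    y = np.asarray(cube[target], dtype=float)
    results = []
    for name, values in cube.items():
        if name == target:
            continue
        x = np.asarray(values, dtype=float)
        # Least-squares fit y ~ slope*x + intercept, one explanatory measure at a time.
        slope, intercept = np.polyfit(x, y, 1)
        y_hat = slope * x + intercept
        ss_res = np.sum((y - y_hat) ** 2)
        ss_tot = np.sum((y - y.mean()) ** 2)
        r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
        pearson_r = np.corrcoef(x, y)[0, 1]
        results.append((name, r2, slope, intercept, pearson_r))
    # The "most interesting" relationship here is simply the best-fitting one.
    return sorted(results, key=lambda t: t[1], reverse=True)

# Example: explain the measure "quantity" with the other measures of a cube slice.
cube = {
    "quantity": np.array([10.0, 12.0, 15.0, 18.0, 25.0]),
    "revenue":  np.array([100.0, 125.0, 149.0, 181.0, 251.0]),
    "discount": np.array([0.1, 0.3, 0.2, 0.1, 0.4]),
}
print(explain_measure(cube, "quantity")[0])  # best explanatory measure first
```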

Citations: 0
LSPC: Exploring contrastive clustering based on local semantic information and prototype
IF 3.7 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2023-12-13 | DOI: 10.1016/j.is.2023.102336
Jun-Fen Chen, Lang Sun, Bo-Jun Xie

In recent years, several prominent contrastive learning algorithms, a family of self-supervised learning methods, have been studied extensively; they can efficiently extract useful feature representations from input images by means of data augmentation techniques. How to further partition these representations into meaningful clusters is the issue that deep clustering addresses. In this work, a deep clustering algorithm based on local semantic information and prototypes, referred to as LSPC, is proposed; it aims at learning a group of representative prototypes. Rather than learning the distinguishing characteristics between different images, more attention is given to the essential characteristics of images that may belong to the same latent category. In the training framework, contrastive learning is skillfully combined with the k-means clustering algorithm, and predictions are transformed into soft assignments for end-to-end training. To enable the model to accurately capture the semantic information shared between images, we mine samples similar to the training samples in the embedding space as local semantic information, which effectively increases the similarity between samples belonging to the same cluster. Experimental results show that our algorithm achieves state-of-the-art performance on several commonly used public datasets, and additional experiments prove that this superior clustering performance also extends to large datasets such as ImageNet.
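
Two of the ingredients mentioned above, soft assignments to k-means prototypes and mining of similar samples as local semantic information, can be sketched as follows. This is a minimal illustration under assumed shapes and names (soft_assignments, mine_local_neighbours), not the authors' LSPC implementation.

```python
import numpy as np

def soft_assignments(embeddings, prototypes, temperature=0.5):
    """Turn distances to cluster prototypes into soft (probabilistic) assignments."""
    # Squared Euclidean distance between every embedding and every prototype.
    d2 = ((embeddings[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)
    logits = -d2 / temperature
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def mine_local_neighbours(embeddings, k=3):
    """Indices of the k most similar samples for each sample (cosine similarity)."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)                        # exclude the sample itself
    return np.argsort(-sim, axis=1)[:, :k]

rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))            # embeddings from the contrastive encoder
protos = rng.normal(size=(2, 16))         # prototypes, e.g. k-means centroids
print(soft_assignments(emb, protos).shape)    # (8, 2)
print(mine_local_neighbours(emb, k=3).shape)  # (8, 3)
```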

Citations: 0
Heterogeneous graph neural networks for fraud detection and explanation in supply chain finance
IF 3.7 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2023-12-11 | DOI: 10.1016/j.is.2023.102335
Bin Wu, Kuo-Ming Chao, Yinsheng Li

Discovering fraudulent borrowers in a supply chain is a critical mission for financial service providers. The borrowers' transactions in an ongoing business are inspected to support the providers' decision on whether to lend the money. Because a supply chain business involves multiple participants, borrowers may use sophisticated tricks to cheat, making fraud detection challenging. In this work, we propose a multitask learning framework, MultiFraud, for complex fraud detection with reasonable explanations. Heterogeneous, multi-view information around the entities is leveraged in a detection framework based on heterogeneous graph neural networks. MultiFraud enables multiple domains to share embeddings and enhances the modeling capability for fraud detection. The developed explainer provides comprehensive explanations across multiple graphs. Experimental results on five datasets demonstrate the framework's effectiveness in fraud detection and explanation across domains.
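
For readers unfamiliar with heterogeneous graph neural networks, the sketch below shows a single relation-aware message-passing step in plain NumPy. The node types ("borrower", "invoice") and the per-relation projection matrices are invented for illustration; MultiFraud's actual architecture, multitask heads, and explainer are not reproduced here.

```python
import numpy as np

def hetero_gnn_layer(features, edges, weights):
    """One relation-aware aggregation step over a heterogeneous graph.

    features: {node_type: (n_type, d) array}
    edges:    {(src_type, relation, dst_type): list of (src_idx, dst_idx)}
    weights:  {(src_type, relation, dst_type): (d, d) array}, one projection per relation
    Returns updated node features (mean of projected neighbour messages, plus self).
    """
    out = {t: x.copy() for t, x in features.items()}
    counts = {t: np.ones((x.shape[0], 1)) for t, x in features.items()}
    for (src_t, rel, dst_t), pairs in edges.items():
        W = weights[(src_t, rel, dst_t)]
        for s, d in pairs:
            out[dst_t][d] += features[src_t][s] @ W   # relation-specific message
            counts[dst_t][d] += 1
    return {t: np.tanh(out[t] / counts[t]) for t in out}

rng = np.random.default_rng(1)
feats = {"borrower": rng.normal(size=(3, 4)), "invoice": rng.normal(size=(5, 4))}
edges = {("borrower", "issues", "invoice"): [(0, 0), (0, 1), (1, 2), (2, 4)]}
W = {("borrower", "issues", "invoice"): rng.normal(size=(4, 4))}
updated = hetero_gnn_layer(feats, edges, W)
print(updated["invoice"].shape)  # (5, 4)
```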

Citations: 0
Attention-based multi attribute matrix factorization for enhanced recommendation performance
IF 3.7 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2023-12-09 | DOI: 10.1016/j.is.2023.102334
Dongsoo Jang, Qinglong Li, Chaeyoung Lee, Jaekyeong Kim

In e-commerce platforms, auxiliary information containing several attributes (e.g., price, quality, and brand) can improve recommendation performance. However, previous studies either used a simple combined-embedding approach that did not consider the importance of each attribute embedded in the auxiliary information, or used only some of the attributes, even though user purchasing behavior can vary significantly depending on these attributes. Thus, we propose multi attribute-based matrix factorization (MAMF), which considers the importance of each attribute embedded in various auxiliary information. MAMF obtains more representative and specific attention features of the user and item using a self-attention mechanism. By acquiring attentive representations, MAMF precisely learns high-level interactions between users and items. To evaluate the performance of the proposed MAMF, we conducted extensive experiments using three real-world datasets from amazon.com. The experimental results show that MAMF exhibits excellent recommendation performance compared with various baseline models.
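
A toy version of the attention idea: the item is represented by several attribute embeddings, the user factor attends over them, and the score is a dot product with the attention-weighted item representation. The single-step attention and all names below are illustrative assumptions, not MAMF's exact formulation.

```python
import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def attentive_score(user_vec, item_attr_vecs):
    """Score one user-item pair: attend over the item's attribute embeddings
    (e.g. price, quality, brand), then take a dot product with the user factor."""
    attn_logits = item_attr_vecs @ user_vec      # relevance of each attribute to this user
    attn = softmax(attn_logits)
    item_vec = attn @ item_attr_vecs             # attention-weighted item representation
    return float(user_vec @ item_vec), attn

rng = np.random.default_rng(2)
user = rng.normal(size=8)                 # latent user factor
attrs = rng.normal(size=(3, 8))           # embeddings of 3 item attributes
score, attn = attentive_score(user, attrs)
print(round(score, 3), attn.round(3))     # predicted preference and attribute weights
```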

Citations: 0
An efficient visual exploration approach of geospatial vector big data on the web map
IF 3.7 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2023-12-09 | DOI: 10.1016/j.is.2023.102333
Zebang Liu, Luo Chen, Mengyu Ma, Anran Yang, Zhinong Zhong, Ning Jing

The visual exploration of geospatial vector data has become an increasingly important part of the management and analysis of geospatial vector big data (GVBD). With the rapid growth of data scale, current visualization technologies struggle to realize efficient visual exploration of GVBD even when parallel distributed computing is adopted. To fill this gap, this paper proposes a visual exploration approach for GVBD on the web map. In this approach, we propose a display-driven computing model and combine it with the traditional data-driven computing method to design an adaptive real-time visualization algorithm. At the same time, we design a pixel-quad-R tree spatial index structure. Finally, we realize multilevel real-time interactive visual exploration of GVBD on a single machine by constructing the index offline to support online computation for visualization; all visualization results can be calculated in real time without occupying an external cache. The experimental results show that the approach outperforms current mainstream visualization methods and obtains visualization results at any zoom level within 0.5 s, so it can be well applied to multilevel real-time interactive visual exploration of billion-scale GVBD.
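
The display-driven idea, computing only what is visible so that per-tile cost is bounded by the screen resolution rather than by the data volume, can be sketched as a simple pixel-level aggregation. The pixel-quad-R tree index itself is not reproduced here; the function below is an illustrative assumption, not the paper's algorithm.

```python
import numpy as np

TILE_SIZE = 256  # pixels per web-map tile edge

def rasterize_points(points, bbox, tile_size=TILE_SIZE):
    """Display-driven aggregation: bin point features into the pixels of one tile.

    points: (n, 2) array of (x, y) coordinates; bbox: (xmin, ymin, xmax, ymax).
    Only one count per pixel is kept, so the output size is bounded by the
    screen resolution regardless of how many features fall inside the tile.
    """
    xmin, ymin, xmax, ymax = bbox
    grid = np.zeros((tile_size, tile_size), dtype=np.int64)
    inside = ((points[:, 0] >= xmin) & (points[:, 0] < xmax) &
              (points[:, 1] >= ymin) & (points[:, 1] < ymax))
    pts = points[inside]
    cols = ((pts[:, 0] - xmin) / (xmax - xmin) * tile_size).astype(int)
    rows = ((pts[:, 1] - ymin) / (ymax - ymin) * tile_size).astype(int)
    np.add.at(grid, (rows, cols), 1)      # pixel-level aggregation
    return grid

rng = np.random.default_rng(3)
pts = rng.uniform(0.0, 1.0, size=(100_000, 2))
tile = rasterize_points(pts, (0.0, 0.0, 1.0, 1.0))
print(tile.sum(), tile.shape)             # 100000 (256, 256)
```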

Citations: 0
Validation set sampling strategies for predictive process monitoring
IF 3.7 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2023-12-07 | DOI: 10.1016/j.is.2023.102330
Jari Peeperkorn, Seppe vanden Broucke, Jochen De Weerdt

Previous studies investigating the efficacy of long short-term memory (LSTM) recurrent neural networks in predictive process monitoring and their ability to capture the underlying process structure have raised concerns about their limited ability to generalize to unseen behavior. Event logs often fail to capture the full spectrum of behavior permitted by the underlying processes. To overcome these challenges, this study introduces innovative validation set sampling strategies based on control-flow variant-based resampling. These strategies have undergone extensive evaluation to assess their impact on hyperparameter selection and early stopping, resulting in notable enhancements to the generalization capabilities of trained LSTM models. In addition, this study expands the experimental framework to enable accurate interpretation of underlying process models and provide valuable insights. By conducting experiments with event logs representing process models of varying complexities, this research elucidates the effectiveness of the proposed validation strategies. Furthermore, the extended framework facilitates investigations into the influence of event log completeness on the learning quality of predictive process models. The novel validation set sampling strategies proposed in this study facilitate the development of more effective and reliable predictive process models, ultimately bolstering generalization capabilities and improving the understanding of underlying process dynamics.
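
One plausible reading of a control-flow variant-based resampling strategy is sketched below: traces are grouped by their activity sequence (variant), and whole variants are assigned to either the training or the validation split, so the validation set contains behavior unseen during training. This is illustrative and not necessarily one of the exact strategies evaluated in the paper.

```python
import random
from collections import defaultdict

def variant_based_split(log, val_fraction=0.2, seed=42):
    """Split an event log so that whole control-flow variants (distinct activity
    sequences) go either to training or to validation, never to both.

    `log` is a list of traces; each trace is a list of activity labels.
    """
    variants = defaultdict(list)
    for trace in log:
        variants[tuple(trace)].append(trace)      # group traces by their variant
    keys = list(variants)
    random.Random(seed).shuffle(keys)
    n_val = max(1, int(len(keys) * val_fraction))
    val_keys = set(keys[:n_val])
    train = [t for k in keys[n_val:] for t in variants[k]]
    val = [t for k in val_keys for t in variants[k]]
    return train, val

log = [["a", "b", "c"], ["a", "b", "c"], ["a", "c", "b"], ["a", "d"], ["a", "d"]]
train, val = variant_based_split(log, val_fraction=0.34)
print(len(train), len(val))   # every variant ends up on exactly one side
```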

Citations: 0
On tuning parameters guiding similarity computations in a data deduplication pipeline for customers records: Experience from an R&D project
IF 3.7 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2023-12-04 | DOI: 10.1016/j.is.2023.102323
Witold Andrzejewski, Bartosz Bębel, Paweł Boiński, Robert Wrembel

Data stored in information systems are often erroneous, and duplicate data are one of the typical error types. To discover and handle duplicates, so-called deduplication methods are applied; they are complex and time-costly algorithms. In data deduplication, pairs of records are compared and their similarities are computed. For a given deduplication problem, the challenging tasks are: (1) deciding which similarity measures are the most adequate for the attributes being compared, (2) defining the importance of the attributes being compared, and (3) defining adequate similarity thresholds separating similar from dissimilar pairs of records. In this paper, we summarize our experience gained from a real R&D project run for a large financial institution. In particular, we answer the following three research questions: (1) what are adequate similarity measures for comparing attributes of text data types, (2) what are adequate weights of attributes in the procedure of comparing pairs of records, and (3) what are the similarity thresholds between the classes duplicates, probable duplicates, and non-duplicates? The answers are based on an experimental evaluation of 54 similarity measures for text values. The measures were compared on five different real data sets with different data characteristics and were assessed based on (1) the similarity values they produced for the values being compared and (2) their execution time. Furthermore, we present our method, based on mathematical programming, for computing the weights of attributes and the similarity thresholds for the records being compared. The experimental evaluation of the method and its assessment by experts from the financial institution proved that it is adequate for the deduplication problem at hand. The whole data deduplication pipeline that we have developed has been deployed in the financial institution and runs in their production system, processing batches of over 20 million customer records.
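
A minimal sketch of the scoring step such a pipeline relies on: per-attribute string similarities are combined with attribute weights, and the resulting score is mapped to one of the three classes via two thresholds. The measure (difflib's SequenceMatcher), the weights, and the thresholds below are placeholders; the paper derives weights and thresholds with mathematical programming and evaluates 54 similarity measures.

```python
from difflib import SequenceMatcher

def record_similarity(rec_a, rec_b, weights):
    """Weighted average of per-attribute string similarities for a pair of records."""
    total = sum(weights.values())
    score = 0.0
    for attr, w in weights.items():
        sim = SequenceMatcher(None, rec_a.get(attr, ""), rec_b.get(attr, "")).ratio()
        score += w * sim
    return score / total

def classify_pair(score, t_low=0.6, t_high=0.85):
    """Map a similarity score to one of the three classes used in deduplication."""
    if score >= t_high:
        return "duplicate"
    if score >= t_low:
        return "probable duplicate"
    return "non-duplicate"

a = {"name": "Jan Kowalski", "city": "Poznan", "phone": "600100200"}
b = {"name": "Jan Kowalsky", "city": "Poznań", "phone": "600100200"}
weights = {"name": 0.5, "city": 0.2, "phone": 0.3}   # illustrative, not the tuned weights
s = record_similarity(a, b, weights)
print(round(s, 3), classify_pair(s))
```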

Citations: 0
Foundations and practice of binary process discovery
IF 3.7 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2023-12-01 | DOI: 10.1016/j.is.2023.102339
Tijs Slaats, S. Debois, Christoffer Olling Back, Axel Kjeld Fjelrad Christfort
Citations: 1
Worker similarity-based noise correction for crowdsourcing
IF 3.7 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2023-11-30 | DOI: 10.1016/j.is.2023.102321
Yufei Hu, Liangxiao Jiang, Wenjun Zhang

Crowdsourcing offers a cost-effective way to obtain multiple noisy labels for each instance by employing multiple crowd workers; label integration is then used to infer an integrated label for the instance. Despite the effectiveness of label integration algorithms, a certain degree of noise always remains in the integrated labels, so noise correction algorithms have been proposed to reduce its impact. However, almost all existing noise correction algorithms focus only on individual workers and ignore the correlations among workers. In this paper, we argue that similar workers have similar annotating skills and tend to be consistent in annotating the same or similar instances. Based on this premise, we propose a novel noise correction algorithm called worker similarity-based noise correction (WSNC). First, WSNC exploits the annotating information of similar workers on similar instances to estimate the quality of each label annotated by each worker on each instance. Then, WSNC re-infers the integrated label of each instance based on the qualities of its multiple noisy labels. Finally, WSNC considers an instance whose re-inferred integrated label differs from its original integrated label to be a noise instance and corrects it. Extensive experiments on a large number of simulated datasets and three real-world crowdsourced datasets verify the effectiveness of WSNC.
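
WSNC estimates label quality from similar workers on similar instances; the sketch below uses a simpler stand-in, global agreement between workers, to show the overall shape of the correction step: estimate per-worker quality, re-infer integrated labels by a quality-weighted vote, and flag the instances whose label changes. All names and the quality estimate are illustrative assumptions, not the authors' algorithm.

```python
from collections import defaultdict

def worker_agreement(labels):
    """Estimate each worker's quality as the rate at which their labels agree
    with the other workers' labels on the same instances.

    `labels` is {instance: {worker: label}}.
    """
    agree, total = defaultdict(int), defaultdict(int)
    for ann in labels.values():
        for w, lab in ann.items():
            others = [l for v, l in ann.items() if v != w]
            if not others:
                continue
            agree[w] += sum(l == lab for l in others)
            total[w] += len(others)
    return {w: agree[w] / total[w] for w in total}

def reinfer(labels, quality):
    """Quality-weighted vote per instance; report instances whose label changed."""
    majority = {i: max(set(a.values()), key=list(a.values()).count)
                for i, a in labels.items()}
    corrected = {}
    for i, ann in labels.items():
        votes = defaultdict(float)
        for w, lab in ann.items():
            votes[lab] += quality.get(w, 0.5)
        corrected[i] = max(votes, key=votes.get)
    flips = [i for i in labels if corrected[i] != majority[i]]
    return corrected, flips

labels = {
    "x1": {"w1": 1, "w2": 1, "w3": 0},
    "x2": {"w1": 0, "w2": 1, "w3": 0},
    "x3": {"w1": 1, "w2": 0, "w3": 0},
}
q = worker_agreement(labels)
print(reinfer(labels, q))
```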

Citations: 0
A novel self-supervised graph model based on counterfactual learning for diversified recommendation
IF 3.7 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2023-11-29 | DOI: 10.1016/j.is.2023.102322
Pu Ji, Minghui Yang, Rui Sun

Consumers' needs are becoming increasingly diverse, which has driven the emergence of diversified recommendation systems. However, existing research on diversified recommendation mostly focuses on constructing objective functions rather than on the root cause that limits diversity, namely imbalanced data distribution. This study considers how to balance the data distribution to improve recommendation diversity. We propose a novel self-supervised graph model based on counterfactual learning (SSG-CL) for diversified recommendation. SSG-CL first distinguishes the dominant and disadvantaged categories for each user based on long-tail theory. It then introduces counterfactual learning to construct an auxiliary view with a relatively balanced distribution over the dominant and disadvantaged categories. Next, we conduct contrastive learning between the user-item interaction graph and the auxiliary view as a self-supervised auxiliary task that aims to improve recommendation diversity. SSG-CL leverages a multitask training strategy to jointly optimize the main accuracy-oriented recommendation task and the self-supervised auxiliary task. Finally, we conduct experimental studies on real-world datasets, and the results indicate good SSG-CL performance in terms of both accuracy and diversity.
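
A contrastive objective between the interaction-graph view and a rebalanced auxiliary view is typically an InfoNCE-style loss over matching node embeddings; a minimal NumPy sketch is given below. This is a generic formulation under assumed names and shapes, not necessarily SSG-CL's exact loss or training procedure.

```python
import numpy as np

def info_nce(view_a, view_b, temperature=0.2):
    """Contrastive (InfoNCE) loss between two sets of node embeddings, where row i
    of `view_a` and row i of `view_b` are the same user/item seen in the original
    interaction graph and in the rebalanced auxiliary view."""
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature                   # pairwise similarities
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))              # positives are the matching rows

rng = np.random.default_rng(4)
graph_view = rng.normal(size=(6, 32))                     # embeddings from the interaction graph
aux_view = graph_view + 0.1 * rng.normal(size=(6, 32))    # counterfactual auxiliary view
print(round(info_nce(graph_view, aux_view), 4))
# In a multitask setup, this loss would be added with a weight to the main
# accuracy-oriented recommendation loss.
```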

Citations: 0