In this paper, we discuss methods to assess the interestingness of a query in an environment of data cubes. We assume a hierarchical multidimensional database that stores data cubes and level hierarchies. We start with a comprehensive review of related work in the fields of human behavior studies and computer science. We define the interestingness of a query as a vector of scores along different aspects, such as novelty, relevance, surprise and peculiarity, and complement this definition with a taxonomy of the information that can be used to assess each of these aspects. We provide both syntactic (result-independent) and extensional (result-dependent) checks, measures and algorithms for assessing the different aspects of interestingness quantitatively. We also report the findings of a user study that we conducted, analyzing the significance of each aspect, its evolution over time and the behavior of the study’s participants.
Cube query interestingness: Novelty, relevance, peculiarity and surprise. Dimos Gkitsakis, Spyridon Kaloudis, Eirini Mouselli, Veronika Peralta, Patrick Marcel, Panos Vassiliadis. Information Systems, vol. 123, Article 102381. Pub Date: 2024-03-21. DOI: 10.1016/j.is.2024.102381
Pub Date: 2024-03-20. DOI: 10.1016/j.is.2024.102380
Fan Yang, Dunlu Peng
Session-based recommendation (SBR) analyzes an anonymous user’s historical behavior records to predict the next item the user is likely to interact with and recommends it to the user. However, because users are anonymous and behavior records are sparse, recommendation results are often inaccurate. Existing SBR models mainly consider the order of items within a session and rarely analyze the complex transition relationships between items; they are also inadequate at mining higher-order hidden relationships between different sessions. To address these issues, we propose a topic relation heterogeneous multi-level cross-item information graph neural network (TRHMCI-GNN) to improve recommendation performance. The model attempts to capture hidden relationships between items through topic classification and builds a topic relation heterogeneous cross-item global graph, which contains inter-session cross-item information as well as hidden topic relations among sessions. In addition, a self-loop star graph is established to learn the intra-session cross-item information, and self-connection attributes are added to fuse the information of each item itself. A channel-hybrid attention mechanism pools the item information of different levels through two channels, max-pooling and mean-pooling, which effectively fuses the item information of the cross-item global graph and the self-loop star graph. In this way, the model captures both the global information of the target item and its individual features, and a label smoothing operation is added for recommendation. Extensive experimental results demonstrate that TRHMCI-GNN outperforms comparable baseline models on three real datasets: Diginetica, Yoochoose1/64 and Tmall. The code is available.
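The two pooling channels mentioned above can be illustrated with a minimal sketch. It only shows element-wise max-pooling and mean-pooling over a set of item vectors; joining the two results by concatenation is an assumption made here for illustration, the function names are hypothetical, and the attention mechanism of TRHMCI-GNN itself is not reproduced.

```python
def max_pool(vectors):
    """Element-wise maximum over a list of equal-length vectors."""
    return [max(column) for column in zip(*vectors)]


def mean_pool(vectors):
    """Element-wise mean over a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def two_channel_pool(vectors):
    """Concatenate the max-pooled and mean-pooled summaries (assumed fusion)."""
    return max_pool(vectors) + mean_pool(vectors)


# Two toy item embeddings of dimension 2.
items = [[1.0, 4.0], [3.0, 2.0]]
pooled = two_channel_pool(items)
```

Max-pooling keeps the strongest signal per dimension, while mean-pooling keeps the average tendency, so the two channels summarize the same items in complementary ways.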
A graph neural network with topic relation heterogeneous multi-level cross-item information for session-based recommendation. Information Systems, vol. 123, Article 102380.
Pub Date: 2024-03-16. DOI: 10.1016/j.is.2024.102378
Eniafe Festus Ayetiran, Özlem Özgöbek
Fake news, hate speech and offensive language are three related ills currently affecting modern societies. The text modality has been widely used for the computational detection of these phenomena. Recently, multimodal studies in this direction have attracted considerable interest because of the potential of other modalities to contribute to the detection of these menaces. However, a major problem in multimodal content understanding is how to effectively model the complementarity of the different modalities, given their diverse characteristics and features. From a multimodal point of view, the three tasks have been studied mainly using image and text modalities, and improving the effectiveness of the diverse multimodal approaches is still an open research topic. In addition to the traditional text and image modalities, we consider image–texts, which are rarely used in previous studies but contain useful information for enhancing the effectiveness of a prediction model. To ease multimodal content understanding and enhance prediction, we leverage recent advances in computer vision and deep learning for these tasks. First, we unify the modalities by creating a text representation of the images and image–texts, in addition to the main text. Second, we propose a multi-layer deep neural network with an inter-modal attention mechanism to model the complementarity among these modalities. We conduct extensive experiments involving three standard datasets covering the three tasks. Experimental results show that the detection of fake news, hate speech and offensive language can benefit from this approach. Furthermore, we conduct robust ablation experiments to show the effectiveness of our approach. Our model outperforms prior work in most cases across the datasets.
An inter-modal attention-based deep learning framework using unified modality for multimodal fake news, hate speech and offensive language detection. Information Systems, vol. 123, Article 102378. Open-access PDF: https://www.sciencedirect.com/science/article/pii/S030643792400036X/pdfft?md5=a31db78e16613aefde39a1acfcbb50af&pid=1-s2.0-S030643792400036X-main.pdf
Pub Date: 2024-02-17. DOI: 10.1016/j.is.2024.102351
Jie Lu, Hongchang Chen, Penghao Sun, Tao Hu, Zhen Zhang, Quan Ren
Measuring flow cardinality is one of the fundamental problems in data stream mining, where a data stream is modeled as a sequence of items from different flows and the cardinality of a flow is the number of distinct items in the flow. Many existing sketches based on estimator sharing have been proposed to deal with huge flows in data streams. However, these sketches use memory inefficiently because they allocate the same memory size to each estimator without considering the skewed cardinality distribution. To address this issue, we propose SuperGuardian to improve the memory efficiency of existing sketches. SuperGuardian separates high-cardinality flows from the data stream and keeps the information of these flows in a large estimator, while using existing sketches with small estimators to record low-cardinality flows. We carry out a mathematical analysis of the cardinality estimation error of SuperGuardian. To validate our proposal, we have implemented SuperGuardian and conducted experimental evaluations using real traffic traces. The experimental results show that existing sketches using SuperGuardian reduce error by 79%–96% and increase throughput by 0.3–2.3 times.
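To make the definition of flow cardinality concrete, the following minimal sketch computes exact cardinalities with one set per flow. It is an exact baseline for illustration only, not SuperGuardian or any of the sketches discussed in the abstract; the function name `flow_cardinalities` and the `(flow_id, item)` layout of the stream are assumptions.

```python
from collections import defaultdict


def flow_cardinalities(stream):
    """Return {flow_id: number of distinct items} for (flow_id, item) pairs."""
    seen = defaultdict(set)
    for flow_id, item in stream:
        # A set ignores repeated items, so only distinct items are counted.
        seen[flow_id].add(item)
    return {flow_id: len(items) for flow_id, items in seen.items()}


# Toy stream: flow "a" sends items 1, 2, 1, 3; flow "b" sends item 7 twice.
stream = [("a", 1), ("a", 2), ("a", 1), ("b", 7), ("a", 3), ("b", 7)]
cards = flow_cardinalities(stream)
```

Exact counting needs memory proportional to the number of distinct items per flow, which is why streaming systems fall back on approximate estimators.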
SuperGuardian: Superspreader removal for cardinality estimation in data streaming. Information Systems, vol. 122, Article 102351.
Pub Date: 2024-02-17. DOI: 10.1016/j.is.2024.102368
Di Wu, Hsien-Tseng Wang, Abdullah Uz Tansel
The Internet serves not only as a platform for communication, transactions, and cloud storage, but also as a vast knowledge store where both people and machines can create, manipulate, infer, and utilize data and knowledge. The Semantic Web was developed to facilitate this purpose, enabling machines to understand the meaning of data and knowledge for use in decision-making. The Resource Description Framework (RDF) forms the foundation of the Semantic Web, which is organized into layers known as the Semantic Web Layer Cake. However, RDF’s basic construct is a binary relationship in the format of <subject predicate object>. Representing higher-order relationships with RDF requires reification, which can be cumbersome. Time-varying data is prevalent, but cannot be adequately represented using only binary relationships. We conducted a detailed review of the literature on extending RDF with temporal data, comparing approaches for representation, querying, storage, implementation, and evaluation. In addition, we briefly reviewed approaches for extending RDF with spatial, probability, and other dimensions in conjunction with temporal data.
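The cost of reification can be seen in a small sketch that attaches a validity interval to a single statement. It uses plain Python tuples rather than an RDF library; `rdf:Statement`, `rdf:subject`, `rdf:predicate` and `rdf:object` are the standard RDF reification terms, while the `ex:` names, including `ex:validFrom` and `ex:validUntil`, are hypothetical and not taken from any approach covered by the survey.

```python
def reify_with_interval(stmt_id, triple, valid_from, valid_until):
    """Describe one triple as a reified statement plus a validity interval."""
    subject, predicate, obj = triple
    return [
        (stmt_id, "rdf:type", "rdf:Statement"),
        (stmt_id, "rdf:subject", subject),
        (stmt_id, "rdf:predicate", predicate),
        (stmt_id, "rdf:object", obj),
        # Hypothetical temporal properties attached to the statement node.
        (stmt_id, "ex:validFrom", valid_from),
        (stmt_id, "ex:validUntil", valid_until),
    ]


triples = reify_with_interval(
    "ex:stmt1",
    ("ex:alice", "ex:worksFor", "ex:acme"),
    "2020-01-01",
    "2022-12-31",
)
```

One time-stamped fact expands into six triples here, which gives a sense of why plain reification is considered cumbersome for temporal data.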
A survey for managing temporal data in RDF. Information Systems, vol. 122, Article 102368.
Pub Date: 2024-02-06. DOI: 10.1016/j.is.2024.102367
Iratxe Pinedo, Mikel Larrañaga, Ana Arruarte
The number of scientific publications worldwide is increasing at a rate of approximately 4%–5% per year. This growth has created a need for tools that help find relevant, high-quality publications. To address this need, search and reference management tools that include some recommendation algorithms have been developed. However, many of these solutions are proprietary tools, and the full potential of recommender systems is rarely exploited. Some solutions provide recommendations for specific domains by using ad-hoc resources, and some other systems do not consider any personalization strategy to generate the recommendations. This paper presents ArZiGo, a web-based full prototype system for the search, management, and recommendation of scientific articles, which feeds on the Semantic Scholar Open Research Corpus, a continually growing corpus with more than 190M papers from all fields of science so far. ArZiGo combines different recommendation approaches within a hybrid system, in a configurable way, to recommend the papers that best suit the preferences of the users. A group of 30 human experts participated in the evaluation of 500 recommendations in 10 research areas, 7 of which belong to Computer Science and 3 to Medicine, obtaining quite satisfactory results. Besides the appropriateness of the recommended articles, the execution time of the implemented algorithms has also been analyzed.
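One common way to combine several recommenders in a configurable manner is a weighted sum of their scores. The sketch below illustrates that general idea only; it is an assumption that a weighted sum is a suitable stand-in, it does not describe how ArZiGo actually combines its approaches, and the recommender names, paper identifiers and weights are hypothetical.

```python
def hybrid_scores(score_lists, weights):
    """Combine per-recommender scores into one score per paper by weighted sum."""
    combined = {}
    for recommender, scores in score_lists.items():
        weight = weights.get(recommender, 0.0)
        for paper, score in scores.items():
            combined[paper] = combined.get(paper, 0.0) + weight * score
    return combined


def top_k(combined, k):
    """Return the k best papers, breaking ties by paper identifier."""
    return sorted(combined, key=lambda paper: (-combined[paper], paper))[:k]


score_lists = {
    "content": {"p1": 0.5, "p2": 1.0},
    "collaborative": {"p1": 1.0, "p3": 0.5},
}
weights = {"content": 0.5, "collaborative": 0.5}
combined = hybrid_scores(score_lists, weights)
best = top_k(combined, 2)
```

Changing the entries of `weights` shifts the balance between the recommenders without touching their individual scoring logic.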
ArZiGo: A recommendation system for scientific articles. Information Systems, vol. 122, Article 102367. Open-access PDF: https://www.sciencedirect.com/science/article/pii/S0306437924000255/pdfft?md5=1cc6db90e90efa1af108cb01ca199a19&pid=1-s2.0-S0306437924000255-main.pdf
Pub Date: 2024-02-05. DOI: 10.1016/j.is.2024.102366
Xu Zhou, Zhuoran Wang, Xuejie Liu, Yanheng Liu, Geng Sun
Point of interest (POI) recommendation algorithms in location-based social networks (LBSNs) can help people find more appealing locations and satisfy their specific demands. However, it is challenging to infer a user’s preferences because the user’s check-in data are sparse. To address this problem and improve recommendation performance, this paper proposes an improved context-aware weighted matrix factorization algorithm for POI recommendation (ICWMF). It exploits the time factor, geographical information, and social relationships to obtain the user’s preference for locations. First, the Ebbinghaus forgetting curve is employed to model the influence of time attenuation, reflecting that user preferences change over time. To assign dynamic weights to unvisited POIs and infer user preference, we build an implicit feedback term by modeling the geographical influence from the user’s perspective and the social relationships. In addition, a Gaussian model is employed to construct a proximity relationship between locations that represents the probability of locations being discovered by users; this relationship is then used as the regularization term to avoid overfitting. Finally, the objective function of weighted matrix factorization is reconstructed with the implicit feedback term and the regularization term we designed. Based on the newly designed objective function, ICWMF learns two latent feature matrices during weighted matrix factorization to achieve better recommendation results. Simulation experiments on the Brightkite and Gowalla datasets indicate that ICWMF outperforms four other comparison methods in terms of precision and recall.
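The time-attenuation idea can be sketched with the commonly used exponential form of the Ebbinghaus forgetting curve, R = exp(-t / S), where t is the elapsed time and S a memory-strength parameter. This is an assumption about the functional form; the abstract does not state the exact formula used by ICWMF, and the function names and the default `strength` of 30 days are hypothetical.

```python
import math


def retention(elapsed_days, strength=30.0):
    """Exponential forgetting curve: 1.0 at t = 0, decaying toward 0."""
    if elapsed_days < 0:
        raise ValueError("elapsed_days must be non-negative")
    return math.exp(-elapsed_days / strength)


def decayed_checkin_score(checkin_ages_days, strength=30.0):
    """Sum of time-decayed weights over a user's check-ins at one location."""
    return sum(retention(age, strength) for age in checkin_ages_days)


recent = decayed_checkin_score([0, 30])   # check-ins today and 30 days ago
stale = decayed_checkin_score([60, 90])   # check-ins 60 and 90 days ago
```

With the same number of check-ins, the location visited recently receives the higher score, which is the behavior a time-attenuation term is meant to capture.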
An improved context-aware weighted matrix factorization algorithm for point of interest recommendation in LBSN. Information Systems, vol. 122, Article 102366. Open-access PDF: https://www.sciencedirect.com/science/article/pii/S0306437924000243/pdfft?md5=820111c51793f02a204ed785f84b746b&pid=1-s2.0-S0306437924000243-main.pdf
Model repair techniques aim to automatically update a process model so that it incorporates behaviors that are observed in reality but do not comply with the original model. Most state-of-the-art techniques focus on the fitness of the repaired models, with the goal of including single anomalous behaviors observed in a log in the form of events. This often hampers the precision of the obtained models, which end up allowing many more behaviors than intended. In the quest for techniques that avoid this over-generalization pitfall, some notion of higher-level anomalous structure is taken into account. The type of structure considered is, however, typically limited to sequences of low-level events. In this work, we introduce a novel repair approach targeting more general high-level anomalous structures. To do this, we exploit instance graph representations of anomalous behaviors, which can be derived from the event log and the original process model. Our experiments show that considering high-level anomalies makes it possible to generate repaired models that incorporate the behaviors of interest while keeping precision and simplicity closer to those of the original model.
Model repair supported by frequent anomalous local instance graphs. Laura Genga, Fabio Rossi, Claudia Diamantini, Emanuele Storti, Domenico Potena. Information Systems, vol. 122, Article 102349. Pub Date: 2024-01-29. DOI: 10.1016/j.is.2024.102349. Open-access PDF: https://www.sciencedirect.com/science/article/pii/S0306437924000073/pdfft?md5=e6a9000a2b961598d7bb3c3022cb43d2&pid=1-s2.0-S0306437924000073-main.pdf
Pub Date: 2024-01-26. DOI: 10.1016/j.is.2023.102319
Asvin Goel
Corrigendum to “BPMN 2.0 OR-Join Semantics: Global and local characterisation” [Information Systems 105 (2022), 101934]. Information Systems, vol. 122, Article 102319. Open-access PDF: https://www.sciencedirect.com/science/article/pii/S0306437923001552/pdfft?md5=8c59069d95a02afdf51ccd60d561d4df&pid=1-s2.0-S0306437923001552-main.pdf