Measuring the decentralisation of DeFi development: An empirical analysis of contributor distribution in Lido
Giuseppe Destefanis, Jiahua Xu, Silvia Bartolucci
Pub Date: 2026-01-29 | DOI: 10.1016/j.is.2026.102695 | Information Systems, Vol. 139, Article 102695
Decentralised finance (DeFi) protocols often claim to implement decentralised governance via mechanisms such as decentralised autonomous organisations (DAOs), yet the structure of their development processes is rarely examined in detail. This study presents an in-depth case analysis of the development activity distribution in Lido, a prominent DeFi liquid staking protocol. We analyse 6741 human-generated GitHub actions recorded from September 2020 to February 2025. Using standard inequality metrics (Gini coefficient and Herfindahl–Hirschman Index) alongside contributors’ interaction networks and core–periphery modelling, we find that development activity is highly concentrated. Overall, the weighted Gini coefficient reaches 0.82, and the most active contributor alone accounts for 24% of the total activity. Despite an even split between core and peripheral contributors, the core group accounts for 98.1% of all weighted development actions. The temporal analysis shows an increase in concentration over time, with the Gini coefficient rising from 0.686 in the bootstrap phase to 0.817 in the maturity phase. The contributors’ interaction network analysis reveals a hub-and-spoke structure with high centralisation in communication flows. Although this is a case study of a single protocol, Lido represents a critical test of decentralisation claims given its prominence, maturity, and DAO governance structure. These findings demonstrate that open-source DeFi development can exhibit highly concentrated control patterns despite decentralised governance mechanisms, revealing a persistent gap between governance and operational decentralisation.
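The two inequality metrics named in the abstract are standard and easy to reproduce. The sketch below (a minimal illustration with made-up activity counts, not the paper’s Lido data) computes the Gini coefficient and the Herfindahl–Hirschman Index over per-contributor action counts.

```python
# Gini coefficient and HHI over per-contributor activity counts.
# The counts below are hypothetical, not the paper's Lido data.

def gini(counts):
    """Gini coefficient of non-negative counts (0 = equal, near 1 = concentrated)."""
    xs = sorted(counts)
    n, total = len(xs), sum(xs)
    # Standard closed form: G = (2 * sum_i i*x_i) / (n * total) - (n + 1) / n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n

def hhi(counts):
    """Herfindahl-Hirschman Index: sum of squared activity shares (1/n ... 1)."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)

actions = [1620, 410, 300, 95, 40, 22, 10, 5, 3, 1]  # hypothetical contributors
print(f"Gini: {gini(actions):.3f}")
print(f"HHI:  {hhi(actions):.3f}")
```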
{"title":"Measuring the decentralisation of DeFi development: An empirical analysis of contributor distribution in Lido","authors":"Giuseppe Destefanis , Jiahua Xu , Silvia Bartolucci","doi":"10.1016/j.is.2026.102695","DOIUrl":"10.1016/j.is.2026.102695","url":null,"abstract":"<div><div>Decentralised finance (DeFi) protocols often claim to implement decentralised governance via mechanisms such as decentralised autonomous organisations (DAOs), yet the structure of their development processes is rarely examined in detail. This study presents an in-depth case analysis of the development activity distribution in Lido, a prominent DeFi liquid staking protocol. We analyse 6741 human-generated GitHub actions recorded from September 2020 to February 2025. Using standard inequality metrics – Gini coefficient and Herfindahl–Hirschman Index – alongside contributors’ interaction network and core–periphery modelling, we find that development activity is highly concentrated. Overall, the weighted Gini coefficient reaches 0.82 and the most active contributor alone accounts for 24% of the total activity. Despite an even split between core and peripheral contributors, the core group accounts for 98.1% of all weighted development actions. The temporal analysis shows an increase in concentration over time, with the Gini coefficient rising from 0.686 in the bootstrap phase to 0.817 in the maturity phase. The contributors’ interaction network analysis reveals a hub-and-spoke structure with high centralisation in communication flows. While a case study of a single protocol, Lido represents a critical test of decentralisation claims given its prominence, maturity, and DAO governance structure. These findings demonstrate that open-source DeFi development can exhibit highly concentrated control patterns despite decentralised governance mechanisms, revealing a persistent gap between governance and operational decentralisation.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"139 ","pages":"Article 102695"},"PeriodicalIF":3.4,"publicationDate":"2026-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146081680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automated decision-making for dynamic task assignment at scale
Riccardo Lo Bianco, Willem van Jaarsveld, Jeroen Middelhuis, Luca Begnardi, Remco Dijkman
Pub Date: 2026-01-22 | DOI: 10.1016/j.is.2026.102694 | Information Systems, Vol. 138, Article 102694
The Dynamic Task Assignment Problem (DTAP) concerns matching resources to tasks in real time while minimizing objectives such as resource cost or task cycle time. In this work, we consider a DTAP variant where every task is a case composed of a stochastic sequence of activities. In this variant, the DTAP involves deciding which employee to assign to which activity so that requests are processed as quickly as possible. In recent years, Deep Reinforcement Learning (DRL) has emerged as a promising tool for tackling this DTAP variant, but most research is limited to solving small-scale, synthetic problems, neglecting the challenges posed by real-world use cases. To bridge this gap, this work proposes a DRL-based Decision Support System (DSS) for real-world-scale DTAPs. To this end, we introduce a DRL agent with two novel elements: a graph structure for observations and actions that can effectively represent any DTAP, and a reward function that is provably equivalent to the objective of minimizing the average cycle time of tasks. The combination of these two novelties allows the agent to learn effective and generalizable assignment policies for real-world-scale DTAPs. The proposed DSS is evaluated on five DTAP instances whose parameters are extracted from real-world logs through process mining. The experimental evaluation shows that the proposed DRL agent matches or outperforms the best baseline on all DTAP instances and generalizes across different time horizons and instances.
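The claim of a reward provably equivalent to minimizing average cycle time has a classical backing: by Little’s law, the time-averaged number of open cases is proportional to the average cycle time. The sketch below shows one standard construction built on that fact; it is our illustrative assumption, not necessarily the authors’ exact reward function.

```python
# Hedged sketch: penalize work-in-progress at every decision epoch.
# By Little's law, minimizing the time-average number of open cases
# also minimizes average cycle time. Illustrative construction only,
# not necessarily the paper's exact reward.

def wip_reward(open_cases: set[str], dt: float) -> float:
    """Reward = negative number of in-flight cases, weighted by elapsed time dt."""
    return -len(open_cases) * dt

# Toy usage: three cases in flight for 2.0 time units, then two for 1.5.
print(wip_reward({"case-1", "case-2", "case-3"}, dt=2.0))  # -6.0
print(wip_reward({"case-1", "case-2"}, dt=1.5))            # -3.0
```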
{"title":"Automated decision-making for dynamic task assignment at scale","authors":"Riccardo Lo Bianco , Willem van Jaarsveld , Jeroen Middelhuis , Luca Begnardi , Remco Dijkman","doi":"10.1016/j.is.2026.102694","DOIUrl":"10.1016/j.is.2026.102694","url":null,"abstract":"<div><div>The Dynamic Task Assignment Problem (DTAP) concerns matching resources to tasks in real time while minimizing some objectives, like resource costs or task cycle time. In this work, we consider a DTAP variant where every task is a case composed of a stochastic sequence of activities. The DTAP, in this case, involves the decision of which employee to assign to which activity to process requests as quickly as possible. In recent years, Deep Reinforcement Learning (DRL) has emerged as a promising tool for tackling this DTAP variant, but most research is limited to solving small-scale, synthetic problems, neglecting the challenges posed by real-world use cases. To bridge this gap, this work proposes a DRL-based Decision Support System (DSS) for real-world scale DTAPs. To this end, we introduce a DRL agent with two novel elements: a graph structure for observations and actions that can effectively represent any DTAP and a reward function that is provably equivalent to the objective of minimizing the average cycle time of tasks. The combination of these two novelties allows the agent to learn effective and generalizable assignment policies for real-world scale DTAPs. The proposed DSS is evaluated on five DTAP instances whose parameters are extracted from real-world logs through process mining. The experimental evaluation shows how the proposed DRL agent matches or outperforms the best baseline in all DTAP instances and generalizes on different time horizons and across instances.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"138 ","pages":"Article 102694"},"PeriodicalIF":3.4,"publicationDate":"2026-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146023096","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A decade of systems for human data interaction
Eugene Wu, Yiru Chen, Haneen Mohammed, Zezhou Huang
Pub Date: 2026-01-21 | DOI: 10.1016/j.is.2026.102689 | Information Systems, Vol. 138, Article 102689
Human–data interaction (HDI) presents fundamentally different challenges from traditional data management. HDI systems must meet latency, correctness, and consistency needs that stem from usability rather than query semantics; failing to meet these expectations breaks the user experience. Moreover, interfaces and systems are tightly coupled; neither can easily be optimized in isolation, and effective solutions demand their co-design. This dependence also presents a research opportunity: rather than adapt systems to interface demands, systems innovations and database theory can also inspire new interaction and visualization designs. We survey a decade of our lab’s work that embraces this coupling and argue that HDI systems are the foundation for reliable, interactive, AI-driven applications.
{"title":"A decade of systems for human data interaction","authors":"Eugene Wu , Yiru Chen , Haneen Mohammed , Zezhou Huang","doi":"10.1016/j.is.2026.102689","DOIUrl":"10.1016/j.is.2026.102689","url":null,"abstract":"<div><div>Human–data interaction (HDI) presents fundamentally different challenges from traditional data management. HDI systems must meet latency, correctness, and consistency needs that stem from usability rather than query semantics; failing to meet these expectations breaks the user experience. Moreover, interfaces and systems are tightly coupled; neither can easily be optimized in isolation, and effective solutions demand their co-design. This dependence also presents a research opportunity: rather than adapt systems to interface demands, systems innovations and database theory can also inspire new interaction and visualization designs. We survey a decade of our lab’s work that embraces this coupling and argue that HDI systems are the foundation for reliable, interactive, AI-driven applications.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"138 ","pages":"Article 102689"},"PeriodicalIF":3.4,"publicationDate":"2026-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146023098","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Efficient data structures for fast and low-cost first-order logic rule mining
Ruoyu Wang, Raymond Wong, Daniel Sun
Pub Date: 2026-01-21 | DOI: 10.1016/j.is.2026.102690 | Information Systems, Vol. 139, Article 102690
Logic rule mining discovers association patterns in the form of logic rules from structured data. Logic rules are widely applied in information systems to assist decisions in an interpretable way. However, state-of-the-art systems require substantial computational resources, as most of them optimize rule mining from the perspectives of algorithms and architecture while overlooking data efficiency. Although some state-of-the-art systems implement customized data structures to improve mining speed, the space overhead of these data structures is unaffordable when processing large-scale knowledge bases. Therefore, in this article, we propose data structures that improve data efficiency and accelerate logic rule mining. Our techniques implicitly represent the Cartesian product of variable substitutions in logic rules and build compact indices for a logic entailment cache. Furthermore, we create a pool and a lookup table for the cache so that cache components are not repeatedly created. The evaluation results show that our techniques reduce memory usage by over 95% and accelerate mining procedures by about 20× on average. Most importantly, mining on large-scale knowledge bases becomes practical on ordinary hardware, where a single thread and 20 GB of memory suffice.
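The core idea of representing the Cartesian product of variable substitutions implicitly, rather than materializing it, can be shown in a few lines. The class below is a hypothetical sketch of ours, not the paper’s data structure: it stores only per-variable candidate sets, yet answers cardinality and membership queries that would otherwise require enumerating every combination.

```python
from itertools import product

class ImplicitCartesianProduct:
    """Sketch of an implicit Cartesian product of substitution sets.

    Stores one set of candidate values per variable instead of
    materializing all combinations (hypothetical illustration,
    not the paper's data structure).
    """

    def __init__(self, factors: dict[str, set]):
        self.factors = factors

    def __len__(self):
        # Cardinality without enumeration: product of factor sizes.
        n = 1
        for values in self.factors.values():
            n *= len(values)
        return n

    def __contains__(self, substitution: dict) -> bool:
        # Membership check per factor: O(#variables), not O(product size).
        return all(substitution.get(var) in values
                   for var, values in self.factors.items())

    def enumerate(self):
        # Materialize combinations only on demand, e.g. for final output.
        names = list(self.factors)
        for combo in product(*(self.factors[v] for v in names)):
            yield dict(zip(names, combo))

subs = ImplicitCartesianProduct({"X": {"alice", "bob"}, "Y": {"p1", "p2", "p3"}})
print(len(subs))                          # 6, computed from factor sizes
print({"X": "alice", "Y": "p2"} in subs)  # True
```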
{"title":"Efficient data structures for fast and low-cost first-order logic rule mining","authors":"Ruoyu Wang , Raymond Wong , Daniel Sun","doi":"10.1016/j.is.2026.102690","DOIUrl":"10.1016/j.is.2026.102690","url":null,"abstract":"<div><div>Logic rule mining discovers association patterns in the form of logic rules from structured data. Logic rules are widely applied in information systems to assist decisions in an interpretable way. However, too many computational resources are required in state-of-the-art systems, as most of these systems optimize rule mining algorithms from the perspectives of algorithms and architecture, while data efficiency has been overlooked. Although some start-of-the-art systems implement customized data structures to improve mining speed, the space overhead of the data structures is unaffordable when processing large-scale knowledge bases. Therefore, in this article, we propose data structures to improve data efficiency and accelerate logic rule mining. Our techniques implicitly represent the Cartesian product of variable substitutions in logic rules and build compact indices for a logic entailment cache. Furthermore, we create a pool and a lookup table for the cache so that cache components will not be repeatedly created. The evaluation results show that over 95% of memory can be reduced by our techniques, and mining procedures have been accelerated by about 20x on average. Most importantly, mining on large-scale knowledge bases is practical on normal hardware where only one thread and 20GB of memory are sufficient even for large-scale knowledge bases.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"139 ","pages":"Article 102690"},"PeriodicalIF":3.4,"publicationDate":"2026-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146049170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MDU-Net: Multi-resolution learning and differential clustering fusion for multivariate electricity time series forecasting
Yongming Guan, Chengdong Zheng, Yuliang Shi, Gang Wang, Linfeng Wu, Zhiyong Chen, Hui Li
Pub Date: 2026-01-19 | DOI: 10.1016/j.is.2026.102693 | Information Systems, Vol. 138, Article 102693
Artificial intelligence (AI) has demonstrated transformative potential in diverse fields such as healthcare, drug discovery, and natural language processing by enabling advanced pattern recognition and predictive modeling of complex data. In power systems, multivariate electricity time series forecasting tasks involving power load, electricity prices, and renewable energy are crucial for grid security and economic dispatch. Contemporary forecasting approaches primarily focus on two aspects: modeling multi-scale periodic characteristics within sequences and capturing complex collaborative dependencies among variables. However, existing techniques often fail to simultaneously disentangle multi-scale features and model the dynamically heterogeneous dependencies between variables. To overcome these limitations, this paper proposes MDU-Net, a novel forecasting framework comprising two core modules: a Multi-resolution hierarchical Union learning (MRU) module and a Differential Channel Clustering Fusion (DCCF) module. The MRU module constructs multi-granularity temporal representations through downsampling and achieves effective cross-scale feature fusion by integrating channel-independent operations with seasonal-trend decomposition. The DCCF module adopts first- and second-order derivative approximations to generate soft clustering mask matrices, adaptively capturing asymmetric collaborative dependencies among variables over time. Experimental results on multiple public datasets (ETT, Electricity) demonstrate that MDU-Net significantly outperforms state-of-the-art baselines in multivariate electricity time series prediction: it achieves 2.7% and 17.1% relative MSE reductions compared to TimeMixer and PatchTST, respectively, with 1.4% and 14.4% lower MAE. Notably, MDU-Net maintains strong generalization capabilities and computational efficiency, and also shows promising performance in cross-domain applications such as traffic forecasting.
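Reading the DCCF module’s “first- and second-order derivative approximations” as discrete differences suggests the loose sketch below: difference features per channel feed a soft channel-similarity mask. The shapes, cosine similarity, and softmax here are our illustrative assumptions, not the published module.

```python
import numpy as np

# Loose sketch (not the paper's DCCF module): approximate first- and
# second-order derivatives of each channel by discrete differences,
# then build a soft channel-clustering mask from derivative features.

def soft_channel_mask(x: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    """x: (channels, time). Returns a (channels, channels) row-stochastic mask."""
    d1 = np.diff(x, n=1, axis=1)                     # first-order difference
    d2 = np.diff(x, n=2, axis=1)                     # second-order difference
    feats = np.concatenate([d1[:, 1:], d2], axis=1)  # align lengths
    # Cosine similarity between channel derivative profiles.
    norm = np.linalg.norm(feats, axis=1, keepdims=True) + 1e-8
    sim = (feats / norm) @ (feats / norm).T
    # Row-wise softmax => soft clustering mask over channels.
    e = np.exp(sim / temperature)
    return e / e.sum(axis=1, keepdims=True)

x = np.cumsum(np.random.default_rng(0).normal(size=(4, 96)), axis=1)
mask = soft_channel_mask(x)
print(mask.shape, mask.sum(axis=1))  # (4, 4), each row sums to 1
```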
{"title":"MDU-Net: Multi-resolution learning and differential clustering fusion for multivariate electricity time series forecasting","authors":"Yongming Guan , Chengdong Zheng , Yuliang Shi , Gang Wang , Linfeng Wu , Zhiyong Chen , Hui Li","doi":"10.1016/j.is.2026.102693","DOIUrl":"10.1016/j.is.2026.102693","url":null,"abstract":"<div><div>Artificial intelligence (AI) has demonstrated transformative potential in diverse fields such as healthcare, drug discovery, and natural language processing by enabling advanced pattern recognition and predictive modeling of complex data. Particularly in the power system, where it involves areas such as power load, electricity price, and renewable energy, the application of AI technology to enhance the multivariate electricity time series forecasting tasks is crucial for grid security and economic dispatch. In power systems, multivariate electricity time series forecasting tasks involving power load, electricity prices, and renewable energy are crucial for grid security and economic dispatch. Contemporary forecasting approaches primarily focus on two aspects: modeling multi-scale periodic characteristics within sequences and capturing complex collaborative dependencies among variables. However, existing techniques often fail to simultaneously disentangle multi-scale features and model the dynamically heterogeneous dependencies between variables. To overcome these limitations, this paper proposes MDU-Net, a novel forecasting framework. The framework comprises two core modules: Multi-resolution hierarchical Union learning (MRU) module and Differential Channel Clustering Fusion (DCCF) Module. The MRU module constructs multi-granularity temporal representations through downsampling and achieves effective cross-scale feature fusion by integrating channel-independent operations with seasonal-trend decomposition. The DCCF module adopts first- and second-order derivative approximations to generate soft clustering mask matrices, adaptively capturing asymmetric collaborative dependencies among different variables over time. Experimental results on multiple public datasets (ETT, Electricity) demonstrate that MDU-Net significantly outperforms state-of-the-art baselines in multivariate electricity time series prediction. it achieves 2.7% and 17.1% relative MSE reductions compared to TimeMixer and PatchTST, respectively, with 1.4% and 14.4% lower MAE. Notably, MDU-Net maintains strong generalization capabilities and computational efficiency. The framework also shows promising performance in cross-domain applications such as traffic forecasting.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"138 ","pages":"Article 102693"},"PeriodicalIF":3.4,"publicationDate":"2026-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146023095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Generalized CALM Theorem for Non-Deterministic Computation in Asynchronous Distributed Systems
Tim Baccaert, Bas Ketsman
Pub Date: 2026-01-16 | DOI: 10.1016/j.is.2026.102691 | Information Systems, Vol. 138, Article 102691
In most asynchronous distributed systems, consistency is achieved through coordination protocols such as Paxos, Raft, and 2PC. In many settings, such protocols are too slow, too difficult to implement, or practically infeasible. The CALM theorem, initially conjectured by Hellerstein, is one of the first results characterizing precisely which problems do not require such a coordination protocol. It states that a problem has a consistent, coordination-free distributed implementation if, and only if, the problem is monotone. This was proven for deterministic problems (i.e., queries) and extends slightly beyond monotone queries for systems in which nodes can consult the data partitioning strategy.
In this work, we generalize the CALM Theorem to work for non-deterministic problems such as leader election. Furthermore, we make the theorem applicable to a wider range of distributed systems. The prior variants of the theorem have only-if directions requiring that systems may only access their identifier in the network, the identifiers of other nodes, and the data partitioning strategy. Our generalization allows us to model systems with arbitrary shared information between the nodes (e.g., network topology, leader nodes, …). It additionally allows us to create a coordination spectrum that classifies how much coordination a problem requires based on how much shared information is needed to compute it. Lastly, we apply this generalized theorem to show that the classes of polynomial time problems and coordination-free problems are not equal.
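The monotonicity criterion at the heart of the theorem can be made concrete: a monotone query’s output over any subset of the input remains valid as facts arrive, so nodes can emit answers without coordinating. The toy example below (our illustration, not from the paper) contrasts a monotone join with a non-monotone negation query whose early answers must be retracted.

```python
# Toy illustration of the CALM criterion (ours, not from the paper):
# a monotone query's output only grows as new facts arrive, so early
# answers never need retraction; a query with negation is not monotone.

def monotone_join(edges):
    """Two-hop reachability: a monotone select-project-join query."""
    return {(a, c) for (a, b1) in edges for (b2, c) in edges if b1 == b2}

def non_monotone(nodes, edges):
    """Nodes with no outgoing edge: uses negation, hence not monotone."""
    return {n for n in nodes if not any(a == n for (a, _) in edges)}

partial = {("a", "b")}
full = {("a", "b"), ("b", "c")}

# Monotone: output over the partial input is a subset of the final output.
assert monotone_join(partial) <= monotone_join(full)

# Non-monotone: the early answer "b has no outgoing edge" is retracted
# once the fact ("b", "c") arrives -- this is what forces coordination.
print(non_monotone({"a", "b", "c"}, partial))  # {'b', 'c'}
print(non_monotone({"a", "b", "c"}, full))     # {'c'}
```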
{"title":"A Generalized CALM Theorem for Non-Deterministic Computation in Asynchronous Distributed Systems","authors":"Tim Baccaert, Bas Ketsman","doi":"10.1016/j.is.2026.102691","DOIUrl":"10.1016/j.is.2026.102691","url":null,"abstract":"<div><div>In most asynchronous distributed systems, consistency is achieved by use of coordination protocols such as Paxos, Raft, and 2PC. In many settings such protocols are too slow, too difficult to implement, or practically infeasible. The CALM theorem, initially conjectured by Hellerstein, is one of the first results characterizing precisely which problems do not require such a coordination protocol. It states that a problem has a consistent, coordination-free distributed implementation if, and only if, the problem is monotone. This was proven for deterministic problems (i.e., queries) and extends slightly beyond monotone queries for systems in which nodes can consult the data partitioning strategy.</div><div>In this work, we generalize the CALM Theorem to work for non-deterministic problems such as leader election. Furthermore, we make the theorem applicable to a wider range of distributed systems. The prior variants of the theorem have only-if directions requiring that systems may only access their identifier in the network, the identifiers of other nodes, and the data partitioning strategy. Our generalization allows us to model systems with arbitrary shared information between the nodes (e.g., network topology, leader nodes, …). It additionally allows us to create a coordination spectrum that classifies how much coordination a problem requires based on how much shared information is needed to compute it. Lastly, we apply this generalized theorem to show that the classes of polynomial time problems and coordination-free problems are not equal.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"138 ","pages":"Article 102691"},"PeriodicalIF":3.4,"publicationDate":"2026-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146023092","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DynaHash: An efficient blocking structure for streaming record linkage
Dimitrios Karapiperis, Christos Tjortjis, Vassilios S. Verykios
Pub Date: 2026-01-16 | DOI: 10.1016/j.is.2026.102692 | Information Systems, Vol. 138, Article 102692
Record linkage holds a crucial position in data management and analysis by identifying and merging records from disparate data sets that pertain to the same real-world entity. As data volumes grow, the intricacies of record linkage amplify, presenting challenges such as potential redundancies and computational complexity. This paper introduces DynaHash, a novel randomized record linkage mechanism that utilizes (a) the MinHash technique to generate compact representations of blocking keys and (b) Hamming Locality-Sensitive Hashing (LSH) to construct the blocking structure from these vectors. By employing these methods, DynaHash offers theoretical accuracy guarantees and, with appropriate parameter tuning, achieves sublinear runtime complexity. It comprises two key components: a persistent storage system that permanently stores the blocking structure to ensure complete results, and an in-memory component that generates very fast partial results by summarizing the persisted blocking structure. Additionally, DynaHash leverages Multi-Probe matching, which scans multiple neighboring blocks in terms of their Hamming distances in order to find matches. Our theoretical work derives a reduction factor in the space requirements, dependent on the Hamming threshold, compared with the baseline LSH. Our experimental evaluation against three state-of-the-art methods on six real-world data sets demonstrates DynaHash’s exceptional recall rates and query times, which are at least 2× faster than those of its competitors and do not depend on the size of the underlying data sets.
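The two building blocks named in the abstract (MinHash signatures over blocking keys, then Hamming LSH over the resulting vectors) compose roughly as sketched below; the hash function, signature length, binarization, and sampled coordinates are our illustrative choices, not DynaHash’s parameters.

```python
import random
import zlib

# Illustrative composition: MinHash signatures over a record's
# blocking-key tokens, then Hamming LSH via bit sampling. All
# parameters here are arbitrary choices, not DynaHash's.

SIG_LEN = 32
SEEDS = range(SIG_LEN)

def minhash(tokens: set[str]) -> list[int]:
    """One minimum per seeded hash => compact signature of the token set."""
    return [min(zlib.crc32(f"{seed}:{t}".encode()) for t in tokens)
            for seed in SEEDS]

def to_bits(signature: list[int]) -> list[int]:
    """Binarize the signature by keeping the lowest bit of each component."""
    return [v & 1 for v in signature]

def lsh_key(bits: list[int], coords: list[int]) -> tuple:
    """Bucket key = values at a few sampled coordinates (Hamming LSH)."""
    return tuple(bits[i] for i in coords)

coords = random.Random(42).sample(range(SIG_LEN), k=8)
a = to_bits(minhash({"john", "smith", "1985"}))
b = to_bits(minhash({"jon", "smith", "1985"}))

# Similar records yield small Hamming distance, so their sampled keys
# collide with high probability; multi-probe additionally scans buckets
# within a small Hamming radius of the key.
print(sum(x != y for x, y in zip(a, b)), "of", SIG_LEN, "bits differ")
print(lsh_key(a, coords), lsh_key(b, coords))
```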
{"title":"DynaHash: An efficient blocking structure for streaming record linkage","authors":"Dimitrios Karapiperis , Christos Tjortjis , Vassilios S. Verykios","doi":"10.1016/j.is.2026.102692","DOIUrl":"10.1016/j.is.2026.102692","url":null,"abstract":"<div><div>Record linkage holds a crucial position in data management and analysis by identifying and merging records from disparate data sets that pertain to the same real-world entity. As data volumes grow, the intricacies of record linkage amplify, presenting challenges, such as potential redundancies and computational complexities. This paper introduces DynaHash, a novel randomized record linkage mechanism that utilizes (a) the MinHash technique to generate compact representations of blocking keys and (b) Hamming Locality-Sensitive Hashing (LSH) to construct the blocking structure from these vectors. By employing these methods, DynaHash offers theoretical guarantees of accuracy and achieves sublinear runtime complexities, with appropriate parameter tuning. It comprises two key components: a persistent storage system for permanently storing the blocking structure to ensure complete results, and an in-memory component for generating very fast partial results by summarizing the persisted blocking structure. Additionally, DynaHash leverages Multi-Probe matching to scan multiple neighboring blocks, in terms of their Hamming distances, in order to find matches. Our theoretical work derives a decrease factor in the space requirements, which depends on the Hamming threshold, compared with the baseline LSH. Our experimental evaluation against three state-of-the-art methods on six real-world data sets demonstrates DynaHash’s exceptional recall rates and query times, which are at least <span><math><mrow><mn>2</mn><mo>×</mo></mrow></math></span> faster than its competitors and do not depend on the size of the underlying data sets.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"138 ","pages":"Article 102692"},"PeriodicalIF":3.4,"publicationDate":"2026-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145977428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Example-driven semantic-similarity-aware query intent discovery: Empowering users to cross the SQL barrier through query by example
Anna Fariha, Lucy Cousins, Narges Mahyar, Alexandra Meliou
Pub Date: 2026-01-12 | DOI: 10.1016/j.is.2026.102687 | Information Systems, Vol. 138, Article 102687
Traditional relational data interfaces require precise structured queries over potentially complex schemas. These rigid data retrieval mechanisms pose hurdles for nonexpert users, who typically lack programming language expertise and are unfamiliar with the details of the schema. Existing tools assist in formulating queries through keyword search, query recommendation, and query auto-completion, but still require some technical expertise. An alternative method for accessing data is query by example (QBE), where users express their data exploration intent simply by providing examples of their intended data and the system infers the intended query. However, existing QBE approaches focus on the structural similarity of the examples and ignore the richer context present in the data. As a result, they typically produce queries that are too general, and fail to capture the user’s intent effectively. In this article, we present SQuID, a system that performs semantic-similarity-aware query intent discovery from user-provided example tuples.
Our work makes the following contributions: (1) We design SQuID: an end-to-end system that automatically formulates select-project-join queries with optional group-by aggregation and intersection operators (a much larger class than prior QBE techniques support) from user-provided examples, in an open-world setting. (2) We express the problem of query intent discovery using a probabilistic abduction model that infers a query as the most likely explanation of the provided examples. (3) We introduce the notion of an abduction-ready database, which precomputes semantic properties and related statistics, allowing SQuID to achieve real-time performance. (4) We present an extensive empirical evaluation on three real-world datasets, including user intent case studies, demonstrating that SQuID is efficient and effective, and outperforms machine learning methods, as well as the state of the art in the related query reverse engineering problem. (5) We contrast SQuID with traditional SQL querying through a comparative user study, which demonstrates that users with varying expertise are significantly more effective and efficient with SQuID than with SQL. We find that SQuID eliminates the barriers in studying the database schema, formalizing task semantics, and writing syntactically correct SQL queries, and thus substantially alleviates the need for technical expertise in data exploration.
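To make the QBE setting concrete, the toy below uses a crude minimality proxy for “most likely explanation”: among candidate queries whose results cover all user examples, prefer the most specific one. This is our illustration of the problem setting, not SQuID’s probabilistic abduction model.

```python
# Toy query-by-example scoring (ours, not SQuID's abduction model):
# among candidates covering all examples, pick the tightest result set.

people = [
    ("curie", "physics"), ("einstein", "physics"),
    ("turing", "cs"), ("knuth", "cs"),
]
examples = {("curie", "physics"), ("einstein", "physics")}

candidates = {
    "all people": lambda: set(people),
    "field = physics": lambda: {p for p in people if p[1] == "physics"},
    "field = cs": lambda: {p for p in people if p[1] == "cs"},
}

# Keep candidates whose results contain every example tuple, then
# prefer the one returning the fewest extra tuples.
valid = {name: q() for name, q in candidates.items() if examples <= q()}
best = min(valid, key=lambda name: len(valid[name]))
print(best)  # "field = physics" -- tightest query covering the examples
```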
{"title":"Example-driven semantic-similarity-aware query intent discovery: Empowering users to cross the SQL barrier through query by example","authors":"Anna Fariha , Lucy Cousins , Narges Mahyar , Alexandra Meliou","doi":"10.1016/j.is.2026.102687","DOIUrl":"10.1016/j.is.2026.102687","url":null,"abstract":"<div><div>Traditional relational data interfaces require precise structured queries over potentially complex schemas. These rigid data retrieval mechanisms pose hurdles for nonexpert users, who typically lack programming language expertise and are unfamiliar with the details of the schema. Existing tools assist in formulating queries through keyword search, query recommendation, and query auto-completion, but still require some technical expertise. An alternative method for accessing data is <em>query by example</em> (QBE), where users express their data exploration intent simply by providing examples of their intended data and the system infers the intended query. However, existing QBE approaches focus on the structural similarity of the examples and ignore the richer context present in the data. As a result, they typically produce queries that are too general, and fail to capture the user’s intent effectively. In this article, we present <span>SQuID</span>, a system that performs <em>semantic-similarity-aware</em> query intent discovery from user-provided example tuples.</div><div>Our work makes the following contributions: (1) We design <span>SQuID</span>: an end-to-end system that automatically formulates select-project-join queries with optional group-by aggregation and intersection operators – a much larger class than what prior QBE techniques support – from user-provided examples, in an open-world setting. (2) We express the problem of query intent discovery using a <em>probabilistic abduction model</em> that infers a query as the most likely explanation of the provided examples. (3) We introduce the notion of an <em>abduction-ready</em> database, which precomputes semantic properties and related statistics, allowing <span>SQuID</span> to achieve real-time performance. (4) We present an extensive empirical evaluation on three real-world datasets, including user intent case studies, demonstrating that <span>SQuID</span> is efficient and effective, and outperforms machine learning methods, as well as the state of the art in the related query reverse engineering problem. (5) We contrast <span>SQuID</span> with traditional <span>SQL</span> querying through a comparative user study, which demonstrates that users with varying expertise are significantly more effective and efficient with <span>SQuID</span> than <span>SQL</span>. We find that <span>SQuID</span> eliminates the barriers in studying the database schema, formalizing task semantics, and writing syntactically correct <span>SQL</span> queries, and, thus, substantially alleviates the need for technical expertise in data exploration.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"138 ","pages":"Article 102687"},"PeriodicalIF":3.4,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146023094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
HLR-SQL: Human-like reasoning for Text-to-SQL with the human in the loop
Timo Eckmann, Matthias Urban, Jan-Micha Bodensohn, Carsten Binnig
Pub Date: 2026-01-02 | DOI: 10.1016/j.is.2025.102670 | Information Systems, Vol. 138, Article 102670
Recent LLM-based approaches have achieved impressive results on Text-to-SQL benchmarks such as Spider and Bird. However, these benchmarks do not accurately reflect the complexity typically encountered in real-world enterprise scenarios, where queries often span multiple tables. In this paper, we introduce HLR-SQL, a new approach designed to handle such complex enterprise SQL queries. Unlike existing methods, HLR-SQL imitates Human-Like Reasoning with LLMs by incrementally composing queries through a sequence of intermediate steps, gradually building up to the full query. This is an extended version of Eckmann et al. (2025); the new contributions are centered on incorporating human feedback directly into the reasoning process of HLR-SQL. We evaluate HLR-SQL on a newly constructed benchmark, Spider-HJ, which systematically increases query complexity by splitting tables in the original Spider dataset to raise the average number of joins that queries require. Our experiments show that state-of-the-art models experience up to a 70% drop in execution accuracy on Spider-HJ, while HLR-SQL achieves a 9.51% improvement over the best existing approaches on the Spider leaderboard. Finally, we extend HLR-SQL to incorporate human feedback directly into the reasoning process by allowing the LLM to selectively ask for human help when faced with ambiguity or execution errors. We demonstrate that including the human in the loop in this way yields significantly higher accuracy, particularly for complex queries.
{"title":"HLR-SQL: Human-like reasoning for Text-to-SQL with the human in the loop","authors":"Timo Eckmann , Matthias Urban , Jan-Micha Bodensohn , Carsten Binnig","doi":"10.1016/j.is.2025.102670","DOIUrl":"10.1016/j.is.2025.102670","url":null,"abstract":"<div><div>Recent LLM-based approaches have achieved impressive results on Text-to-SQL benchmarks such as Spider and Bird. However, these benchmarks do not accurately reflect the complexity typically encountered in real-world enterprise scenarios, where queries often span multiple tables. In this paper, we introduce HLR-SQL, a new approach designed to handle such complex enterprise SQL queries. Unlike existing methods, HLR-SQL imitates <u>H</u>uman-<u>L</u>ike <u>R</u>easoning with LLMs by incrementally composing queries through a sequence of intermediate steps, gradually building up to the full query. This is an extended version of Eckmann et al. (2025). The new contributions are centered around incorporating human feedback directly into the reasoning process of HLR-SQL. We evaluate HLR-SQL on a newly constructed benchmark, Spider-HJ, which systematically increases query complexity by splitting tables in the original Spider dataset to raise the average join count needed by queries. Our experiments show that state-of-the-art models experience up to a 70% drop in execution accuracy on Spider-HJ, while HLR-SQL achieves a 9.51% improvement over the best existing approaches on the Spider leaderboard. Finally, we extended HLR-SQL to incorporate human feedback directly into the reasoning process by allowing the LLM to selectively ask for human help when faced with ambiguity or execution errors. We demonstrate that including the human in the loop in this way yields significantly higher accuracy, particularly for complex queries.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"138 ","pages":"Article 102670"},"PeriodicalIF":3.4,"publicationDate":"2026-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146023097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A visualization-driven decision support system for selecting feature attribution methods
Priscylla Silva, Evandro Ortigossa, Dishita Turakhia, Claudio Silva, Luis Gustavo Nonato
Pub Date: 2026-01-02 | DOI: 10.1016/j.is.2025.102661 | Information Systems, Vol. 138, Article 102661
Feature attribution techniques are crucial for interpreting machine learning models, but practitioners often find it difficult to understand how different methods compare and which one best fits their analytical goals. This difficulty arises from inconsistent results across methods, evaluation metrics that emphasize distinct and sometimes conflicting properties, and subjective preferences that influence how explanation quality is interpreted. In this paper, we introduce Explainalytics, an open-source Python library that transforms this challenging decision-making process into an evidence-based visual analytics workflow. Explainalytics calculates a range of evaluation metrics and presents the results through five coordinated views spanning global to local analysis. Linked filtering, dynamic updates, and brushing allow users to pivot fluidly between global trends and local details, supporting exploratory sense-making rather than rigid pipelines. In a within-subject laboratory study with 10 machine learning practitioners, we compared Explainalytics against a baseline; Explainalytics users experienced significantly lower cognitive workload and reported higher perceived usability.
{"title":"A visualization-driven decision support system for selecting feature attribution methods","authors":"Priscylla Silva , Evandro Ortigossa , Dishita Turakhia , Claudio Silva , Luis Gustavo Nonato","doi":"10.1016/j.is.2025.102661","DOIUrl":"10.1016/j.is.2025.102661","url":null,"abstract":"<div><div>Feature attribution techniques are crucial for interpreting machine learning models, but practitioners often face difficulties to understand how different methods compare and which one best fits their analytical goals. This difficulty arises from inconsistent results across methods, evaluation metrics that emphasize distinct and sometimes conflicting properties, and subjective preferences that influence how explanation quality is interpreted. In this paper, we introduce Explainalytics, an open-source Python library that transforms this challenging decision-making process into an evidence-based visual analytics workflow. Explainalytics calculates a range of evaluation metrics and presents the results through five coordinated views spanning global to local analysis. Linked filtering, dynamic updates, and brushing allow users to pivot fluidly between global trends and local details, supporting exploratory sense-making rather than rigid pipelines. In a within-subject laboratory study with 10 machine learning practitioners, we compared Explainalytics against a baseline. Explainalytics users experienced significantly lower cognitive workload and higher perceived usability.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"138 ","pages":"Article 102661"},"PeriodicalIF":3.4,"publicationDate":"2026-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145977429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}