Information Processing & Management: Latest Articles (Page 2)

Exploring long- and short-term knowledge state graph representations with adaptive fusion for knowledge tracing

IF 7.4, Zone 1 (Management), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Information Processing & Management

Pub Date : 2025-01-25 DOI: 10.1016/j.ipm.2025.104074

Ganfeng Yu , Zhiwen Xie , Guangyou Zhou , Zhuo Zhao , Jimmy Xiangji Huang

Knowledge Tracing (KT) is an important research area in online education that focuses on predicting future academic performance based on students’ historical exercise records. The key to solving the KT problem lies in assessing students’ knowledge states through their responses to concept-related exercises. However, analyzing exercise records from a single perspective does not provide a comprehensive model of student knowledge. In practice, students’ knowledge states often exhibit long- and short-term phenomena, corresponding to long-term knowledge systems and short-term real-time learning, both of which are closely related to learning quality and preferences. Existing studies have often neglected the learning preferences implied by long-term knowledge states and their impact on student performance. Therefore, we introduce a hybrid knowledge tracing model that utilizes both long- and short-term knowledge state representations (L-SKSKT). It enhances KT by fusing these two types of knowledge state representations and measuring their impact on learning quality. L-SKSKT includes a graph construction method designed to model students’ long- and short-term knowledge states. In addition, L-SKSKT incorporates a knowledge state graph embedding model that can effectively capture long- and short-term dependencies, generating corresponding knowledge state representations. Furthermore, we propose a fusion mechanism to integrate these representations and trace their impact on learning outcomes. Extensive empirical results on four benchmark datasets show that our approach achieves the best performance for KT and outperforms various strong baselines by a large margin.
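The abstract does not spell out how the fusion mechanism works. As a rough illustration of what an adaptive fusion of two state vectors can look like, here is a minimal sketch of a scalar sigmoid gate that blends a long-term and a short-term representation. The function name `gated_fusion`, the gate parameters, and the choice of a single scalar gate are all assumptions made for this example; they are not taken from L-SKSKT.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(long_state, short_state, gate_weights, gate_bias=0.0):
    """Blend two equal-length vectors with one learned gate in (0, 1).

    Hypothetical illustration: the gate is computed from the concatenation
    of both states, then used as a convex mixing weight.
    """
    if len(long_state) != len(short_state):
        raise ValueError("both states must have the same dimension")
    concat = list(long_state) + list(short_state)
    if len(gate_weights) != len(concat):
        raise ValueError("gate_weights must match the concatenated dimension")
    gate = sigmoid(sum(w * x for w, x in zip(gate_weights, concat)) + gate_bias)
    fused = [gate * l + (1.0 - gate) * s for l, s in zip(long_state, short_state)]
    return gate, fused


# With all-zero gate parameters the gate is 0.5, so the result is a plain average.
gate, fused = gated_fusion([1.0, 0.0], [0.0, 1.0], [0.0, 0.0, 0.0, 0.0])
```

In a trained model the gate parameters would be learned, so the mix could shift toward whichever representation is more predictive for a given student and exercise.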

Volume 62, Issue 3, Article 104074

Citations: 0

Dual stream fusion link prediction for sparse graph based on variational graph autoencoder and pairwise learning

IF 7.4, Zone 1 (Management), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Information Processing & Management

Pub Date : 2025-01-24 DOI: 10.1016/j.ipm.2025.104073

Xun Li, Hongyun Cai, Chuan Feng, Ao Zhao

Recently, link prediction methods based on graph neural networks have garnered significant attention and achieved great success on large datasets. However, existing methods usually rely on explicit graph structures, which are hard to obtain in sparse graphs. In addition, the incomplete graph data used for model training may lead to a distribution shift between training and testing sets. To address these issues, this paper proposes a novel link prediction method for sparse graphs based on a variational graph autoencoder and pairwise learning. By incorporating noise-perturbation variational autoencoders, the proposed method can enhance robustness during sparse graph training. Instead of relying on explicit graph features, we reconstruct the original adjacency matrix by perturbing the mean encoding or the variance encoding of node features. To mitigate the impact of insufficient topological information, we introduce a pairwise learning scheme, which obtains pairwise edges through negative sampling and iteratively optimizes the positive and negative complementary probability adjacency matrices. Furthermore, we integrate the probability adjacency matrix and node similarity prediction based on message passing networks into a dual-stream framework to predict unknown links. Experimental results on multiple sparse networks demonstrate the superior link prediction performance of our proposed method over baseline approaches. Our method improves AUC by 0.3% to 1.5% and Precision by 1.4% to 5.2% across seven datasets.
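To make the reconstruction step more concrete, the sketch below shows the standard variational graph autoencoder recipe in plain Python: sample a latent vector per node with the reparameterisation trick, then decode edge probabilities with an inner product followed by a sigmoid. The extra Gaussian noise added to the mean (`noise_scale`) is an assumption standing in for the paper's perturbation of the mean encoding; the encoder that would produce `mu` and `log_var` is omitted, and all names here are hypothetical.

```python
import math
import random


def sample_latent(mu, log_var, noise_scale, rng):
    """Sample z = (mu + noise) + sigma * eps for every node.

    `noise_scale` controls an additional perturbation of the mean, which is
    an assumed stand-in for the perturbation described in the abstract.
    """
    z = []
    for mean_row, log_var_row in zip(mu, log_var):
        row = []
        for mean, lv in zip(mean_row, log_var_row):
            noisy_mean = mean + noise_scale * rng.gauss(0.0, 1.0)
            sigma = math.exp(0.5 * lv)
            row.append(noisy_mean + sigma * rng.gauss(0.0, 1.0))
        z.append(row)
    return z


def decode_adjacency(z):
    """Inner-product decoder: P(edge i-j) = sigmoid(z_i . z_j)."""
    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    return [[sigmoid(sum(a * b for a, b in zip(zi, zj))) for zj in z] for zi in z]


rng = random.Random(0)
mu = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.5]]   # toy node means (made up)
log_var = [[-4.0, -4.0]] * 3                 # small variances (made up)
probs = decode_adjacency(sample_latent(mu, log_var, noise_scale=0.05, rng=rng))
```

The decoded matrix is symmetric by construction, which suits undirected link prediction; a directed setting would need a different decoder.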

Volume 62, Issue 3, Article 104073

Citations: 0

Identification of interdisciplinary research patterns based on the functional structures of IMRaD

IF 7.4, Zone 1 (Management), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Information Processing & Management

Pub Date : 2025-01-22 DOI: 10.1016/j.ipm.2025.104063

Xinyi Yang , Lerong Ding , Wei Wang , Jianlin Yang

Interdisciplinary research has emerged as an important approach to tackling complex issues that cut across disciplines. Previous research assessed the interdisciplinarity of a paper without considering differences in functional structures. This study proposes a method to identify interdisciplinary research patterns by measuring the level of interdisciplinarity in research articles across four sections: Introduction, Methods, Results, and Discussion. Using 19,712 articles from Bioinformatics, we found that interdisciplinarity is typically ordered by section as Introduction, Methods, Results, and Discussion. We also identified six patterns, each featuring specific highly interdisciplinary sections: All-round Integration, Multidisciplinary Application Exploration, Multidisciplinary Background Research, Multidisciplinary Approach, Interdisciplinary Analysis, and Non-Interdisciplinary Research. We further investigated the academic value of interdisciplinary research through citation impact and novel insights. Even with low citation counts, the volume of highly interdisciplinary research continues to grow. The topic analysis also demonstrated that different interdisciplinary research patterns prioritize certain aspects to solve the core problems of a research field. Moreover, the research focus of each pattern is consistent with the function of its highly interdisciplinary sections. For example, in protein structure research, the Multidisciplinary Approach pattern prioritizes accurate modelling and techniques, while the Multidisciplinary Application Exploration pattern emphasizes biological applications such as vaccine development. These findings provide management with guidance on how to encourage interdisciplinary research that genuinely contributes to innovation.

Volume 62, Issue 3, Article 104063

Citations: 0

Linguistic patterns in social media content from crisis and non-crisis zones: A case study of Hurricane Ian

IF 7.4, Zone 1 (Management), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Information Processing & Management

Pub Date : 2025-01-20 DOI: 10.1016/j.ipm.2025.104061

Ly Dinh, Steven Walczak

Social media platforms, particularly Twitter, play a vital role in crisis response by delivering real-time information about affected populations. To enhance the accurate detection of crisis-relevant content, this study investigates linguistic distinctions between crisis and non-crisis zones. By analyzing over 263,000 tweets from within and outside the impact zone of Hurricane Ian (2022), we examine normalized word frequency, syntactic categories (nouns, verbs, adjectives, adverbs), sentiment, and user interaction patterns in the tweet networks. Our findings reveal a consistent power-law distribution in the relative differences of word use between crisis and non-crisis zones. Differences in syntactic categories, particularly in adjectives, highlight the crisis zone’s emphasis on the hurricane’s path and impact, while the non-crisis zone’s vocabulary centers on current news topics such as sports, politics, and leisure. Syntactic analyses show that 36% (N = 20,967) of words are used in both crisis and non-crisis zones, 29% (N = 17,168) are unique to the crisis zone, and another 35% (N = 20,101) are unique to the non-crisis zone, highlighting the broader range of topics discussed in the non-crisis zone compared to the crisis zone. Sentiment analysis indicates comparable distributions of neutral words (∼99%), followed by negative words (∼0.4%) and positive words (∼0.4%). However, the use of profanity, indicating strong negative sentiment, occurred 19% more frequently in non-crisis zone tweets than in crisis zone tweets. Network analysis and network modeling show that the crisis zone network is denser and more cohesive, reflecting tight-knit communities during crises, whereas the non-crisis zone network is larger and more fragmented, indicating diverse user engagements. Our study’s contributions include providing insights into the distinctive usage of words in crisis and non-crisis zones, hence facilitating the evaluation of crisis-relevant language patterns. Ultimately, the findings may be used to aid responders in prioritizing urgent tweets originating from a crisis zone.
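The shared and zone-specific vocabulary shares quoted in the abstract follow from simple set arithmetic. The sketch below shows one plausible way to compute such a split from two token lists and then re-derives the reported percentages from the three word counts given above. The helper names and the toy token lists are invented for illustration, and the study's actual tokenisation and normalisation steps are not described in the abstract.

```python
def vocabulary_split(crisis_tokens, non_crisis_tokens):
    """Count distinct words that are shared, crisis-only, or non-crisis-only."""
    crisis_vocab = set(crisis_tokens)
    other_vocab = set(non_crisis_tokens)
    return {
        "shared": len(crisis_vocab & other_vocab),
        "crisis_only": len(crisis_vocab - other_vocab),
        "non_crisis_only": len(other_vocab - crisis_vocab),
    }


def shares_from_counts(counts):
    """Turn the three counts into fractions of the combined vocabulary."""
    total = sum(counts.values())
    return {key: (value / total if total else 0.0) for key, value in counts.items()}


# Toy example with made-up tokens.
toy_counts = vocabulary_split(
    ["storm", "surge", "power", "help"],
    ["game", "power", "vote", "help", "movie"],
)

# Counts as reported in the abstract; the rounded shares come out as 36/29/35.
reported = {"shared": 20967, "crisis_only": 17168, "non_crisis_only": 20101}
reported_pct = {k: round(100 * v) for k, v in shares_from_counts(reported).items()}
```

The three reported counts sum to 58,236 distinct words, which is the denominator behind the rounded percentages.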

Volume 62, Issue 3, Article 104061

Citations: 0

Exploiting diffusion-based structured learning for item interactions representations in multimodal recommender systems

IF 7.4, Zone 1 (Management), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Information Processing & Management

Pub Date : 2025-01-20 DOI: 10.1016/j.ipm.2025.104075

Nikhat Khan, Dilip Singh Sisodia

Multimodal Recommender Systems (MRS) enhance the performance of recommendations by utilizing different item information, such as text, images, and audio. Existing non-graph-based MRS techniques combine embeddings (i.e., ID and multimodal embeddings) but ignore indirect and higher-order interactions. Graph-based MRS approaches use graph sparsification (GS) to construct item graphs and graph convolutional networks (GCNs) for higher-order interactions. However, GS reduces the item graph size, while GCNs ignore specific information due to their predefined weights. Hence, to mitigate these issues, this study proposes a Diffusion-based Structured Learning technique for Multimodal Recommender Systems (DSL-MRS) that improves the latent item graph information flow while maintaining its structure. Additionally, we used a graph attention neural network (GANN) to represent complex higher-order item-item interactions and implemented an attention mechanism to prioritize relevant nodes by assigning weights to neighbours. Also, for optimization, a Weighted Approximate-Rank Pairwise (WARP) loss function has been used to prioritize predictions for observed items over those for unspecified items. To demonstrate the advantage of DSL-MRS, we conducted extensive experiments on three publicly available categories of Amazon datasets. The experimental findings showed that the proposed approach led to an average improvement of 5.8% in R@20, 8.7% in precision@20, 7.8% in NDCG@20, and 8.8% in F-score@20 compared to the baseline model. Ablation studies demonstrate the value and efficacy of DSL-MRS, as removing any of its components degrades performance.
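For readers unfamiliar with the top-k metrics named in the abstract, the following sketch computes precision, recall, F-score, and NDCG at a cutoff k for a single user with binary relevance. These are the common textbook definitions; the paper's exact variants (for example, how the ideal DCG is normalised) may differ, and the function names and toy data are assumptions made for this example.

```python
import math


def precision_recall_at_k(ranked, relevant, k):
    """Precision@k and Recall@k for one ranked list and a set of relevant items."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def f_score(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def ndcg_at_k(ranked, relevant, k):
    """NDCG@k with binary gains and a log2 position discount."""
    dcg = sum(
        1.0 / math.log2(position + 2)
        for position, item in enumerate(ranked[:k])
        if item in relevant
    )
    ideal = sum(1.0 / math.log2(p + 2) for p in range(min(k, len(relevant))))
    return dcg / ideal if ideal else 0.0


ranked_items = ["a", "b", "c", "d"]   # toy recommendation list
relevant_items = {"a", "c"}           # toy ground truth
p, r = precision_recall_at_k(ranked_items, relevant_items, k=2)
f = f_score(p, r)
ndcg = ndcg_at_k(ranked_items, relevant_items, k=2)
```

Reported figures such as R@20 are usually these per-user values averaged over all test users.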

Volume 62, Issue 3, Article 104075

Citations: 0

Amplifying commonsense knowledge via bi-directional relation integrated graph-based contrastive pre-training from large language models

IF 7.4, Zone 1 (Management), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Information Processing & Management

Pub Date : 2025-01-20 DOI: 10.1016/j.ipm.2025.104068

Liu Yu, Fenghui Tian, Ping Kuang, Fan Zhou

Commonsense knowledge graph acquisition (CKGA) is vital in numerous knowledge-intensive applications such as question answering and knowledge reasoning. Conventional CKGA methods rely on node-level and unidirectional relations, which leaves them with only a shallow grasp of entities and relations. Moreover, they demand expensive, labor-intensive human annotations, and the resulting commonsense knowledge (CK) lacks diversity and quality. Existing commonsense knowledge bases such as ConceptNet or ATOMIC often suffer from significant scarcity, which makes it hard to meet the high demand for a vast amount of commonsense information. Given the recent momentum of large language models (LLMs), there is growing interest in leveraging them to overcome the above challenges.

In this study, we propose a new paradigm to amplify commonsense knowledge via bi-directional relation integrated graph-based contrastive pre-training (BRIGHT) from the newest foundation models. BRIGHT is an integral, closed-loop framework composed of corpora construction, further contrastive pre-training, task-driven instruction tuning, a filtering strategy, and an evaluation system. The key idea of BRIGHT is to leverage reverse relations to create a symmetric graph and transform the bi-directional relations into sentence-level ones. The reverse sentences are treated as positive examples for the forward sentences, and three types of negatives are introduced to ensure efficient contrastive learning, which mitigates the “reversal curse” issue, as evidenced in experiments. Empirical results demonstrate that BRIGHT is able to generate novel knowledge (up to 397K) of high quality, with a GPT-4 acceptance rate of up to 90.51% (ATOMIC) and 85.59% (ConceptNet) accuracy at top 1, which approaches human performance for these resources. BRIGHT is publicly available at https://github.com/GreyHuu/BRIGHT/tree/main.
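The idea of treating a reverse sentence as the positive for its forward sentence can be illustrated with a generic InfoNCE-style contrastive loss. The sketch below is not BRIGHT's training objective: the abstract does not give the loss, the three negative types, or the encoder, so the cosine similarity, the temperature value, and the toy two-dimensional embeddings are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity, defined as 0 when either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def info_nce(anchor, positive, negatives, temperature=0.1):
    """-log softmax probability of the positive among positive + negatives."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[0]


forward = [1.0, 0.0]    # embedding of a forward sentence (made up)
reverse = [1.0, 0.0]    # embedding of its reverse sentence, the positive
unrelated = [0.0, 1.0]  # embedding of one negative sentence

low_loss = info_nce(forward, reverse, [unrelated])
high_loss = info_nce(forward, unrelated, [reverse])
```

The loss is small when the forward and reverse embeddings agree and the negative is dissimilar, and large in the swapped case, which is the behaviour a contrastive objective is meant to reward.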

Volume 62, Issue 3, Article 104068

Citations: 0

Beyond expression: Comprehensive visualization of knowledge triplet facts

IF 7.4, Zone 1 (Management), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Information Processing & Management

Pub Date : 2025-01-18 DOI: 10.1016/j.ipm.2025.104062

Wei Liu , Yixue He , Chao Wang , Shaorong Xie , Weimin Li

Multi-modal Knowledge Graphs (KGs) enhance traditional KGs by incorporating multi-modal data to bridge the information gap in natural language processing (NLP) tasks. One direct method to incorporate multi-modal data is to associate structured KG with corresponding image modalities, thereby visualizing entities and triplet facts. However, existing visualization methods for triplet facts often exclude triplet facts containing abstract entities and non-visual relations, resulting in their disassociation from corresponding image modalities. This exclusion compromises the completeness and utility of multi-modal KGs. In this paper, we aim to construct a comprehensive multi-modal KG that includes abstract entities and non-visual relations, ensuring complete visualization of every triplet fact. To achieve this purpose, we propose a method for the integration of image Retrieval-Generation-Editing (RGE) to completely and accurately visualize each triplet fact. Initially, we correct the triplet facts by integrating a Large Language Model (LLM) with a retrieved knowledge database about triplet facts. Subsequently, by providing appropriate contextual examples to the LLM, we generate visual elements of relations, enriching the semantics of the triplet facts. We then employ image retrieval to obtain images that reflect the semantics of each triplet fact. For those triplet facts for which images cannot be directly retrieved, we utilize image generation and editing to create and modify images that can express the semantics of the triplet facts. Through the RGE method, we construct a multi-modal KG named DB15kFact, which includes 86,722 triplet facts, 274 relations, 12,842 entities, and 387,096 images. The construction of DB15kFact has resulted in a fourfold increase in the number of relations compared to the previous multi-modal KG, ImgFact. In experiments, both automatic and manual evaluations confirm the quality of DB15kFact. 
The results demonstrate that DB15kFact significantly enhances model performance in link prediction and relation classification. Notably, in link prediction, the model optimized with DB15kFact achieves a 7.12% improvement in the H@10 metric compared to existing solutions.
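H@10 (Hits@10) is the fraction of test queries for which the correct entity appears among the model's ten highest-ranked candidates. A minimal sketch of that computation, assuming 1-based ranks of the correct entity have already been obtained, is shown below; the function name and the sample ranks are made up for illustration.

```python
def hits_at_k(ranks, k=10):
    """Fraction of queries whose correct answer is ranked within the top k.

    `ranks` holds the 1-based rank of the correct entity for each query.
    """
    if not ranks:
        return 0.0
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


sample_ranks = [1, 3, 12, 50]          # toy ranks for four test triplets
score = hits_at_k(sample_ranks, k=10)  # two of the four ranks are <= 10
```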

Volume 62, Issue 3, Article 104062

Citations: 0

Historical facts learning from Long-Short Terms with Language Model for Temporal Knowledge Graph Reasoning

IF 7.4 Zone 1 (Management) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Information Processing & Management

Pub Date : 2025-01-17 DOI: 10.1016/j.ipm.2024.104047

Wenjie Xu , Ben Liu , Miao Peng , Zihao Jiang , Xu Jia , Kai Liu , Lei Liu , Min Peng

Temporal Knowledge Graph Reasoning (TKGR) aims to infer the missing parts of temporal knowledge graphs (TKGs) based on historical facts from different time periods. Traditional GCN-based TKGR models depend on structured relations between entities. To utilize the rich linguistic information in TKGs, some models have focused on applying pre-trained language models (PLMs) to TKGR. However, previous PLM-based models still face several issues: (1) they do not mine the associations among relations; (2) they do not differentiate the impact of historical facts from different time periods; and (3) they introduce external knowledge to enhance performance without fully utilizing the inherent reasoning capabilities of PLMs. To address these issues, we propose HFL: Historical Facts Learning from Long-Short Terms with Language Model for TKGR. First, we construct time tokens for different types of time intervals to make use of timestamps, and input the historical facts relevant to the query into the PLM to learn the associations among relations. Second, we adopt a multi-perspective sampling strategy to learn from different time periods, and use the original text information in TKGs, or even no text information, to learn reasoning abilities without any external knowledge. Finally, we evaluate HFL on four TKGR benchmarks, and the experimental results demonstrate that HFL is highly competitive with both graph-based and PLM-based models. Additionally, we design a variant that applies HFL to large language models (LLMs) and evaluate the performance of different LLMs.


Citations: 0

User identification network with contrastive clustering for shared-account recommendation

IF 7.4 Zone 1 (Management) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Information Processing & Management

Pub Date : 2025-01-15 DOI: 10.1016/j.ipm.2024.104055

Xinhua Wang , Houping Yue , Lei Guo , Feng Guo , Chen He , Xiaohui Han

Shared-Account Recommendation (SAR) aims to accurately identify and accommodate the varied preferences of multiple users sharing a single account by analyzing their aggregated interactions. Preference identification is challenging when multiple users share an account. Existing Shared-Account Modeling (SAM) methods assume overly simplistic conditions and overlook the robustness of representations, leading to inaccurate embeddings that are susceptible to disturbances. To address these limitations, we introduce the Contrastive Clustering User Identification Network (CCUI-Net) framework. This framework employs graph-based transformations and node representation learning to refine user embeddings, utilizes hierarchical contrastive clustering for improved user identification and robustness against data noise, and leverages an attention mechanism to dynamically balance contributions from various users. These innovations significantly boost the precision and reliability of recommendations. Experimental results across four domains from the HVIDEO and HAMAZON datasets (E-domain and V-domain in HVIDEO, M-domain and B-domain in HAMAZON) demonstrate that CCUI-Net outperforms many existing methods on the metrics MRR@5, MRR@20, Recall@5, and Recall@20. Specifically, the improvements in the M-domain and B-domain for Recall@5 and Recall@20 are 14.64%, 8.55%, 18.67%, and 9.59% respectively.
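
As a reading aid for the MRR@k and Recall@k metrics named in the abstract above, here is a minimal sketch of their usual definitions in next-item recommendation. It assumes a single ground-truth item per test case, which is a common setup but is not confirmed by the abstract; the function names and the toy data are hypothetical.

```python
from typing import Callable, Hashable, Sequence


def recall_at_k(ranked: Sequence[Hashable], target: Hashable, k: int) -> float:
    """1.0 if the target item appears among the top-k ranked items, else 0.0."""
    return 1.0 if target in ranked[:k] else 0.0


def mrr_at_k(ranked: Sequence[Hashable], target: Hashable, k: int) -> float:
    """Reciprocal of the target's 1-based rank if it is in the top-k, else 0.0."""
    top_k = list(ranked[:k])
    return 1.0 / (top_k.index(target) + 1) if target in top_k else 0.0


def mean_metric(
    metric: Callable[[Sequence[Hashable], Hashable, int], float],
    predictions: Sequence[Sequence[Hashable]],
    targets: Sequence[Hashable],
    k: int,
) -> float:
    """Average a per-case metric over all test cases."""
    scores = [metric(ranked, target, k) for ranked, target in zip(predictions, targets)]
    return sum(scores) / len(scores)


# Toy ranked lists and ground-truth items (illustrative only, not real results).
predictions = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
targets = ["b", "d", "z"]

print(mean_metric(recall_at_k, predictions, targets, k=2))  # hits in 2 of 3 cases
print(mean_metric(mrr_at_k, predictions, targets, k=2))     # (1/2 + 1 + 0) / 3
```

Under this single-target assumption, Recall@k coincides with the hit rate, while MRR@k additionally rewards placing the correct item nearer the top of the list.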


Citations: 0

A unified framework for multi-modal rumor detection via multi-level dynamic interaction with evolving stances

IF 7.4 Zone 1 (Management) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Information Processing & Management

Pub Date : 2025-01-14 DOI: 10.1016/j.ipm.2025.104066

Tiening Sun, Chengwei Liu, Lizhi Chen, Zhong Qian, Peifeng Li, Qiaoming Zhu

With the escalating dissemination of textual and visual content on the Internet, multi-modal rumor detection has garnered significant research attention. Prevailing methods tend to emphasize integrating information from source posts and images, overlooking the dynamic interaction between multi-modal sources and evolving conversational structures. Furthermore, they fail to recognize that introducing evolving user stances as a form of collective decision-making can improve the model’s performance in rumor classification. In this paper, we propose a novel Evolving Stance-aware Dynamic Graph Fusion Network (ESDGFN) to address these issues. This network aims to integrate the source, the image and the dynamic conversation graph into a unified framework. Specifically, we begin by leveraging a cross-modal transformer for fine-grained feature fusion of the multi-modal sources. Simultaneously, based on the temporal attributes of posts, we construct a set of dynamically changing conversation graphs for each conversation thread, simulating and encoding the evolving stances of users towards the target event within these conversation graphs. Subsequently, we design a multi-level fusion strategy, incorporating both coarse-grained multi-modal feature guidance and fine-grained cross-modal similarity-aware fusion. This strategy aims to generate interactively enhanced multi-modal encoding and dynamic graph representations. Experimental results on the PHEME and Twitter datasets demonstrate the effectiveness of ESDGFN: it achieves 90.6% accuracy on PHEME, a 3.3% improvement over the state-of-the-art method, and 87% accuracy on Twitter, a 2.4% improvement.


Citations: 0