Improving generalization in DNNs through enhanced orthogonality in momentum-based optimizers
Pub Date: 2025-03-06 | DOI: 10.1016/j.ipm.2025.104109
Zhixing Lu, Yuanyuan Sun, Zhihao Yang, Yuanyu Zhang, Paerhati Tulajiang, Haochen Sun, Hongfei Lin
Momentum is a widely adopted technique in deep neural network (DNN) optimization, recognized for enhancing performance. However, our analysis indicates that momentum is not always beneficial. We theoretically demonstrate that increasing the orthogonality of parameter vectors significantly improves the generalization ability of some common types of DNNs, namely multilayer perceptrons (MLPs), convolutional neural networks (CNNs), and Transformers, while momentum tends to reduce this orthogonality. Our results further show that integrating normalization and residual connections into common DNNs helps preserve orthogonality, thereby enhancing the generalization of networks optimized with momentum. Extensive experiments across MLPs, CNNs, and Transformers validate our theoretical findings. Finally, we find that the parameter vectors of common pre-trained language models (PLMs) all maintain good orthogonality.
{"title":"Improving generalization in DNNs through enhanced orthogonality in momentum-based optimizers","authors":"Zhixing Lu, Yuanyuan Sun, Zhihao Yang, Yuanyu Zhang, Paerhati Tulajiang, Haochen Sun, Hongfei Lin","doi":"10.1016/j.ipm.2025.104109","DOIUrl":"10.1016/j.ipm.2025.104109","url":null,"abstract":"<div><div>Momentum is a widely adopted technique in the deep neural network (DNN) optimization, recognized for enhancing performance. However, our analysis indicates that momentum is not always beneficial for the network. We theoretically demonstrate that increasing the orthogonality of parameter vectors significantly improves the generalization ability of some common types of DNNs, while momentum tends to reduce this orthogonality. Common DNNs include multilayer perceptrons (MLPs) convolutional neural networks (CNN), and Transformers. Our results further show that integrating normalization and residual connections into commonDNNs helps preserve orthogonality, thereby enhancing the generalization of networks optimized with momentum. Extensive experiments across MLPs, CNNs and Transformers validate our theoretical findings. Finally, we find that the parameter vectors of commonly pre-trained language models (PLMs) all maintain a better orthogonality.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 4","pages":"Article 104109"},"PeriodicalIF":7.4,"publicationDate":"2025-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143550589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Estimating the quality of published medical research with ChatGPT
Pub Date: 2025-03-06 | DOI: 10.1016/j.ipm.2025.104123
Mike Thelwall, Xiaorui Jiang, Peter A. Bath
Estimating the quality of published research is important for evaluations of departments, researchers, and job candidates. Citation-based indicators sometimes support these tasks, but they do not work for new articles and have low or moderate accuracy. Previous research has shown that ChatGPT can estimate the quality of research articles, with its scores correlating positively with a proxy for expert scores in all fields, often more strongly than citation-based indicators, except in clinical medicine. ChatGPT scores may therefore replace citation-based indicators for some applications. This article investigates the clinical medicine anomaly with the largest dataset yet and a more detailed analysis. The results showed that ChatGPT 4o-mini scores for articles submitted to the UK's Research Excellence Framework (REF) 2021 Unit of Assessment (UoA) 1 Clinical Medicine correlated positively (r = 0.134, n = 9872) with departmental mean REF scores, against a theoretical maximum correlation of r = 0.226. ChatGPT 4o and 3.5 turbo also gave positive correlations. At the departmental level, mean ChatGPT scores correlated more strongly with departmental mean REF scores (r = 0.395, n = 31). For the 100 journals with the most articles in UoA 1, their mean ChatGPT score correlated strongly with their departmental mean REF score (r = 0.495) but negatively with their citation rate (r = -0.148). Journal and departmental anomalies in these results point to ChatGPT being ineffective at assessing the quality of research in prestigious medical journals, of research directly affecting human health, or both. Nevertheless, the results give evidence of ChatGPT's ability to assess research quality overall for Clinical Medicine, where it might replace citation-based indicators for new research.
{"title":"Estimating the quality of published medical research with ChatGPT","authors":"Mike Thelwall , Xiaorui Jiang , Peter A. Bath","doi":"10.1016/j.ipm.2025.104123","DOIUrl":"10.1016/j.ipm.2025.104123","url":null,"abstract":"<div><div>Estimating the quality of published research is important for evaluations of departments, researchers, and job candidates. Citation-based indicators sometimes support these tasks, but do not work for new articles and have low or moderate accuracy. Previous research has shown that ChatGPT can estimate the quality of research articles, with its scores correlating positively with an expert scores proxy in all fields, and often more strongly than citation-based indicators, except for clinical medicine. ChatGPT scores may therefore replace citation-based indicators for some applications. This article investigates the clinical medicine anomaly with the largest dataset yet and a more detailed analysis. The results showed that ChatGPT 4o-mini scores for articles submitted to the UK's Research Excellence Framework (REF) 2021 Unit of Assessment (UoA) 1 Clinical Medicine correlated positively (<em>r</em> = 0.134, <em>n</em> = 9872) with departmental mean REF scores, against a theoretical maximum correlation of <em>r</em> = 0.226. ChatGPT 4o and 3.5 turbo also gave positive correlations. At the departmental level, mean ChatGPT scores correlated more strongly with departmental mean REF scores (<em>r</em> = 0.395, <em>n</em> = 31). For the 100 journals with the most articles in UoA 1, their mean ChatGPT score correlated strongly with their departmental mean REF score (<em>r</em> = 0.495) but negatively with their citation rate (<em>r</em>=-0.148). Journal and departmental anomalies in these results point to ChatGPT being ineffective at assessing the quality of research in prestigious medical journals or research directly affecting human health, or both. Nevertheless, the results give evidence of ChatGPT's ability to assess research quality overall for Clinical Medicine, where it might replace citation-based indicators for new research.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 4","pages":"Article 104123"},"PeriodicalIF":7.4,"publicationDate":"2025-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143563432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A robust rating aggregation method based on temporal coupled bipartite network
Pub Date: 2025-02-28 | DOI: 10.1016/j.ipm.2025.104105
Huan Zhu, Yu Xiao, Dongmei Chen, Jun Wu
As ratings express preferences in online or offline evaluation tasks, aggregating the diverse ratings provided by raters is an essential process for thoroughly assessing the quality of an object, which can aid decision-making and recommendation. Eliminating the impact of rating distortion on certain objects has attracted significant attention from researchers seeking to design robust rating aggregation methods. However, existing methods struggle with massive distorting ratings, which usually emerge within specific temporal ranges, a phenomenon known as temporal burstiness. We therefore propose a novel robust rating aggregation method based on a temporal coupled bipartite network, which effectively models the segmentation of ratings to deal with this burstiness. Experimental results and analyses indicate that our method is more robust than state-of-the-art methods, particularly in handling significant disturbances occurring within specific temporal intervals. This approach holds potential for application in real-time rating platforms.
{"title":"A robust rating aggregation method based on temporal coupled bipartite network","authors":"Huan Zhu , Yu Xiao , Dongmei Chen , Jun Wu","doi":"10.1016/j.ipm.2025.104105","DOIUrl":"10.1016/j.ipm.2025.104105","url":null,"abstract":"<div><div>As rating expresses preferences in online or offline evaluation tasks, aggregating diverse ratings provided by raters is an essential process for thoroughly assessing the quality of an object, which can aid in decision-making and recommendation. Eliminating the impact of rating distortion on certain objects has attracted significant attention from researchers to design robust rating aggregation methods. However, existing methods are constrained by massive distorting ratings, which usually emerge mainly in specific temporal ranges, namely temporal burstiness. Therefore, we propose a novel robust rating aggregation method based on a temporal coupled bipartite network, which can effectively model the segmentation of ratings to deal with the burstiness. Experimental results and analyses indicate that our method exhibits greater robustness than state-of-the-art methods, particularly in handling significant disturbances occurring within specific temporal intervals. This novel approach holds potential for application in real-time rating platforms.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 4","pages":"Article 104105"},"PeriodicalIF":7.4,"publicationDate":"2025-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143511478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
QFAS-KE: Query focused answer summarization using keyword extraction
Pub Date: 2025-02-26 | DOI: 10.1016/j.ipm.2025.104104
Rupali Goyal, Parteek Kumar, V.P. Singh
Question answering (QA) portals like Quora, Stack Overflow, AskUbuntu, Yahoo! Answers, Reddit, and Wiki Answers have emerged as hubs of curiosity, highlighting the rising demand for easily accessible information and drawing hundreds of millions of questions. Efficient utilization of these questions and their associated answers has become vital for these QA websites. Similarity-based information retrieval methods provide a ranked list of potentially relevant questions, and users have to spend significant time sifting through the results to discover the best answer. This paper aims to provide a precise, comprehensive, summarized answer to the user's query, using extracted keywords that offer valuable insights into relevant content. The work presents a Query focused Answer Summarization framework using Keyword Extraction (QFAS-KE). It is a four-stage framework comprising query question pre-processing, semantic question search (using SBERT and a FAISS vector database), answer retrieval and re-ranking (using a BERT-based bi-encoder and cross-encoder), and answer summary generation (using fine-tuned transformers such as BART, PEGASUS, and T5) with keyword guidance (using a keyword extractor such as KeyBERT). The results demonstrate the efficacy of the proposed framework on task-specific datasets (CNN/DailyMail and MS-MARCO) under the ROUGE metric. The model outperformed existing baseline models on the CNN/DailyMail dataset with ROUGE-1 scores of 47.5 (PEGASUS), 46.2 (BART), and 45.1 (T5), and on the MS-MARCO dataset with ROUGE-L scores of 75.18 (PEGASUS), 79.02 (BART), and 74.69 (T5).
{"title":"QFAS-KE: Query focused answer summarization using keyword extraction","authors":"Rupali Goyal , Parteek Kumar , V.P. Singh","doi":"10.1016/j.ipm.2025.104104","DOIUrl":"10.1016/j.ipm.2025.104104","url":null,"abstract":"<div><div>Question answering (QA) portals like Quora, Stack Overflow, AskUbuntu, Yahoo! Answers, Reddit, and Wiki Answers have emerged as hubs of curiosity, highlighting the rising demands for easily accessible information and are drawing focus to hundreds of millions of questions. The efficient utilization of these questions and associated answers has become significantly vital for these QA websites. The similarity-based information retrieval methods provide a ranked list of potentially relevant questions, and the users have to spend significant time sifting through the results to discover the best answer. This paper aims to provide a precise, comprehensive, summarized answer to the user asked query using extracted keywords that offer valuable insights into relevant content. The research work presents a Query focused Answer Summarization framework using Keyword Extraction (QFAS-KE). It is a four-stage framework, including query question pre-processing, semantic question search (utilizing SBERT and FAISS vector database), answer retrieval and re-ranking (utilizing BERT-based bi-encoder and cross-encoder), and answer summary generation (using fine-tuned transformers such as BART, PEGASUS, T5) with keyword guidance (using a keyword extractor such as KeyBERT). The results conceptualize the efficacy of the proposed framework on task-specific datasets (CNN/DailyMail and MS-MARCO) over the ROUGE metric. The model outperformed existing baseline models on CNN/DailyMail dataset with a value of 47.5 (PEGASUS), 46.2 (BART), and 45.1 (T5) in terms of ROUGE-1 and on MS-MARCO dataset with a value of 75.18 (PEGASUS), 79.02 (BART), and 74.69 (T5) in terms of ROUGE-L.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 4","pages":"Article 104104"},"PeriodicalIF":7.4,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143487693","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Effectively detecting and diagnosing distributed multivariate time series anomalies via Unsupervised Federated Hypernetwork
Pub Date: 2025-02-24 | DOI: 10.1016/j.ipm.2025.104107
Junfeng Hao, Peng Chen, Juan Chen, Xi Li
Distributed multivariate time series anomaly detection is widely used in industrial equipment monitoring, financial risk management, and smart cities. Although federated learning (FL) has garnered significant interest and achieved decent performance in various scenarios, most existing FL-based distributed anomaly detection methods still face challenges, including inadequate detection performance of the global model, insufficient extraction of essential features caused by the fragmentation of local time series, and a lack of practical anomaly localization. To address these challenges, we propose an Unsupervised Federated Hypernetwork Method for Distributed Multivariate Time Series Anomaly Detection and Diagnosis (uFedHy-DisMTSADD). Specifically, we introduce a federated hypernetwork architecture that effectively mitigates the heterogeneity and fluctuations of distributed environments while protecting client data privacy. We then adopt the Series Conversion Normalization Transformer (SC Nor-Transformer), which uses series conversion to tackle the timing bias caused by model aggregation, while series normalization improves the capture of temporal dependencies across subsequences. Finally, uFedHy-DisMTSADD localizes the root cause of anomalies by reconstructing the anomaly scores obtained from each subsequence. In an extensive evaluation on nine datasets, uFedHy-DisMTSADD outperformed existing state-of-the-art baselines by 9.19% in average F1 score and by 2.41% in average AUROC. Moreover, its average fault localization accuracy is 9.23% higher than that of the best baseline method. Code is available at https://github.com/Hjfyoyo/uFedHy-DisMTSADD.
{"title":"Effectively detecting and diagnosing distributed multivariate time series anomalies via Unsupervised Federated Hypernetwork","authors":"Junfeng Hao, Peng Chen, Juan Chen, Xi Li","doi":"10.1016/j.ipm.2025.104107","DOIUrl":"10.1016/j.ipm.2025.104107","url":null,"abstract":"<div><div>Distributed multivariate time series anomaly detection is widely-used in industrial equipment monitoring, financial risk management, and smart cities. Although Federated learning (FL) has garnered significant interest and achieved decent performance in various scenarios, most existing FL-based distributed anomaly detection methods still face challenges including: inadequate detection performance in global model, insufficient essential features extraction caused by the fragmentation of local time series, and lack for practical anomaly localization. To address these challenges, we propose an Unsupervised Federated Hypernetwork Method for Distributed Multivariate Time Series Anomaly Detection and Diagnosis (uFedHy-DisMTSADD). Specifically, we introduce a federated hypernetwork architecture that effectively mitigates the heterogeneity and fluctuations in distributed environments while protecting client data privacy. Then, we adopt the Series Conversion Normalization Transformer (SC Nor-Transformer) to tackle the timing bias due to model aggregation through series conversion. Series normalization improves the temporal dependence of capturing subsequences. Finally, uFedHy-DisMTSADD simultaneously localizes the root cause of the anomaly by reconstructing the anomaly scores obtained from each subsequence. We performed an extensive evaluation on nine datasets, in which uFedHy-DisMTSADD outperformed the existing state-of-the-art baseline average F1 score by 9.19% and the average AUROC by 2.41%. Moreover, the average localization fault accuracy of uFedHy-DisMTSADD is 9.23% higher than that of the optimal baseline method. Code is available at this repository:<span><span>https://github.com/Hjfyoyo/uFedHy-DisMTSADD</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 4","pages":"Article 104107"},"PeriodicalIF":7.4,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143474787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mitigating collusive manipulation of reviews in e-commerce platforms: Evolutionary game and strategy simulation
Pub Date: 2025-02-24 | DOI: 10.1016/j.ipm.2025.104080
Xiaoxia Xu, Ruguo Fan, Dongxue Wang, Xiao Xie, Kang Du
Growing review manipulation has seriously hampered credit regulation on e-commerce platforms, yet few studies have explored its complex dynamics. Unlike current research, which centers on management strategies aimed at merchants alone, this study examines collusion between merchants and consumers. By integrating evolutionary game theory with a system dynamics approach, it offers meaningful conclusions for platform credit management. First, our findings indicate that merchants can maintain honesty regardless of the regulatory strategy implemented: under positive regulation, platforms can impose higher penalties; under negative regulation, maintaining lower exposure is feasible. Second, our analysis illustrates the necessity of breaking the collusion between merchants and consumers. Under positive regulation, platforms can amplify penalties or enhance the regulatory impact on platform revenues; conversely, negative regulation allows reducing the short-term financial impact of reviews or adjusting cashback. Third, we find that dynamic punishment strategies are not always optimal. In some cases, static punishment strategies outperform linear dynamic punishment strategies, highlighting the importance of carefully evaluating the effectiveness of different regulatory approaches in various contexts.
{"title":"Mitigating collusive manipulation of reviews in e-commerce platforms: Evolutionary game and strategy simulation","authors":"Xiaoxia Xu, Ruguo Fan, Dongxue Wang, Xiao Xie, Kang Du","doi":"10.1016/j.ipm.2025.104080","DOIUrl":"10.1016/j.ipm.2025.104080","url":null,"abstract":"<div><div>The growing review manipulation has seriously hampered credit regulation on e-commerce platforms, yet few studies have explored its complex dynamics. Unlike current research centering on merchants creating various management strategies, this study examines the collusion between merchants and consumers. By integrating evolutionary game theory and a system dynamics approach, this study offers meaningful conclusions for platform credit management. First, our findings indicate that merchants can maintain honesty regardless of the regulatory strategy implemented. For positive regulation, platforms can impose higher penalties; for negative regulation, maintaining lower exposure is feasible. Second, our analysis illustrates the necessity of breaking the collusion between merchants and consumers. Under positive regulation, platforms can amplify penalties or enhance the regulatory impact on platform revenues. Conversely, negative regulation allows for reducing the short-term financial impact of reviews or adjusting cashback. Third, we uncover that dynamic punishment strategies are not always optimal. In some cases, static punishment strategies outperform linear dynamic punishment strategies, highlighting the importance of carefully evaluating the effectiveness of different regulatory approaches in various contexts.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 4","pages":"Article 104080"},"PeriodicalIF":7.4,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143474786","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Expert-level policy style measurement via knowledge distillation with large language model collaboration
Pub Date: 2025-02-22 | DOI: 10.1016/j.ipm.2025.104090
Yujie Zhang, Biao Huang, Weikang Yuan, Zhuoren Jiang, Longsheng Peng, Shuai Chen, Jie-Sheng Tan-Soo
Policy style is a crucial concept in policy science that reflects persistent patterns in the policy process across different governance settings. Despite its importance, policy style measurement faces issues of complexity, subjectivity, data sparseness, and computational cost. To overcome these obstacles, we propose KOALA, a novel KnOwledge distillation framework based on large lAnguage modeL collAboration. It recasts the weak direct-scoring ability of LLMs as a pairwise ranking problem, employs a small set of expert-annotated samples for non-parametric learning, and utilizes knowledge distillation to transfer insights from LLMs to a smaller, more efficient model. The framework incorporates multiple LLM-based agents (Prompter, Ranker, and Analyst) collaborating to comprehend complex measurement standards and self-explain policy style definitions. We validate KOALA on 4,572 Chinese government work reports (1954–2019) from central, provincial, and municipal levels, with a focus on the imposition dimension of policy style. Extensive experiments demonstrate KOALA's effectiveness in measuring the intensity of policy style, highlighting its superiority over state-of-the-art methods. While GPT-4 achieves only 66% accuracy in pairwise ranking of policy styles, KOALA, despite being based on GPT-3.5, achieves a remarkable 85% accuracy, a significant improvement. This framework offers a transferable approach for quantifying complex social science concepts in textual data, bridging computational techniques with social science research.
{"title":"Expert-level policy style measurement via knowledge distillation with large language model collaboration","authors":"Yujie Zhang , Biao Huang , Weikang Yuan , Zhuoren Jiang , Longsheng Peng , Shuai Chen , Jie-Sheng Tan-Soo","doi":"10.1016/j.ipm.2025.104090","DOIUrl":"10.1016/j.ipm.2025.104090","url":null,"abstract":"<div><div>Policy style is a crucial concept in policy science that reflects persistent patterns in the policy process across different governance settings. Despite its importance, policy style measurement faces issues of complexity, subjectivity, data sparseness, and computational cost. To overcome these obstacles, we propose <strong>KOALA</strong>, a novel <strong><u>K</u></strong>n<strong><u>O</u></strong>wledge distillation framework based on large l<strong><u>A</u></strong>nguage mode<strong><u>L</u></strong> coll<strong><u>A</u></strong>boration. It transforms the weak scoring abilities of LLMs into a pairwise ranking problem, employs a small set of expert-annotated samples for non-parametric learning, and utilizes knowledge distillation to transfer insights from LLMs to a smaller, more efficient model. The framework incorporates multiple LLM-based agents (Prompter, Ranker, and Analyst) collaborating to comprehend complex measurement standards and self-explain policy style definitions. We validate KOALA on 4,572 Chinese government work reports (1954–2019) from central, provincial, and municipal levels, with a focus on the imposition dimension of policy style. Extensive experiments demonstrate KOALA’s effectiveness in measuring the intensity of policy style, highlighting its superiority over state-of-the-art methods. While GPT-4 achieves only 66% accuracy in pairwise ranking of policy styles, KOALA, despite being based on GPT-3.5, achieves a remarkable 85% accuracy, highlighting significant performance improvement. This framework offers a transferable approach for quantifying complex social science concepts in textual data, bridging computational techniques with social science research.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 4","pages":"Article 104090"},"PeriodicalIF":7.4,"publicationDate":"2025-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143471177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Beyond boundaries: Exploring the interaction between science and technology in fusion knowledge communities
Pub Date: 2025-02-21 | DOI: 10.1016/j.ipm.2025.104102
Jiajie Wang, Wanfang Hou, Yue Li, Jianjun Sun, Lele Kang
Interaction between science and technology (S&T) is a vital mechanism for generating significant innovative breakthroughs. Prior studies have utilized indicators such as semantic similarity or citation analysis to measure relationships between scientific communities and technological communities, represented by papers and patents respectively. However, shifts in innovation paradigms have progressively blurred the boundaries between S&T, leading to the formation of fusion knowledge communities (FKCs) that encompass both scientific and technological knowledge. This study therefore proposes a novel approach to exploring S&T interaction within FKCs. To achieve this, we integrate semantic and citation information by combining BERT and Graph Auto-Encoder algorithms, and employ the Louvain algorithm for FKC detection. We then conduct a two-step analysis. First, we quantify the strength of S&T interactions over different periods by defining an interaction intensity metric based on keyword coupling, and assess knowledge depth. Second, we analyze the evolution of S&T interactions by measuring knowledge transfer, transmission direction, and degree, which involves computing knowledge similarity between papers and patents and constructing citation networks to highlight key transfer channels over time. We apply this approach to the field of Genetically Engineered Vaccines (GEV), analyzing 1,937 patents and 4,393 papers from 1980 to 2020. The results demonstrate that our method effectively reveals the fusion knowledge community structures between S&T and provides a detailed analysis of interaction patterns and their evolution within FKCs. This study advances the methodology for exploring S&T interactions within FKCs, offering a fine-grained analytical perspective for innovation management research.
{"title":"Beyond boundaries: Exploring the interaction between science and technology in fusion knowledge communities","authors":"Jiajie Wang , Wanfang Hou , Yue Li , Jianjun Sun , Lele Kang","doi":"10.1016/j.ipm.2025.104102","DOIUrl":"10.1016/j.ipm.2025.104102","url":null,"abstract":"<div><div>Interaction between science and technology (S&T) is a vital mechanism for generating significant innovative breakthroughs. Prior studies have utilized indicators such as semantic similarity or citation analysis to measure the relationships between scientific communities and technological communities represented by papers and patents. However, shifts in innovation paradigms have progressively blurred the boundaries between S&T, leading to the formation of fusion knowledge communities (FKCs) that encompass both scientific and technological knowledge. Therefore, this study proposes a novel approach to exploring the S&T interaction within FKCs. To achieve this, we integrate semantic and citation information by combining BERT and Graph Auto-Encoder algorithms, and employ the Louvain algorithm for FKCs detection. We then conduct a two-step analysis. First, we quantify the strength of S&T interactions over different periods by defining an interaction intensity metric based on the coupling of keywords, and assess the knowledge depth. Second, we analyze the evolution of S&T interactions by measuring knowledge transfer, transmission direction, and degree, which involves computing knowledge similarity between papers and patents and constructing citation networks to highlight key transfer channels over time. We apply this approach to the field of Genetically Engineered Vaccines (GEV), analyzing 1,937 patents and 4,393 papers from 1980 to 2020. The results demonstrate that our method effectively reveals the fusion knowledge community structures between S&T and provides a detailed analysis of interaction patterns and their evolution within FKCs. This study advances the methodology for exploring S&T interactions within FKCs, offering a fine-grained analytical perspective for innovation management research.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 4","pages":"Article 104102"},"PeriodicalIF":7.4,"publicationDate":"2025-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143454403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GNN-transformer contrastive learning explores homophily
Pub Date: 2025-02-20 | DOI: 10.1016/j.ipm.2025.104103
Yangding Li, Yangyang Zeng, Xiangchao Zhao, Jiawei Chai, Hao Feng, Shaobin Fu, Cui Ye, Shichao Zhang
Graph Contrastive Learning (GCL) leverages graph structure and node feature information to learn powerful node representations in a self-supervised manner, attracting significant attention from researchers. Most GCL frameworks use Graph Neural Networks (GNNs) as their foundational encoders, yet GNN methods have inherent drawbacks: local GNNs struggle to capture long-range dependencies, and deep GNNs face the oversmoothing problem. Moreover, existing GCL methods do not adequately model node feature information, relying on topology to learn neighbor features. In this paper, we introduce a novel contrastive learning mechanism that employs transformers to capture long-range dependencies while integrating the strong perceptual capabilities of GNNs for local topology, resulting in a GCL architecture that is highly robust across different levels of homophily. Specifically, we design three views: the original view, the long-range information view, and the feature view. By jointly contrasting these three views, the model effectively acquires rich information from the graph. Experimental results on seven real-world datasets with varying levels of homophily demonstrate that the proposed method significantly outperforms baseline models, validating its effectiveness and rationality.
{"title":"GNN-transformer contrastive learning explores homophily","authors":"Yangding Li , Yangyang Zeng , Xiangchao Zhao , Jiawei Chai , Hao Feng , Shaobin Fu , Cui Ye , Shichao Zhang","doi":"10.1016/j.ipm.2025.104103","DOIUrl":"10.1016/j.ipm.2025.104103","url":null,"abstract":"<div><div>Graph Contrastive Learning (GCL) leverages graph structure and node feature information to learn powerful node representations in a self-supervised manner, attracting significant attention from researchers. Most GCL frameworks typically use Graph Neural Networks (GNNs) as their foundational encoders. Still, GNN methods have inherent drawbacks: local GNNs struggle to capture long-range dependencies, and deep GNNs face the oversmoothing problem. Moreover, existing GCL methods do not adequately model node feature information, relying on topology to learn neighbor features. In this paper, we introduce a novel contrastive learning mechanism that employs transformers to capture long-range dependency information while integrating the strong perceptual capabilities of GNNs for local topology, resulting in a GCL architecture that is highly robust across different levels of homophily. Specifically, we design three views: the original view, the long-range information view, and the feature view. By jointly contrasting these three views, the model effectively acquires rich information from the graph. Experimental results on seven real-world datasets with varying levels of homophily demonstrate that the proposed method significantly outperforms other baseline models, validating its effectiveness and rationality.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 4","pages":"Article 104103"},"PeriodicalIF":7.4,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143444553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Temporal-spatial hierarchical contrastive learning for misinformation detection: A public-behavior perspective
Pub Date: 2025-02-20 | DOI: 10.1016/j.ipm.2025.104108
Gang Ren, Li Jiang, Tingting Huang, Ying Yang, Taeho Hong
The widespread dissemination of misinformation on social media platforms significantly affects public security. Current methods for detecting misinformation predominantly rely on semantic information and social context features, but they often neglect the intricate noise and unreliable information interactions that result from diverse public behaviors, such as cognitive biases, user prejudices, and bot activity. To tackle these challenges, we propose TSHCL (temporal-spatial hierarchical contrastive learning), an approach for automatic misinformation detection from a public-behavior perspective. First, a graph convolutional network (GCN)-based autoencoder architecture is integrated with a hybrid augmentation method to model typical public behaviors. Next, node-level contrastive learning is designed to maintain the heterogeneity of comments in the spatial view under the influence of complex public behaviors. Finally, cross-view graph-level contrastive learning is designed to promote collaborative learning between the temporal sequence view of events and the spatial propagation structure view. Through this temporal-spatial hierarchical contrastive learning, the model retains crucial node information and facilitates the interaction of temporal and spatial information. Extensive experiments on real datasets from MCFEND and Weibo demonstrate that our model surpasses state-of-the-art models. The proposed model effectively alleviates the noise and unreliable information interactions caused by public behavior and enriches the research perspective of misinformation detection.
{"title":"Temporal-spatial hierarchical contrastive learning for misinformation detection: A public-behavior perspective","authors":"Gang Ren , Li Jiang , Tingting Huang , Ying Yang , Taeho Hong","doi":"10.1016/j.ipm.2025.104108","DOIUrl":"10.1016/j.ipm.2025.104108","url":null,"abstract":"<div><div>The widespread dissemination of misinformation on social media platforms significantly affects public security. Current methods for detecting misinformation predominantly rely on semantic information and social context features. However, they often neglect the intricate noise issues and unreliable information interactions resulting from diverse public behaviors, such as cognitive biases, user prejudices, and bot activity. To tackle these challenges, we propose an approach named TSHCL (temporal-spatial hierarchical contrastive learning) for automatic misinformation detection from the public-behavior perspective. First, the integration of a graph convolutional network (GCN)-based autoencoder architecture with a hybrid augmentation method is designed to model typical public behaviors. Next, node-level contrastive learning is designed to maintain the heterogeneity of comments in the spatial view under the influence of complex public behaviors. Finally, cross-view graph-level contrastive learning is designed to promote collaborative learning between the temporal sequence view of events and the spatial propagation structure view. By conducting temporal-spatial hierarchical contrastive learning, the model effectively retains crucial node information and facilitates the interaction of temporal-spatial information. Extensive experiments conducted on real datasets from MCFEND and Weibo demonstrate that our model surpasses the state-of-the-art models. Our proposed model can effectively alleviate the noise and unreliable information interaction caused by public behavior, and enrich the research perspective of misinformation detection.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 4","pages":"Article 104108"},"PeriodicalIF":7.4,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143454402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}