Label acceptance based label propagation algorithm for community detection
Pub Date: 2025-12-22 | DOI: 10.1016/j.ipm.2025.104573
Xunlian Wu, Anqi Zhang, Jingqi Hu, Han Zhang, Yining Quan, Qiguang Miao, Peng Gang Sun
Community detection plays a crucial role in network analysis. While the Label Propagation Algorithm (LPA) is known for its efficiency, it suffers from unstable results due to random label updates and an inability to capture higher-order structural information. To address these limitations, we propose LALPA (Label Acceptance-based Label Propagation Algorithm) for community detection. LALPA introduces a node importance measure based on neighbor similarity to guide a stable, ordered label update process. To better capture structural information, we reconstruct the network topology by integrating both low-order (adjacent links) and high-order (motif-based) interactions, modeling node influence acceptance. Label acceptance is then determined by combining node importance and influence acceptance. A novel propagation strategy is designed to aggregate labels not only from current neighbors but also from nodes sharing the same label. Extensive experiments on 10 real-world and 24 synthetic networks show that LALPA consistently outperforms state-of-the-art methods, especially in networks with indistinct community structures. In particular, on all unattributed graphs, LALPA achieves an average performance gain of 2.69% over the best baseline.
{"title":"Label acceptance based label propagation algorithm for community detection","authors":"Xunlian Wu , Anqi Zhang , Jingqi Hu , Han Zhang , Yining Quan , Qiguang Miao , Peng Gang Sun","doi":"10.1016/j.ipm.2025.104573","DOIUrl":"10.1016/j.ipm.2025.104573","url":null,"abstract":"<div><div>Community detection plays a crucial role in network analysis. While the Label Propagation Algorithm (LPA) is known for its efficiency, it suffers from unstable results due to random label updates and the inability to capture higher-order structural information. To address these limitations, we propose LALPA (Label Acceptance-based Label Propagation Algorithm) for community detection. LALPA introduces a node importance measure based on neighbor similarity to guide a stable, ordered label update process. To better capture structural information, we reconstruct the network topology by integrating both low-order (adjacent links) and high-order (motif-based) interactions, modeling node influence acceptance. Label acceptance is then determined by combining node importance and influence acceptance. A novel propagation strategy is designed to aggregate labels not only from current neighbors but also from those sharing the same label. Extensive experiments on 10 real-world and 24 synthetic networks show that LALPA consistently outperforms state-of-the-art methods, especially in networks with unobvious community structures. In particular, on all unattributed graphs, LALPA achieves an average performance gain of 2.69 % over the best baseline.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"63 3","pages":"Article 104573"},"PeriodicalIF":6.9,"publicationDate":"2025-12-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145840334","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Representation learning for 12-lead ECGs via dual-view conditional diffusion and lead-aware attention
Pub Date: 2025-12-22 | DOI: 10.1016/j.ipm.2025.104569
Fanyi Yang, Xue Li, Wentao Wang, Xiguo Yuan
Recent advances in ECG representation learning have leveraged frequency-domain information to improve representation quality, yet most methods still suffer from inadequate view fusion and coarse-grained modeling of inter-lead structural dependencies. To address these challenges, we propose D2VLA, a novel framework for 12-lead ECG representation learning that integrates dual-view conditional diffusion with a lead-aware dual-attention mechanism. The diffusion module enables semantic alignment between time-domain and frequency-domain views through denoising-based conditional guidance, while the attention module jointly models the temporal dynamics of individual leads and the spatial relationships among leads within a unified encoder. In addition, we introduce a patch-level contrastive objective to further enhance the discriminative capability of the learned representations. Extensive experiments on three real-world ECG datasets demonstrate that D2VLA achieves competitive performance on classification tasks against eight baseline models, improving accuracy by 4.6% on PTB-XL and by 4.5% on CPSC, and achieving an AUROC improvement of about 4.0% on Chapman, thereby highlighting its superior structural modeling capability.
{"title":"Representation learning for 12-lead ECGs via dual-view conditional diffusion and lead-aware attention","authors":"Fanyi Yang , Xue Li , Wentao Wang , Xiguo Yuan","doi":"10.1016/j.ipm.2025.104569","DOIUrl":"10.1016/j.ipm.2025.104569","url":null,"abstract":"<div><div>Recent advances in ECG representation learning have leveraged frequency-domain information to improve representation quality, yet most methods still suffer from inadequate view fusion and coarse-grained modeling of inter-lead structural dependencies. To address these challenges, we propose D<sup>2</sup>VLA, a novel framework for 12-lead ECG representation learning that integrates dual-view conditional diffusion with a lead-aware dual-attention mechanism. The diffusion module enables semantic alignment between time-domain and frequency-domain views through denoising-based conditional guidance, while the attention module jointly models the temporal dynamics of individual leads and the spatial relationships among leads within a unified encoder. In addition, we introduce a patch-level contrastive objective to further enhance the discriminative capability of the learned representations. Extensive experiments on three real-world ECG datasets demonstrate that D<sup>2</sup>VLA achieves competitive performance on classification tasks against eight baseline models, improving accuracy by 4.6 % on PTB-XL and by 4.5 % on CPSC, and achieving AUROC improvement of about 4.0 % on Chapman, thereby highlighting its superior structural modeling capability.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"63 3","pages":"Article 104569"},"PeriodicalIF":6.9,"publicationDate":"2025-12-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145840332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multimodal hierarchical classification using cascade-of-thought
Pub Date: 2025-12-21 | DOI: 10.1016/j.ipm.2025.104555
Jingrui Hou, Zhihang Tan, Qibiao Hu, Ping Wang, Yan Gong
We propose Cascade-of-Thought (CSOT), a novel prompt-based method for multimodal hierarchical classification (MHC) that requires no training or labeled exemplars. Inspired by the LLM-as-a-Judge (LaaJ) paradigm, CSOT decomposes classification into rationale generation, confidence scoring, and decision ranking, each implemented via structured prompts to a vision-language model (VLM). Experiments on two public MHC benchmarks demonstrate that CSOT yields substantial performance gains, particularly for weaker VLMs, while also enhancing the output quality of near-ceiling models. CSOT offers a flexible, generalizable solution for real-world MHC tasks.
{"title":"Multimodal hierarchical classification using cascade-of-thought","authors":"Jingrui Hou , Zhihang Tan , Qibiao Hu , Ping Wang , Yan Gong","doi":"10.1016/j.ipm.2025.104555","DOIUrl":"10.1016/j.ipm.2025.104555","url":null,"abstract":"<div><div>We propose Cascade-of-Thought (CSOT), a novel prompt-based method for multimodal hierarchical classification (MHC) that requires no training or labeled exemplars. Inspired by the <em>LLM-as-a-Judge</em> (LaaJ) paradigm, CSOT decomposes classification into rationale generation, confidence scoring, and decision ranking–each implemented via structured prompts to a vision-language model (VLM). Experiments on two public MHC benchmarks demonstrate that CSOT yields substantial performance gains, particularly for weaker VLMs, while also enhancing the output quality of near-ceiling models. CSOT offers a flexible, generalizable solution for real-world MHC tasks.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"63 3","pages":"Article 104555"},"PeriodicalIF":6.9,"publicationDate":"2025-12-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145840333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Simulating the people's voice: Leveraging algorithmic fidelity to assess ChatGPT's performance in modeling public opinion on Chinese government policies
Pub Date: 2025-12-20 | DOI: 10.1016/j.ipm.2025.104567
ShaoPeng Che, Min Zhu, Shunan Zhang, Hae Sun Jung, Haein Lee, Zhixiao Wang, Lee Miller
Traditional public opinion surveys face persistent challenges related to cost, sample representativeness, and respondent willingness. These limitations have encouraged growing interest in using large language models (LLMs) to generate silicon samples as synthetic substitutes for human data. Although previous studies report high algorithmic fidelity in Western contexts, much less is known about whether globally trained LLMs can reproduce public attitudes in regulated and non-Western information environments. Using nationally representative data from the Chinese General Social Survey (CGSS 2021), this study evaluates ChatGPT’s ability to simulate Chinese public opinion on ten policy issues by comparing human responses with demographic-conditioned silicon samples. Analyses of response rates, response distributions, and demographic subgroups show that LLM outputs approximate human attitudes on low-sensitivity and consensus-oriented topics, but diverge systematically on culturally embedded and governance-sensitive issues. Silicon samples also produce near-complete response rates, failing to capture human patterns of strategic non-response, and show larger misalignment among politically embedded and highly educated subgroups. Robustness diagnostics across model generations reveal strong cross-model structural stability but continued limitations when the model is applied in different sociopolitical contexts. These findings reconceptualize algorithmic fidelity as a context-sensitive construct and extend Pattern Correspondence into a multidimensional framework that incorporates response rates, response distributions, and demographic subgroup patterns. Overall, the study highlights both the potential and the limits of using LLMs to simulate public opinion in non-Western settings, emphasizing the need for culturally grounded calibration, transparent reporting, and cautious use in policy-relevant domains.
{"title":"Simulating the people's voice: Leveraging algorithmic fidelity to assess ChatGPT's performance in modeling public opinion on Chinese government policies","authors":"ShaoPeng Che , Min Zhu , Shunan Zhang , Hae Sun Jung , Haein Lee , Zhixiao Wang , Lee Miller","doi":"10.1016/j.ipm.2025.104567","DOIUrl":"10.1016/j.ipm.2025.104567","url":null,"abstract":"<div><div>Traditional public opinion surveys face persistent challenges related to cost, sample representativeness, and respondent willingness. These limitations have encouraged growing interest in using large language models (LLMs) to generate silicon samples as synthetic substitutes for human data. Although previous studies report high algorithmic fidelity in Western contexts, much less is known about whether globally trained LLMs can reproduce public attitudes in regulated and non-Western information environments. Using nationally representative data from the Chinese General Social Survey (CGSS 2021), this study evaluates ChatGPT’s ability to simulate Chinese public opinion on ten policy issues by comparing human responses with demographic-conditioned silicon samples. Analyses of response rates, response distributions, and demographic subgroups show that LLM outputs approximate human attitudes on low-sensitivity and consensus-oriented topics, but diverge systematically on culturally embedded and governance-sensitive issues. Silicon samples also produce near-complete response rates, which fails to capture human patterns of strategic non-response, and show larger misalignment among politically embedded and highly educated subgroups. Robustness diagnostics across model generations reveal strong cross-model structural stability but continued limitations when the model is applied in different sociopolitical contexts. These findings reconceptualize algorithmic fidelity as a context-sensitive construct and extend Pattern Correspondence into a multidimensional framework that incorporates response rates, response distributions, and demographic subgroup patterns. Overall, the study highlights both the potential and the limits of using LLMs to simulate public opinion in non-Western settings, emphasizing the need for culturally grounded calibration, transparent reporting, and cautious use in policy-relevant domains.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"63 3","pages":"Article 104567"},"PeriodicalIF":6.9,"publicationDate":"2025-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145796796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
One for all: A comprehensive graph structure of the account-based blockchain for multi-view analysis
Pub Date: 2025-12-20 | DOI: 10.1016/j.ipm.2025.104556
Yuan Gao, Ruibin Yan, Zeyu Zhang, Zhihao Li, Dechun Yin, Yijun Gu
Transactions and addresses on account-based blockchains form highly interconnected networks. However, current methods face several challenges: incomplete graph structures inadequately support specific analytical tasks and are inefficient for certain time-sensitive analyses. In this paper, we propose a multi-view comprehensive graph structure for account-based blockchains to overcome these challenges. Specifically, we deploy archive nodes to extract various raw data from the account-based blockchain and construct the basic graph structure. Meanwhile, we analyze the reasons transactions are initiated to form the intrinsic attribute view. Then, we further analyze the on-chain activities of addresses and annotate the edges within the graph to form a view of common behaviors. Our comprehensive graph structure includes the two views mentioned above, which not only supports analyses within a single view but also enables the exploration of correlations between different views. The proposed graph structure achieves an average speed improvement of 49.08% over baselines in our experiments. Through multi-view ecosystem analyses, we provide insights into blockchain characteristics. We demonstrate the application of the comprehensive graph to multi-view analytical tasks on account-based blockchains, including forensics of "the DAO attack", phishing detection, and address classification. As a result, we identify 13 previously unreported potential DAO attacker accounts and outperform existing graph structures on various downstream tasks.
{"title":"One for all: A comprehensive graph structure of the account-based blockchain for multi-view analysis","authors":"Yuan Gao, Ruibin Yan, Zeyu Zhang, Zhihao Li, Dechun Yin, Yijun Gu","doi":"10.1016/j.ipm.2025.104556","DOIUrl":"10.1016/j.ipm.2025.104556","url":null,"abstract":"<div><div>Transactions and addresses on account-based blockchains form highly interconnected networks. However, current methods face several challenges, as incomplete graph structures inadequately support specific analytical tasks and are inefficient for certain time-sensitive analyses. In this paper, we propose a multi-view comprehensive graph structure in account-based blockchains to overcome these challenges. Specifically, we deploy archive nodes to extract various raw data from the account-based blockchain and construct the basic graph structure. Meanwhile, we analyze the initiating reasons of transactions to form the intrinsic attribute view. Then, we further analyze the on-chain activities of addresses and annotate the edges within the graph to form a view of common behaviors. Our comprehensive graph structure includes the two views mentioned above, which not only supports analyses within a single view but also enables the exploration of correlations between different views. The proposed graph structure achieves an average speed improvement of <strong>49.08%</strong> compared to baselines in experiments. Through multi-view ecosystem analyses, we provide insights into the blockchain characteristics. We demonstrate the application of the comprehensive graph to multi-view analytical tasks on account-based blockchains, including forensics of ”the DAO attack”, phishing detection, and address classification as examples. As a result, we find 13 unreported potential DAO attacker accounts, and outperform existing graph structures in various downstream tasks.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"63 3","pages":"Article 104556"},"PeriodicalIF":6.9,"publicationDate":"2025-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145790311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
How GenAI tools influence the purchase intention of green products through the mediating role of emotional connection: Evidence from China
Pub Date: 2025-12-19 | DOI: 10.1016/j.ipm.2025.104572
Xingpeng Zheng, Jing Li, Yue Xia, Yingji Li
In the field of green product consumption, consumers tend to seek detailed information to assess product efficacy. In recent years, the advent of Generative Artificial Intelligence (GenAI) tools has significantly streamlined consumers’ access to relevant information on green products. However, although emotional factors are decisive in purchase decision-making, existing studies predominantly focus on rational decision-making and frequently overlook this crucial emotional dimension. Addressing this gap, this study adopts the extended emotion heuristic theory to examine the impact of GenAI tools on green product purchase preferences. Green product consumption data were collected between January and March 2025 from multiple regions, including Chongqing, Guangdong, Hunan, Hubei, Shanghai, and Beijing, and analysed with the partial least squares structural equation model. A total of 717 valid responses were analysed using SPSS 28, Amos 28, and SmartPLS 4.0. The results reveal that certain characteristics of GenAI-generated content, specifically quality (content relevance, content accuracy), communication style (personalisation, anthropomorphism), and serendipity, positively influence purchase intention for green products. Furthermore, emotional connection plays a partial mediating role. These findings extend the application of emotion heuristic theory in the context of artificial intelligence and highlight the significant role of emotional factors in fostering consumption intentions via GenAI tools. The results offer insights for green product marketers and GenAI tool developers seeking to enhance content quality, communication methods, and additional functions, while also informing regulatory policymaking related to GenAI tools.
{"title":"How GenAI tools influence the purchase intention of green products through the mediating role of emotional connection: Evidence from China","authors":"Xingpeng Zheng , Jing Li , Yue Xia , Yingji Li","doi":"10.1016/j.ipm.2025.104572","DOIUrl":"10.1016/j.ipm.2025.104572","url":null,"abstract":"<div><div>In the field of green product consumption, consumers tend to seek detailed information to assess product efficacy. In recent years, the advent of Generative Artificial Intelligence (GenAI) tools has significantly streamlined consumers’ access to relevant information on green products. However, as emotional factors are decisive in purchase decision-making, existing studies that predominantly focus on rational decision-making frequently overlook this crucial emotional dimension. Addressing this gap, this study adopts the extended emotion heuristic theory to examine the impact of GenAI tools on green product purchase preferences. Using the partial least squares structural equation model, green product consumption data were collected from multiple regions including Chongqing, Guangdong, Hunan, Hubei, Shanghai, Beijing, and others, between January and March 2025. A total of 717 valid responses were analysed using SPSS 28, Amos 28, and Smart PLS 4.0. The results reveal that certain characteristics of GenAI-generated content—specifically, quality (content relevance, content accuracy), communication style (personalisation, anthropomorphism), and serendipity—positively influence purchase intention for green products. Furthermore, emotional connection plays a partial mediating role. These findings extend the application of emotion heuristic theory in the context of artificial intelligence and highlight the significant role of emotional factors in fostering consumption intentions via GenAI tools. The results offer insights for green product marketers and GenAI tool developers to enhance content quality, communication methods, and additional functions, while also informing regulatory policymaking related to GenAI tools.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"63 3","pages":"Article 104572"},"PeriodicalIF":6.9,"publicationDate":"2025-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145790403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A smaller model can be better: Domain adaptation for LLM-generated text detection via soft prompt-tuning
Pub Date: 2025-12-18 | DOI: 10.1016/j.ipm.2025.104566
Shuqin Wang, Yi Zhu, Peipei Li
The widespread application of Large Language Models (LLMs) has recently raised social concerns regarding potential misuse, which accentuates both the importance and the challenges of identifying LLM-generated texts. However, existing advanced zero-shot black-box methods rely on LLMs themselves for LLM-generated text detection, which leads to hallucinations in classification tasks characterized by unclear decision boundaries. Moreover, they focus only on exploiting inherent text features, which amounts to hand-crafted feature engineering and results in poor robustness and generality across different data distributions. In this paper, we therefore propose a novel few-shot black-box method based on prompt-tuning of Pre-trained Language Models (PLMs), which are much smaller than LLMs. Specifically, a few labeled examples are treated as the source domain and the unlabeled test data as the target domain, so LLM-generated text detection is first reformulated as a cross-domain text classification task. A soft prompt-tuning model is then learned on the source domain and converted into an iterative model that finds the true label information in the target domain. By voting over the labels predicted by the iterative model, the soft prompt is trained for LLM-generated text detection. Finally, extensive experimental results demonstrate that our method outperforms current SOTA baselines.
{"title":"A smaller model can be better: Domain adaptation for LLM-generated text detection via soft prompt-tuning","authors":"Shuqin Wang , Yi Zhu , Peipei Li","doi":"10.1016/j.ipm.2025.104566","DOIUrl":"10.1016/j.ipm.2025.104566","url":null,"abstract":"<div><div>The widespread application of Large Language Models (LLMs) has recently raised social concerns regarding potential misuse, which accentuates both the importance and challenges of identifying LLM-generated texts. However, existing advanced zero-shot black-box methods rely on LLMs for LLM-generated text detection, which leads to hallucinations in classification tasks characterized by unclear decision boundaries. On the other hand, they only focus on exploiting inherent text features, which is essentially the idea of hand-crafted feature engineering but results in poor robustness and generality when faced with different data distributions. Therefore, in this paper, we propose a novel few-shot black-box method via prompt-tuning based on the Pre-trained Language Models (PLMs), which is a smaller language model than LLM. Specifically, in our method, a few labeled data are considered as the source domain, while the unlabeled test data are treated as the target domain, correspondingly the LLM-generated text detection is firstly reformulated as the cross-domain text classification task. Secondly, the soft prompt-tuning model is learned in the source domain and converted into an iterative model to find the true label information in the target domain. By voting for predicted labels that are generated with the iterative model, soft prompt-tuning is trained for LLM-generated text detection tasks. Finally, extensive experimental results demonstrate that our method outperforms current SOTA baselines.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"63 3","pages":"Article 104566"},"PeriodicalIF":6.9,"publicationDate":"2025-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145790400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
EMLC: An extensible multi-level correction framework for text-to-SQL
Pub Date: 2025-12-18 | DOI: 10.1016/j.ipm.2025.104560
Jianjun Lei, Yijie Tan, Ying Wang
To address three key challenges of Text-to-SQL self-correction, namely schema mismatch, structural incompleteness, and weak semantic validation, this paper proposes EMLC, an extensible multi-level correction framework that hierarchically integrates schema, skeleton, and execution corrections. EMLC incorporates a dual-validation schema correction mechanism that combines large language model (LLM)-based prediction with token-level mapping for precise schema alignment. Moreover, it employs a skeleton generator trained via supervised fine-tuning to detect and correct keyword-level errors through abstract skeleton comparison, while an executability verification strategy further ensures both the syntactic integrity and the semantic fidelity of generated queries. EMLC supports plug-and-play integration with mainstream LLMs and scales flexibly. Experiments on the SPIDER and BIRD datasets show that EMLC achieves state-of-the-art execution accuracy, outperforming baseline methods by 2–4%. Ablation studies further validate the individual contributions of each component and their synergistic effects.
{"title":"EMLC: An extensible multi-level correction framework for text-to-SQL","authors":"Jianjun Lei , Yijie Tan , Ying Wang","doi":"10.1016/j.ipm.2025.104560","DOIUrl":"10.1016/j.ipm.2025.104560","url":null,"abstract":"<div><div>To address three key challenges of Text-to-SQL self-correction, including schema mismatch, structural incompleteness, and semantic validation weakness, this paper proposes EMLC, an extensible multi-level correction framework that hierarchically integrates schema, skeleton, and execution corrections. EMLC incorporates a dual-validation schema correction mechanism that combines large language model (LLM)-based prediction with token-level mapping for precise schema alignment. Moreover, it employs supervised fine-tuning skeleton generation to detect and correct keyword-level errors through abstract skeleton comparison, while the executability verification strategy is designed to further ensure both syntactic integrity and semantic fidelity of generated queries. EMLC supports plug-and-play integration with mainstream LLMs and flexible scalability. Experiments on the SPIDER and BIRD datasets show that EMLC achieves state-of-the-art execution accuracy, outperforming baseline methods by 2–4 %. Ablation studies further validate the individual contributions of each component and their synergistic effects.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"63 3","pages":"Article 104560"},"PeriodicalIF":6.9,"publicationDate":"2025-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145790404","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Counterfactual samples constructing and training for commonsense statements estimation
Pub Date: 2025-12-18 | DOI: 10.1016/j.ipm.2025.104563
Chong Liu, Zaiwen Feng, Zhenyun Deng, Lin Liu, Jiuyong Li, Ruifang Zhai, Debo Cheng, Li Qin
Plausibility Estimation (PE) plays a crucial role in enabling language models to objectively comprehend the real world. Large language models (LLMs) demonstrate remarkable capabilities in PE tasks, yet they sometimes make trivial commonsense errors owing to the complexity of commonsense knowledge. They lack two key traits of an ideal PE model: a) Language-explainable: relying on critical word segments for decisions; b) Commonsense-sensitive: detecting subtle linguistic variations in commonsense. To address these issues, we propose a novel model-agnostic method, referred to as Commonsense Counterfactual Samples Generating (CCSG). By training PE models with CCSG, we encourage them to focus on critical words, thereby enhancing both their language-explainable and commonsense-sensitive capabilities. Specifically, CCSG generates counterfactual samples by strategically replacing key words and introducing low-level dropout within sentences. These counterfactual samples are then incorporated into a sentence-level contrastive training framework to further enhance the model’s learning process. Experimental results across nine diverse datasets demonstrate the effectiveness of CCSG in addressing commonsense reasoning challenges, with CCSG improving on the SOTA methods by 3.07%.
{"title":"Counterfactual samples constructing and training for commonsense statements estimation","authors":"Chong Liu , Zaiwen Feng , Zhenyun Deng , Lin Liu , Jiuyong Li , Ruifang Zhai , Debo Cheng , Li Qin","doi":"10.1016/j.ipm.2025.104563","DOIUrl":"10.1016/j.ipm.2025.104563","url":null,"abstract":"<div><div>Plausibility Estimation (PE) plays a crucial role for enabling language models to objectively comprehend the real world. While large language models (LLMs) demonstrate remarkable capabilities in PE tasks but sometimes produce trivial commonsense errors due to the complexity of commonsense knowledge. They lack two key traits of an ideal PE model: a) <em>Language-explainable</em>: relying on critical word segments for decisions, b) <em>Commonsense-sensitive</em>: detecting subtle linguistic variations in commonsense. To address these issues, we propose a novel model-agnostic method, referred to as <strong>C</strong>ommonsense <strong>C</strong>ounterfactual <strong>S</strong>amples <strong>G</strong>enerating (<strong>CCSG</strong>). By training PE models with CCSG, we encourage them to focus on critical words, thereby enhancing both their language-explainable and commonsense-sensitive capabilities. Specifically, CCSG generates counterfactual samples by strategically replacing key words and introducing low-level dropout within sentences. These counterfactual samples are then incorporated into a sentence-level contrastive training framework to further enhance the model’s learning process. Experimental results across nine diverse datasets demonstrate the effectiveness of CCSG in addressing commonsense reasoning challenges, with our CCSG method showing 3.07% improvement against the SOTA methods.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"63 3","pages":"Article 104563"},"PeriodicalIF":6.9,"publicationDate":"2025-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145790401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
INKER: Adaptive dynamic retrieval augmented generation with internal-external knowledge integration
Pub Date: 2025-12-18 | DOI: 10.1016/j.ipm.2025.104534
Mingjun Zhou, Jiuyang Tang, Weixin Zeng, Xiang Zhao
The adaptive dynamic retrieval-augmented generation (RAG) paradigm dynamically determines whether the large language model (LLM) needs to activate the retrieval step during generation, and accordingly formulates appropriate queries for retrieval. This paradigm has two key components: determining the optimal moment to activate the retrieval module (when to retrieve) and formulating the appropriate query once retrieval is triggered (what to retrieve). However, existing adaptive dynamic RAG methods rely on the internal knowledge of the LLM to trigger the retrieval process and formulate retrieval queries, largely neglecting the significance of external query knowledge. This leads to unreliable retrieval timing and an inability to retrieve truly relevant documents. To overcome these limitations, we introduce a new adaptive dynamic RAG framework, Internal-External Knowledge Integration based Retrieval (INKER), which integrates both the internal and the external knowledge involved in the LLM text generation process to decide when and what to retrieve. Experiments on 2WikiMultihopQA, HotpotQA, StrategyQA, and Natural Questions (NQ) demonstrate that INKER outperforms six advanced RAG methods in terms of accuracy, while also reducing retrieval frequency by approximately 40% on average, verifying the effectiveness of INKER and its components.
{"title":"INKER: Adaptive dynamic retrieval augmented generation with internal-external knowledge integration","authors":"Mingjun Zhou, Jiuyang Tang, Weixin Zeng, Xiang Zhao","doi":"10.1016/j.ipm.2025.104534","DOIUrl":"10.1016/j.ipm.2025.104534","url":null,"abstract":"<div><div>Adaptive dynamic retrieval-augmented generation (RAG) paradigm dynamically determines whether the large language model (LLM) needs to activate the retrieval step during the generation process, and accordingly formulates appropriate queries for retrieval. This paradigm has two key components: determining the optimal moment to activate the retrieval module (<em>when to retrieve</em>) and formulating the appropriate query after retrieval is triggered (<em>what to retrieve</em>). However, existing adaptive dynamic RAG methods rely on the internal knowledge of the LLM to trigger the retrieval process and formulate retrieval queries, largely neglecting the significance of the external query knowledge. This leads to unreliable retrieval timing and the inability to retrieve truly relevant documents. To overcome these limitations, we introduce a new adaptive dynamic RAG framework, <em>Internal-External Knowledge Integration based Retrieval</em> (<span>INKER</span>), which integrates both internal and external knowledge involved in the LLM text generation process to decide when and what to retrieve. Experiments on 2WikiMultihopQA, HotpotQA, StrategyQA, and Natural Questions (NQ) demonstrate that <span>INKER</span> outperforms six advanced RAG methods in terms of accuracy, while also reducing retrieval frequency by approximately 40 % on average, verifying the effectiveness of <span>INKER</span> and its components.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"63 3","pages":"Article 104534"},"PeriodicalIF":6.9,"publicationDate":"2025-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145790402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}