
Latest publications in Information Fusion

GC-Fed: Gradient centralized federated learning with partial client participation
IF 15.5 · Tier 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-07-01 · Epub Date: 2026-01-13 · DOI: 10.1016/j.inffus.2026.104148
Jungwon Seo , Ferhat Ozgur Catak , Chunming Rong , Kibeom Hong , Minhoe Kim
Federated Learning (FL) enables privacy-preserving multi-source information fusion (MSIF) but suffers from client drift in highly heterogeneous data settings. Many existing approaches mitigate drift by providing clients with common reference points, typically derived from past information, to align objectives or gradient directions. However, under severe partial participation, such history-dependent references may become unreliable, as the set of client data distributions participating in each round can vary drastically. To overcome this limitation, we propose a method that mitigates client drift without relying on past information by constraining the update space through Gradient Centralization (GC). Specifically, we introduce Local GC and Global GC, which apply GC at the local and global update stages, respectively, and further present GC-Fed, a hybrid formulation that generalizes both. Theoretical analysis and extensive experiments on benchmark FL tasks demonstrate that GC-Fed effectively alleviates client drift and achieves up to 20% accuracy improvement under data-heterogeneous and partial-participation conditions.
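Gradient Centralization itself is a simple operation from the optimization literature: each weight gradient is shifted to zero mean per output unit, which constrains updates to a mean-zero hyperplane. A minimal numpy sketch (the function name is ours, and this does not reproduce the paper's Local/Global GC placement):

```python
import numpy as np

def centralize_gradient(grad: np.ndarray) -> np.ndarray:
    """Subtract each output unit's mean gradient (Gradient Centralization).

    For a (out_features, in_features) weight gradient, every row is shifted
    to zero mean, so the update stays on a mean-zero hyperplane.
    """
    if grad.ndim < 2:
        return grad  # GC is usually applied only to matrix/conv kernels
    axes = tuple(range(1, grad.ndim))
    return grad - grad.mean(axis=axes, keepdims=True)

g = np.array([[1.0, 2.0, 3.0],
              [4.0, 6.0, 8.0]])
gc = centralize_gradient(g)  # rows become [-1, 0, 1] and [-2, 0, 2]
```

In an FL round, a client or the server would apply this to each layer's gradient before the update step, at the local or global stage respectively.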
Citations: 0
Unleashing Mamba’s expressive power: A non-tradeoff approach to spatio-temporal forecasting
IF 15.5 · Tier 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-07-01 · Epub Date: 2026-01-22 · DOI: 10.1016/j.inffus.2026.104172
Zhiqi Shao , Ze Wang , Haoning Xi , Michael G.H. Bell , Xusheng Yao , D. Glenn Geers , Junbin Gao
Real-time spatiotemporal forecasting, particularly in traffic systems, requires balancing computational cost and predictive accuracy, a challenge that conventional methods struggle to address effectively. In this work, we propose a non-trade-off framework called Spatial-Temporal Selective State Space (ST-Mamba), which leverages two key components to achieve efficiency and accuracy concurrently. The Spatial-Temporal Mixer (ST-Mixer) dynamically fuses spatial and temporal features to capture complex dependencies, and the STF-Mamba layer incorporates Mamba’s selective state-space formulation to capture long-range dynamics efficiently. Beyond empirical improvements, we address a critical gap in the literature by presenting a theoretical analysis of ST-Mamba’s expressive power. Specifically, we establish its ability to approximate a broad class of Transformers and formally demonstrate its equivalence to at least two consecutive attention layers within the same framework. This result highlights ST-Mamba’s capacity to capture long-range dependencies while reducing computational overhead, reinforcing its theoretical and practical advantages over conventional Transformer-based models. Through extensive evaluations on real-world traffic datasets, ST-Mamba demonstrates a 61.11% reduction in runtime alongside a 0.67% improvement in predictive performance compared to leading approaches, underscoring its potential to set a new benchmark for real-time spatiotemporal forecasting.
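Mamba-style layers build on a discretized linear state-space recurrence, x_t = A x_{t-1} + B u_t with readout y_t = C x_t, evaluated as a scan over the sequence. A toy numpy sketch of that recurrence (parameter values illustrative; the selective, input-dependent parameterization of the actual STF-Mamba layer is omitted):

```python
import numpy as np

def ssm_scan(A, B, C, u):
    """Sequential scan of x_t = A x_{t-1} + B u_t, y_t = C x_t.

    A: (n, n), B: (n, d), C: (d, n), u: (T, d); returns y of shape (T, d).
    """
    x = np.zeros(A.shape[0])
    ys = []
    for u_t in u:
        x = A @ x + B @ u_t   # state update
        ys.append(C @ x)      # readout
    return np.stack(ys)

A = 0.9 * np.eye(2)            # stable decaying dynamics
B = np.ones((2, 1))
C = np.array([[0.5, 0.5]])
y = ssm_scan(A, B, C, np.ones((4, 1)))  # outputs 1.0, 1.9, 2.71, 3.439
```

Because A here is fixed and linear, the whole scan costs O(T) in sequence length, which is the efficiency argument behind state-space alternatives to quadratic attention.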
Citations: 0
PromptMix: LLM-aided prompt learning for generalizing vision-language models
IF 15.5 · Tier 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-07-01 · Epub Date: 2026-01-23 · DOI: 10.1016/j.inffus.2026.104186
Yongcai Chen , Qinghua Zhang , Xinfa Shi , Lei Zhang
Intelligent engineering tasks are entering real-world application with the development of deep learning techniques. However, performance in real conditions often degrades due to scarce data or subtle, easily confused patterns. Although vision-language models with prompt learning offer a way to adapt models without retraining the backbone, these approaches still suffer from overfitting under low-data regimes or from the poor expressive ability of prompts. To address these challenges, we propose a novel framework, PromptMix, that jointly considers semantic prompt learning, multimodal information fusion, and the alignment between pre-trained and domain-specific data. Specifically, PromptMix integrates three key components: (1) a Modality-Agnostic Shared Representation module that constructs a shared latent space to mitigate the distribution discrepancies between pre-trained and target data, (2) an LLM-Aided Prompt Evolution mechanism that semantically enriches and iteratively refines learnable context prompts, and (3) a Cross-Attentive Adapter that enhances multimodal information fusion and robustness under low-sample conditions. Experiments on seven datasets, including six public benchmarks and one custom industrial dataset, demonstrate that PromptMix effectively enhances vision-language model adaptability, improves semantic representations, and achieves robust generalization under both base-to-novel and few-shot learning scenarios, delivering superior performance in engineering applications with limited labeled data.
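The "learnable context prompts" the framework refines follow the common prompt-learning pattern: trainable context vectors are combined with frozen class-name embeddings to form per-class text features, which are matched to image features by cosine similarity. A schematic numpy sketch (the mean-pooling and all names are our simplifying assumptions, not PromptMix's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_ctx, n_cls = 8, 4, 3

ctx = rng.normal(size=(n_ctx, d))       # learnable context vectors (trainable)
cls_emb = rng.normal(size=(n_cls, d))   # frozen class-name embeddings

def prompt_features(ctx, cls_emb):
    """One prompt per class: context tokens plus class token, mean-pooled."""
    prompts = [np.vstack([ctx, c[None, :]]) for c in cls_emb]
    return np.stack([p.mean(axis=0) for p in prompts])

def classify(img_feat, txt_feats):
    """Pick the class whose prompt feature is most cosine-similar."""
    a = img_feat / np.linalg.norm(img_feat)
    b = txt_feats / np.linalg.norm(txt_feats, axis=1, keepdims=True)
    return int(np.argmax(b @ a))

txt = prompt_features(ctx, cls_emb)     # (n_cls, d) text-side features
pred = classify(rng.normal(size=d), txt)
```

During adaptation only `ctx` would receive gradients, which is why prompt learning is attractive when the backbone is too large or the data too scarce for full fine-tuning.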
Citations: 0
Adversarial perturbation for RGB-T tracking via intra-modal excavation and cross-modal collusion
IF 15.5 · Tier 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-07-01 · Epub Date: 2026-01-23 · DOI: 10.1016/j.inffus.2026.104183
Xinyu Xiang , Xuying Wu , Shengxiang Li , Qinglong Yan , Tong Zou , Hao Zhang , Jiayi Ma
Existing adversarial perturbation attacks for visual object trackers mainly focus on the RGB modality, yet adversarial perturbation of RGB-T trackers remains unexplored. To address this gap, we propose an Intra-modal excavation and Cross-modal collusion adversarial perturbation attack algorithm (ICAttack) for RGB-T tracking. First, we establish a novel intra-modal adversarial clues excavation (ImAE) paradigm. By leveraging the unique distribution properties of each modality as a prior, we independently extract the attack cues of the different modalities from a common noise space. Building on this, we develop a cross-modal adversarial collusion (CmAC) strategy, which enables implicit and dynamic interaction between the adversarial tokens of the two modalities. This interaction facilitates negotiation and collaboration, achieving a synergistic attack gain for RGB-T trackers that surpasses the effect of a single-modality attack. The above process, from intra-modal excavation to cross-modal collusion, creates a progressive and systematic attack framework for RGB-T trackers. In addition, by introducing a spatial adversarial intensity control module and a precise response disruption loss, we further enhance both the stealthiness and the precision of our adversarial perturbations. The control module reduces attack strength in less critical areas to improve stealth. The disruption loss applies a small mask to the tracker’s brightest semantic response region, concentrating the perturbation to interfere precisely with the tracker’s target awareness. Extensive evaluations of attack performance on different SOTA victim RGB-T trackers demonstrate the advantages of ICAttack in terms of the specificity and effectiveness of cross-modal attacks. Moreover, we offer a user-friendly interface to promote the practical deployment of adversarial perturbations. Our code is publicly available at https://github.com/Xinyu-Xiang/ICAttack.
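The basic mechanics of gradient-based adversarial perturbation, on which attack algorithms of this kind build, can be illustrated with the classic fast gradient sign method on a toy differentiable objective (this is generic FGSM, not the ICAttack procedure):

```python
import numpy as np

def fgsm_perturb(x, grad_fn, eps=0.1):
    """One-step fast gradient sign perturbation within an L-infinity budget eps."""
    return x + eps * np.sign(grad_fn(x))

# Toy victim objective: squared distance to a target t; its gradient is 2(x - t).
t = np.array([0.0, 0.0])
grad = lambda x: 2.0 * (x - t)
x = np.array([1.0, -1.0])
x_adv = fgsm_perturb(x, grad, eps=0.1)  # [1.1, -1.1]: pushed away from t
```

Tracker attacks iterate variants of this step against the tracker's response map rather than a scalar loss, and, as the abstract describes, ICAttack additionally coordinates the perturbations of the RGB and thermal inputs.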
Citations: 0
Multiple channel access and power control for discount-average weighting criterion over multi-sensor and Markovian fading environments
IF 15.5 · Tier 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-07-01 · Epub Date: 2026-01-29 · DOI: 10.1016/j.inffus.2026.104191
Yunbo Song , Jianrong Zhao , Kangkai Zheng , Ticao Jiao
This paper investigates the joint design of multiple channel access and power control for multi-sensor remote estimation. Smart sensors with energy constraints transmit their local estimates over shared Markovian fading channels. A novel discount-average weighting criterion (DAWC) is introduced over the infinite horizon, which balances immediate and long-term transmission performance, unlike traditional criteria that focus on only one aspect. We formulate the co-design problem, including channel selection and power allocation, as a Markov decision process (MDP) under the DAWC. The existence of an ϵ-optimal policy is established for ergodic MDPs via a model-checking method, and a switch-like optimal transmission policy is derived from the set of randomized Markov strategies. Further, we prove the existence of an ϵ-s-optimal policy, an ultimately deterministic policy, for general MDPs. An elaborately devised algorithm generates optimal transmission decisions using a forward iterative approach. Finally, an example of turbofan engine speed regulation demonstrates the superiority of the preceding results.
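For readers unfamiliar with the MDP machinery, plain discounted value iteration on a toy MDP illustrates how optimal policies are computed (this sketch uses the standard discounted criterion only, not the paper's DAWC or its ϵ-optimal constructions):

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Discounted value iteration.

    P: (A, S, S) transition probabilities, R: (A, S) expected rewards.
    Returns the optimal value function V and the greedy policy.
    """
    V = np.zeros(P.shape[1])
    while True:
        Q = R + gamma * (P @ V)          # action values, shape (A, S)
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=0)
        V = V_new

# Two states, two actions: action 1 reaches the rewarding state 1 deterministically.
P = np.array([[[1.0, 0.0], [1.0, 0.0]],
              [[0.0, 1.0], [0.0, 1.0]]])
R = np.array([[0.0, 0.0],
              [1.0, 1.0]])
V, pi = value_iteration(P, R)            # V ~ [10, 10], pi = [1, 1]
```

With reward 1 per step under action 1 and gamma = 0.9, the fixed point is V(s) = 1/(1 - 0.9) = 10 in both states, matching the output above.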
Citations: 0
Large multimodal models for low-resource languages: A survey
IF 15.5 · Tier 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-07-01 · Epub Date: 2026-01-27 · DOI: 10.1016/j.inffus.2026.104189
Marian Lupaşcu, Ana-Cristina Rogoz, Mihai Sorin Stupariu, Radu Tudor Ionescu
In this survey, we systematically analyze techniques used to adapt large multimodal models (LMMs) for low-resource (LR) languages, examining approaches ranging from visual enhancement and data creation to cross-modal transfer and fusion strategies. Through a comprehensive analysis of 117 studies across 96 LR languages, we identify key patterns in how researchers tackle the challenges of limited data and computational resources. We categorize works into resource-oriented and method-oriented contributions, further dividing contributions into relevant sub-categories. We compare method-oriented contributions in terms of performance and efficiency, discussing benefits and limitations of representative studies. We find that visual information often serves as a crucial bridge for improving model performance in LR settings, though significant challenges remain in areas such as hallucination mitigation and computational efficiency. In summary, we provide researchers with a clear understanding of current approaches and remaining challenges in making LMMs more accessible to speakers of LR (understudied) languages. We complement our survey with an open-source repository available at: https://github.com/marianlupascu/LMM4LRL-Survey.
Citations: 0
Adaptive virtual anchors for efficient and stable clustering over large multi-view attributed graphs
IF 15.5 · Tier 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-07-01 · Epub Date: 2026-01-28 · DOI: 10.1016/j.inffus.2026.104190
Mengyao Li , Zhibang Yang , Xu Zhou , Joey Tianyi Zhou , Quanqing Xu , Chuanhui Yang , Kenli Li , Keqin Li
Multi-view attributed graphs (MVAG) are well known for their ability to model complex networks and relationships, providing diverse yet complementary information for finding a consensus partition suitable for all views. Abundant methods exist for clustering over multi-view attributed graphs. However, most of them are unsuitable for large-scale graphs due to their high complexity. Moreover, while existing anchor-based methods can effectively accelerate clustering, they mainly focus on either attribute information or graph structure during anchor selection, and some suffer from stability issues. Inspired by this, we propose the adaptive virtual anchor clustering method (AVAC) to boost clustering performance while keeping results stable. In particular, we first introduce adaptive virtual anchors for multi-view attributed graphs, which are learned and generated from the graphs adaptively. We then connect anchor learning and anchor graph construction closely and cyclically, so that virtual anchors are learned dynamically and capture the real data distribution and topology more accurately. Last but not least, we design a five-block coordinate descent method with proven convergence to further optimize the virtual anchors so that they are more representative of existing nodes. Extensive experiments on both real and synthetic datasets demonstrate the effectiveness, efficiency, and stability of our method. Compared to state-of-the-art approaches, the AVAC algorithm consistently attains stable results with a significant improvement in accuracy, and achieves a speedup of 1.8 times on public large-scale datasets. The source code is available at https://github.com/lmyfree/AVAC.
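The anchor idea can be illustrated with a generic anchor-graph construction: each sample is linked to its k nearest anchors with normalized similarity weights, yielding an n-by-m graph that is far cheaper than a full n-by-n affinity matrix. A minimal numpy sketch (AVAC's learned virtual anchors and its adaptive weighting are not reproduced here):

```python
import numpy as np

def anchor_graph(X, anchors, k=2):
    """Sparse anchor graph: each sample connects to its k nearest anchors
    with weights from an inverse-distance softmax; each row sums to 1."""
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)  # (n, m) sq. dists
    Z = np.zeros_like(d2)
    for i, row in enumerate(d2):
        idx = np.argsort(row)[:k]        # k nearest anchors for sample i
        w = np.exp(-row[idx])
        Z[i, idx] = w / w.sum()
    return Z

X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
anchors = np.array([[0.0, 0.0], [5.0, 5.0], [2.5, 2.5]])
Z = anchor_graph(X, anchors, k=2)        # (3, 3) row-stochastic anchor graph
```

Downstream, spectral or subspace clustering operates on Z (or Z Zᵀ) instead of the full graph, which is where the large-graph speedup comes from.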
Citations: 0
A novel knowledge distillation and hybrid explainability approach for phenology stage classification from multi-source time series
IF 15.5 CAS Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-07-01 Epub Date : 2026-01-16 DOI : 10.1016/j.inffus.2026.104158
Naeem Ullah, Andrés Manuel Chacón-Maldonado, Francisco Martínez-Álvarez, Ivanoe De Falco, Giovanna Sannino
Accurate phenological stage classification is crucial for addressing the global challenges to food security posed by climate change, water scarcity, and land degradation. It enables precision agriculture by optimizing key interventions such as irrigation, fertilization, and pest control. While deep learning offers powerful tools, existing methods face four key limitations: reliance on narrow features and models, limited long-term forecasting capability, computational inefficiency, and opaque, unvalidated explanations. To overcome these limitations, this paper presents a deep learning framework for phenology classification, utilizing multi-source time series data from satellite imagery, meteorological stations, and field observations. The approach emphasizes temporal consistency, spatial adaptability, computational efficiency, and explainability. A feature engineering pipeline extracts temporal dynamics via lag features, rolling statistics, Fourier transforms, and seasonal encodings. Feature selection combines incremental strategies with classical filter, wrapper, and embedded methods. Deep learning models across multiple paradigms (feedforward, recurrent, convolutional, and attention-based) are benchmarked under multi-horizon forecasting tasks. To reduce model complexity while preserving performance where possible, the framework employs knowledge distillation, transferring predictive knowledge from complex teacher models to compact and deployable student models. For model interpretability, a new Hybrid SHAP-Association Rule Explainability approach is proposed, integrating model-driven and data-driven explanations. Agreement between views is quantified using trust metrics: precision@k, coverage, and Jaccard similarity, with a retraining-based validation mechanism. Experiments on phenology data from Andalusia demonstrate high accuracy, strong generalizability, trustworthy explanations, and resource-efficient phenology monitoring in agricultural systems.
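The feature engineering pipeline described above (lag features, rolling statistics, Fourier/seasonal encodings) can be sketched generically. This is a hedged illustration, not the authors' pipeline: the column names, lag sets, window sizes, and the annual period are illustrative assumptions.

```python
import numpy as np
import pandas as pd

def make_temporal_features(s, lags=(1, 7, 30), windows=(7, 30), period=365.25):
    """Build lag, rolling-statistic, and Fourier seasonal features from a
    daily time series. A generic sketch of the kind of pipeline the paper
    describes; all names and defaults here are illustrative."""
    df = pd.DataFrame({"y": s})
    for lag in lags:                       # autoregressive lag features
        df[f"lag_{lag}"] = s.shift(lag)
    for w in windows:                      # rolling statistics
        df[f"roll_mean_{w}"] = s.rolling(w).mean()
        df[f"roll_std_{w}"] = s.rolling(w).std()
    t = np.arange(len(s))                  # Fourier seasonal encodings
    for k in (1, 2):                       # first two annual harmonics
        df[f"sin_{k}"] = np.sin(2 * np.pi * k * t / period)
        df[f"cos_{k}"] = np.cos(2 * np.pi * k * t / period)
    return df.dropna()                     # drop incomplete warm-up rows
```

Downstream, a feature selection stage (filter, wrapper, or embedded, as in the paper) would prune this matrix before model training.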
{"title":"A novel knowledge distillation and hybrid explainability approach for phenology stage classification from multi-source time series","authors":"Naeem Ullah ,&nbsp;Andrés Manuel Chacón-Maldonado ,&nbsp;Francisco Martínez-Álvarez ,&nbsp;Ivanoe De Falco ,&nbsp;Giovanna Sannino","doi":"10.1016/j.inffus.2026.104158","DOIUrl":"10.1016/j.inffus.2026.104158","url":null,"abstract":"<div><div>Accurate phenological stage classification is crucial for addressing global challenges to food security posed by climate change, water scarcity, and land degradation. It enables precision agriculture by optimizing key interventions such as irrigation, fertilization, and pest control. While deep learning offers powerful tools, existing methods face four key limitations: reliance on narrow features and models, limited long-term forecasting capability, computational inefficiency, and opaque, unvalidated explanations. To overcome these limitations, this paper presents a deep learning framework for phenology classification, utilizing multi-source time series data from satellite imagery, meteorological stations, and field observations. The approach emphasizes temporal consistency, spatial adaptability, computational efficiency, and explainability. A feature engineering pipeline extracts temporal dynamics via lag features, rolling statistics, Fourier transforms and seasonal encodings. Feature selection combines incremental strategies with classical filter, wrapper, and embedded methods. Deep learning models across multiple paradigms-feedforward, recurrent, convolutional, and attention-based-are benchmarked under multi-horizon forecasting tasks. To reduce model complexity while preserving performance where possible, the framework employs knowledge distillation, transferring predictive knowledge from complex teacher models to compact and deployable student models. 
For model interpretability, a new Hybrid SHAP-Association Rule Explainability approach is proposed, integrating model-driven and data-driven explanations. Agreement between views is quantified using trust metrics: precision@k, coverage, and Jaccard similarity, with a retraining-based validation mechanism. Experiments on phenology data from Andalusia demonstrate high accuracy, strong generalizability, trustworthy explanations and resource-efficient phenology monitoring in agricultural systems.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"131 ","pages":"Article 104158"},"PeriodicalIF":15.5,"publicationDate":"2026-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145995208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Lifting wavelet transform-guided network with histogram attention for liver segmentation in CT scans
IF 15.5 CAS Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-07-01 Epub Date : 2026-01-16 DOI : 10.1016/j.inffus.2026.104153
Huaxiang Liu, Wei Sun, Youyao Fu, Shiqing Zhang, Jie Jin, Jiangxiong Fang, Binliang Wang
Accurate liver segmentation in computed tomography (CT) scans is crucial for the diagnosis of hepatocellular carcinoma and for surgical planning; however, manual delineation is laborious and prone to operator variability. Existing deep learning methods frequently sacrifice precise boundary delineation when expanding receptive fields, or fail to leverage frequency-domain cues that encode global shape, while conventional attention mechanisms are less effective on low-contrast images. To address these challenges, we introduce LWT-Net, a novel network that couples a trainable lifting wavelet transform with a frequency-split histogram attention mechanism for liver segmentation. The transform is embedded within an encoder-decoder framework to hierarchically decompose features into low-frequency components that capture global structure and high-frequency bands that preserve edge and texture details. A complementary inverse lifting stage reconstructs high-resolution features while maintaining spatial consistency. The frequency-spatial fusion module, driven by histogram-based attention, performs histogram-guided feature reorganization across global and local bins while employing self-attention to capture long-range dependencies and prioritize anatomically significant regions. Comprehensive evaluations on the LiTS2017, WORD, and FLARE22 datasets confirm LWT-Net's superior performance, achieving mean Dice similarity coefficients of 95.96%, 97.15%, and 95.97%.
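The split/predict/update structure of the lifting scheme that LWT-Net makes trainable is compact enough to sketch in 1-D. With the fixed coefficients below this is the LeGall 5/3 lifting pair (with circular boundary handling); perfect reconstruction holds for any p and u, which is exactly what allows such coefficients to be learned inside a network. This is a generic illustration, not the paper's 2-D trainable variant.

```python
import numpy as np

def lifting_forward(x, p=0.5, u=0.25):
    """One lifting step: split into even/odd samples, predict the odd
    samples from their even neighbors (high-frequency detail), then
    update the even samples from the details (low-frequency approx)."""
    even, odd = x[0::2], x[1::2]
    detail = odd - p * (even + np.roll(even, -1))      # high-frequency band
    approx = even + u * (detail + np.roll(detail, 1))  # low-frequency band
    return approx, detail

def lifting_inverse(approx, detail, p=0.5, u=0.25):
    """Exact inverse: undo the update, then undo the prediction, then
    interleave — invertible for any choice of p and u."""
    even = approx - u * (detail + np.roll(detail, 1))
    odd = detail + p * (even + np.roll(even, -1))
    x = np.empty(even.size + odd.size)
    x[0::2], x[1::2] = even, odd
    return x
```

Applying the forward step recursively to the approximation band yields the hierarchical low/high-frequency decomposition the abstract describes.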
{"title":"Lifting wavelet transform-guided network with histogram attention for liver segmentation in CT scans","authors":"Huaxiang Liu ,&nbsp;Wei Sun ,&nbsp;Youyao Fu ,&nbsp;Shiqing Zhang ,&nbsp;Jie Jin ,&nbsp;Jiangxiong Fang ,&nbsp;Binliang Wang","doi":"10.1016/j.inffus.2026.104153","DOIUrl":"10.1016/j.inffus.2026.104153","url":null,"abstract":"<div><div>Accurate liver segmentation in computed tomography (CT) scans is crucial for the diagnosis of hepatocellular carcinoma and surgical planning; however, manual delineation is laborious and prone to operator variability. Existing deep learning methods frequently sacrifice precise boundary delineation when expanding receptive fields or fail to leverage frequency-domain cues that encode global shape, while conventional attention mechanisms are less effective in processing low-contrast images. To address these challenges, we introduce LWT-Net, a novel network guided by a trainable lifting wavelet transform, incorporating a frequency-split histogram attention mechanism to enhance liver segmentation. LWT-Net incorporates a trainable lifting wavelet transform within an encoder-decoder framework to hierarchically decompose features into low-frequency components that capture global structure and high-frequency bands that preserve edge and texture details. A complementary inverse lifting stage reconstructs high-resolution features while maintaining spatial consistency. The frequency-spatial fusion module, driven by a histogram-based attention mechanism, performs histogram-guided feature reorganization across global and local bins, while employing self-attention to capture long-range dependencies and prioritize anatomically significant regions. 
Comprehensive evaluations on the LiTS2017, WORD, and FLARE22 datasets confirm LWT-Net’s superior performance, achieving mean Dice similarity coefficients of 95.96%, 97.15%, and 95.97%.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"131 ","pages":"Article 104153"},"PeriodicalIF":15.5,"publicationDate":"2026-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145995209","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Dual-layer prompt ensembles: Leveraging system- and user-level instructions for robust LLM-based query expansion and rank fusion
IF 15.5 CAS Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-07-01 Epub Date : 2026-01-17 DOI : 10.1016/j.inffus.2026.104160
Minghan Li, Ercong Nie, Huiping Huang, Xinxuan Lv, Guodong Zhou
Large Language Models (LLMs) show strong potential for query expansion (QE), but their effectiveness is highly sensitive to prompt design. This paper investigates whether exploiting the system-user prompt distinction in chat-based LLMs improves QE, and how multiple expansions should be combined. We propose Dual-Layer Prompt Ensembles, which pair a behavioural system prompt with varied user prompts to generate diverse expansions, and aggregate their BM25-ranked lists using lightweight SU-RankFusion schemes. Experiments on six heterogeneous datasets show that dual-layer prompting consistently outperforms strong single-prompt baselines. For example, on Touche-2020 a dual-layer configuration improves nDCG@10 from 0.4177 (QE-CoT) to 0.4696, and SU-RankFusion further raises it to 0.4797. On Robust04 and DBPedia, SU-RankFusion improves nDCG@10 over BM25 by 24.7% and 25.5%, respectively, with similar gains on NFCorpus, FiQA, and TREC-COVID. These results demonstrate that system-user prompt ensembles are effective for QE, and that simple fusion transforms prompt-level diversity into stable retrieval improvements.
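The abstract does not detail the SU-RankFusion schemes, but the general pattern — aggregating several BM25-ranked lists, one per prompt variant, into a single ranking — can be illustrated with reciprocal rank fusion (RRF), a standard lightweight aggregator. The function name and the constant k=60 follow the common RRF formulation and are not taken from the paper.

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked doc-id lists (e.g. one BM25 run per prompt
    variant) by summing 1/(k + rank) per document; higher is better.
    k=60 is the conventional RRF damping constant."""
    scores = defaultdict(float)
    for docs in ranked_lists:
        for rank, doc in enumerate(docs, start=1):
            scores[doc] += 1.0 / (k + rank)
    # return doc ids ordered by fused score, best first
    return sorted(scores, key=scores.get, reverse=True)
```

Documents that rank highly across many prompt variants dominate the fused list, which is how fusion converts prompt-level diversity into a more stable final ranking.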
{"title":"Dual-layer prompt ensembles: Leveraging system- and user-level instructions for robust LLM-based query expansion and rank fusion","authors":"Minghan Li ,&nbsp;Ercong Nie ,&nbsp;Huiping Huang ,&nbsp;Xinxuan Lv ,&nbsp;Guodong Zhou","doi":"10.1016/j.inffus.2026.104160","DOIUrl":"10.1016/j.inffus.2026.104160","url":null,"abstract":"<div><div>Large Language Models (LLMs) show strong potential for query expansion (QE), but their effectiveness is highly sensitive to prompt design. This paper investigates whether exploiting the system-user prompt distinction in chat-based LLMs improves QE, and how multiple expansions should be combined. We propose Dual-Layer Prompt Ensembles, which pair a behavioural system prompt with varied user prompts to generate diverse expansions, and aggregate their BM25-ranked lists using lightweight SU-RankFusion schemes. Experiments on six heterogeneous datasets show that dual-layer prompting consistently outperforms strong single-prompt baselines. For example, on Touche-2020 a dual-layer configuration improves nDCG@10 from 0.4177 (QE-CoT) to 0.4696, and SU-RankFusion further raises it to 0.4797. On Robust04 and DBPedia, SU-RankFusion improves nDCG@10 over BM25 by 24.7% and 25.5%, respectively, with similar gains on NFCorpus, FiQA, and TREC-COVID. 
These results demonstrate that system-user prompt ensembles are effective for QE, and that simple fusion transforms prompt-level diversity into stable retrieval improvements.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"131 ","pages":"Article 104160"},"PeriodicalIF":15.5,"publicationDate":"2026-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145995213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0