
Information Fusion: Latest Publications

PGSC: A Gradient Sparsification Communication Optimization Criterion for Nonequilibrium Thermodynamics
IF 18.6 | CAS Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-27 | DOI: 10.1016/j.inffus.2026.104188
Wenlong Zhang, Ying Li, Hanhan Du, Yan Wei, Aiqing Fang
Gradient compression can reduce communication overhead. However, current static sparsity techniques may disturb gradient dynamics, resulting in unstable model convergence and reduced feature discriminative ability, whereas transmitting the complete gradient leads to high costs. To address this issue, inspired by nonequilibrium thermodynamics, this paper proposes a Physics-guided Gradient Sparsification Criterion (PGSC). Specifically, we formulate a continuous field equation based on the gradient magnitude distribution, deriving an adaptive decay rule for the sparsification threshold during the training phase. We then dynamically adjust the sparsification threshold according to this rule, effectively addressing the complexity of multimodal features and ensuring consistent information transmission. Our method achieves adaptive co-optimization of gradient compression and model accuracy by establishing a dynamic equilibrium mechanism between gradient dissipation and information entropy. This approach ensures stable convergence rates while preserving the gradient structure of multi-scale features. Extensive experiments on public datasets, including CIFAR-10, MNIST, and FLIR_ADAS_v2, demonstrate significant advantages over competitors such as TopK and quantization compression, while also reducing communication costs.
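The abstract leaves the exact decay rule to the paper, but the mechanics of threshold-based gradient sparsification are easy to picture. Below is a minimal sketch, assuming a simple exponential schedule as a stand-in for PGSC's physics-derived rule; `tau0` and `decay` are hypothetical parameters, not values from the paper.

```python
# Minimal sketch of gradient sparsification with a decaying threshold.
# The exponential schedule is an assumed stand-in for PGSC's rule.
import math

import torch

def sparsify(grad: torch.Tensor, step: int,
             tau0: float = 1e-2, decay: float = 1e-3) -> torch.Tensor:
    """Zero out entries whose magnitude falls below the current threshold."""
    tau = tau0 * math.exp(-decay * step)   # assumed decay schedule
    mask = grad.abs() >= tau
    # only the surviving entries (plus their indices) would be communicated
    return torch.where(mask, grad, torch.zeros_like(grad))

g = torch.randn(10_000) * 0.01
for step in (0, 1_000, 5_000):
    kept = sparsify(g, step)
    density = kept.count_nonzero().item() / g.numel()
    print(f"step {step}: transmitted density = {density:.3f}")
```

As the threshold decays over training, more gradient entries survive, which matches the intuition of trading early compression for late-stage fidelity.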
Citations: 0
SU-RMT: Toward Bridging Semantic Representation and Structural Detail Modeling for Medical Image Segmentation
IF 18.6 | CAS Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-26 | DOI: 10.1016/j.inffus.2026.104182
Peibo Song, Zihao Wang, Jinshuo Zhang, Shujun Fu, Yunfeng Zhang, Wei Wu, Fangxun Bao
Owing to anatomical heterogeneity and subtle textures in clinical scenarios, accurate medical image segmentation requires models that capture high-level semantics while preserving fine-grained structural details. However, existing U-shaped networks usually lack a unified perspective to reconcile semantic representation with structural detail. To this end, we present SU-RMT, a U-shaped network that embodies this unified perspective by redesigning the encoder, bottleneck, and skip connection. The encoder employs the Dynamic Spatial Attention (DySA) mechanism to capture global context with spatial priors. The bottleneck introduces a Hybrid Spectral Adaptive (HSA) module to transform abstract semantics into structure-aware features. The first skip connection incorporates a Frequency-Fused (F2) block to enhance boundary details without amplifying noise. Across several medical image segmentation tasks, SU-RMT demonstrates strong performance. The code is available at https://github.com/setsese/SURMTArchive.
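The F2 block's internals live in the linked repository; the sketch below only illustrates the general idea of boosting high-frequency (boundary) content of skip-connection features in the Fourier domain. The cutoff of 0.25 and the boost factor are illustrative assumptions, not the paper's design.

```python
# Illustrative frequency-domain fusion on a skip connection, in the
# spirit of the F2 block: amplify high-frequency content without
# touching the low-frequency base signal.
import torch

def frequency_fuse(feat: torch.Tensor, boost: float = 0.5) -> torch.Tensor:
    """feat: (B, C, H, W). Amplify high-frequency components via FFT."""
    spec = torch.fft.fft2(feat, norm="ortho")
    h, w = feat.shape[-2:]
    fy = torch.fft.fftfreq(h, device=feat.device).abs().view(-1, 1)
    fx = torch.fft.fftfreq(w, device=feat.device).abs().view(1, -1)
    highpass = ((fy + fx) > 0.25).float()       # crude high-frequency mask
    spec = spec * (1.0 + boost * highpass)      # boost boundary frequencies
    return torch.fft.ifft2(spec, norm="ortho").real

x = torch.randn(2, 8, 32, 32)
print(frequency_fuse(x).shape)  # torch.Size([2, 8, 32, 32])
```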
Citations: 0
MSTFDN: An EEG-fNIRS multimodal spatial-temporal fusion decoding network for personalized multi-task scenarios
IF 18.6 | CAS Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-24 | DOI: 10.1016/j.inffus.2026.104187
Peng Ding, Liyong Yin, Zhengxuan Zhou, Yuwei Su, Minqian Zhang, Yingwei Li, Xiaoli Li
{"title":"MSTFDN: An EEG-fNIRS multimodal spatial-temporal fusion decoding network for personalized multi-task scenarios","authors":"Peng Ding, Liyong Yin, Zhengxuan Zhou, Yuwei Su, Minqian Zhang, Yingwei Li, Xiaoli Li","doi":"10.1016/j.inffus.2026.104187","DOIUrl":"https://doi.org/10.1016/j.inffus.2026.104187","url":null,"abstract":"","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"271 1","pages":""},"PeriodicalIF":18.6,"publicationDate":"2026-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146033284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Controlled subspace fusion for language model continual learning
IF 15.5 | CAS Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-24 | DOI: 10.1016/j.inffus.2026.104184
Xingcan Bao, Jianzhou Feng, Yiru Huo, Huaxiao Qiu, Haoran Yu, Shenyuan Ren, Jiadong Ren
Large language models (LLMs) have demonstrated remarkable performance across diverse natural language processing tasks. However, they still face significant challenges in multi-task continual learning, particularly in dynamic environments where tasks evolve sequentially and resources are constrained. Existing approaches typically learn separate adapter modules for each task, leading to a linear increase in parameters as tasks accumulate and thus hindering scalability and deployment efficiency. In this paper, we propose Controlled Subspace Fusion (CSF), a rehearsal-free and task-agnostic continual learning framework for language models that integrates knowledge across tasks while preventing parameter explosion. CSF introduces a shared low-rank projection subspace to provide a unified representational foundation, thereby enhancing consistency and facilitating cross-task knowledge transfer. In addition, we design an incremental subspace fusion mechanism that adaptively merges new task adapters with previously fused representations, while suppressing redundant parameter growth. As a result, the framework achieves scalable and robust knowledge fusion across sequential tasks. We evaluate CSF on mainstream architectures, including LLaMA and T5, across model scales ranging from 220M to 13B parameters. Experimental results on continual learning benchmarks demonstrate that CSF not only achieves superior average accuracy and parameter efficiency compared to existing approaches, but also provides a scalable and deployment-friendly solution that supports efficient knowledge fusion.
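As a rough illustration of the shared-subspace idea, the sketch below gives every task adapter the same low-rank down-projection and merges incoming adapters into a single fused one, so the parameter count stays constant as tasks accumulate. The running-average merge is an assumed stand-in for CSF's fusion mechanism, which the paper describes in more elaborate terms.

```python
# Sketch: adapters sharing one low-rank projection subspace, merged
# incrementally so parameters do not grow with the number of tasks.
import torch

d, r = 512, 8
shared_down = torch.randn(d, r) / d**0.5   # shared subspace projection

def adapter_delta(up: torch.Tensor) -> torch.Tensor:
    """Low-rank weight update: (d, r) @ (r, d) -> (d, d)."""
    return shared_down @ up

fused_up = torch.zeros(r, d)               # fused task knowledge
for task_id in range(3):                   # tasks arrive sequentially
    new_up = torch.randn(r, d) * 0.01      # adapter trained on this task
    # incremental fusion: a running average keeps one adapter in memory
    fused_up = (task_id * fused_up + new_up) / (task_id + 1)

W = torch.randn(d, d)                      # frozen base weight
W_adapted = W + adapter_delta(fused_up)    # single merged adapter
print(W_adapted.shape)                     # torch.Size([512, 512])
```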
Citations: 0
IDFL: Incentive-driven federated learning with selfish clients
IF 15.5 | CAS Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-24 | DOI: 10.1016/j.inffus.2026.104185
Jin Xu, Hengrun Zhang, Huiqun Yu, Guisheng Fan
Heterogeneity challenges have long been discussed in Federated Learning (FL). Among these challenges, statistical heterogeneity, where non-independent and identically distributed (non-IID) data across clients severely impacts model convergence and performance, remains particularly problematic. While existing batch size optimization strategies effectively address system-level heterogeneity and resource constraints, they inadequately tackle statistical heterogeneity, often simply increasing batch sizes without theoretical justification. Such approaches overlook a critical convergence-generalization dilemma well-established in traditional machine learning: larger batch sizes accelerate convergence but may deteriorate generalization performance beyond critical thresholds, which is usually termed the “generalization gap”. To bridge this gap in FL, we propose a comprehensive framework with three key contributions. First, we establish a batch size optimization mechanism that balances convergence and generalization objectives through a penalty function, providing mathematically derived closed-form solutions for optimal batch sizes. Second, we design a Stackelberg game-based incentive mechanism that coordinates batch size assignments with resource contributions while ensuring fair reward allocation to maximize individual client utility (defined as the difference between rewards and costs). Third, we develop a two-step verification strategy that detects and mitigates free-riding behaviors while monitoring convergence patterns to terminate ineffective training processes. Extensive experiments on real-world datasets validate our approach, demonstrating significant improvements in both convergence performance and fairness compared to state-of-the-art algorithms. Ablation studies confirm the effectiveness of each component.
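The convergence-generalization dilemma the authors describe can be made concrete with a toy objective: a benefit that grows slowly with batch size minus a penalty that grows linearly, wrapped in a client utility of reward minus cost. The log-benefit, linear-penalty form and all constants below are assumptions for illustration; the paper derives closed-form optima from its own penalty function.

```python
# Toy rendering of batch-size choice under a convergence benefit,
# a generalization-gap penalty, and a client reward/cost trade-off.
import math

def objective(b: float, lam: float = 0.02) -> float:
    """Convergence benefit grows like log(b); the generalization gap
    is modeled as a penalty growing linearly in b."""
    return math.log(b) - lam * b

def utility(b: float, reward_rate: float = 1.0,
            cost_per_sample: float = 0.01) -> float:
    """Client utility = reward for contribution minus compute cost."""
    return reward_rate * objective(b) - cost_per_sample * b

best = max(range(8, 513, 8), key=utility)
print("best batch size under these toy parameters:", best)  # 32 here
```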
Citations: 0
All-weather Multi-Modality Image Fusion: Unified Framework and 100k Benchmark
IF 18.6 | CAS Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-23 | DOI: 10.1016/j.inffus.2026.104130
Xilai Li, Wuyang Liu, Xiaosong Li, Fuqiang Zhou, Huafeng Li, Feiping Nie
{"title":"All-weather Multi-Modality Image Fusion: Unified Framework and 100k Benchmark","authors":"Xilai Li, Wuyang Liu, Xiaosong Li, Fuqiang Zhou, Huafeng Li, Feiping Nie","doi":"10.1016/j.inffus.2026.104130","DOIUrl":"https://doi.org/10.1016/j.inffus.2026.104130","url":null,"abstract":"","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"30 1","pages":""},"PeriodicalIF":18.6,"publicationDate":"2026-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146033289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
GIAFormer: A Gradient-Infused Attention and Transformer for Pain Assessment with EDA-fNIRS Fusion
IF 15.5 | CAS Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-23 | DOI: 10.1016/j.inffus.2026.104173
Muhammad Umar Khan, Girija Chetty, Stefanos Gkikas, Manolis Tsiknakis, Roland Goecke, Raul Fernandez-Rojas
Reliable pain assessment is crucial in clinical practice, yet it remains a challenge because self-report-based assessment is inherently subjective. In this work, we introduce GIAFormer, a deep learning framework designed to provide an objective measure of multilevel pain by jointly analysing Electrodermal Activity (EDA) and functional Near-Infrared Spectroscopy (fNIRS) signals. By combining the complementary information from autonomic and cortical responses, the proposed model aims to capture both physiological and neural aspects of pain. GIAFormer integrates a Gradient-Infused Attention (GIA) module with a Transformer. The GIA module enhances signal representation by fusing the physiological signals with their temporal gradients and applying spatial attention to highlight inter-channel dependencies. The Transformer component follows, enabling the model to learn long-range temporal relationships. The framework was evaluated on the AI4Pain dataset comprising 65 subjects using a leave-one-subject-out validation protocol. GIAFormer achieved an accuracy of 90.51% and outperformed recent state-of-the-art approaches. These findings highlight the potential of gradient-aware attention and multimodal fusion for interpretable, non-invasive, and generalisable pain assessment suitable for clinical and real-world applications.
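The gradient-infusion step described above, fusing signals with their temporal gradients and then reweighting channels by attention, can be sketched in a few lines. The energy-based softmax attention below is an assumed simplification of the GIA module, and the downstream Transformer is omitted.

```python
# Sketch of "gradient infusion": concatenate a multichannel signal with
# its temporal gradient, then apply a simple channel attention map.
import torch

def gradient_infuse(x: torch.Tensor) -> torch.Tensor:
    """x: (B, C, T) multichannel time series (e.g. EDA + fNIRS channels)."""
    dx = torch.gradient(x, dim=-1)[0]                 # temporal gradient
    fused = torch.cat([x, dx], dim=1)                 # (B, 2C, T)
    # channel attention: softmax over per-channel signal energy
    attn = torch.softmax(fused.pow(2).mean(dim=-1), dim=1)   # (B, 2C)
    return fused * attn.unsqueeze(-1)                 # reweighted features

x = torch.randn(4, 16, 256)                           # batch of 4 recordings
print(gradient_infuse(x).shape)                       # torch.Size([4, 32, 256])
```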
Citations: 0
Grading-Inspired Complementary Enhancing for Multimodal Sentiment Analysis
IF 18.6 | CAS Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-23 | DOI: 10.1016/j.inffus.2026.104174
Zhijing Huang, Wen-Jue He, Baotian Hu, Zheng Zhang
Due to its strong capacity for integrating heterogeneous multi-source information, multimodal sentiment analysis (MSA) has achieved remarkable progress in affective computing. However, existing methods typically adopt symmetric fusion strategies that treat all modalities equally, overlooking the inherent performance disparities whereby some modalities excel at discriminative representation while others carry underutilized supportive cues. This limitation leaves cross-modal complementary correlations insufficiently explored. To address this issue, we propose a novel Grading-Inspired Complementary Enhancing (GCE) framework for MSA, one of the first attempts to conduct dynamic assessment for knowledge transfer in progressive multimodal fusion and cooperation. Specifically, based on cross-modal interaction, a task-aware grading mechanism categorizes modality-pair associations into dominant (high-performing) and supplementary (low-performing) branches according to their task performance. Accordingly, a relation filtering module selectively identifies trustworthy information from the dominant branch to enhance consistency exploration in supplementary modality pairs with minimized redundancy. Afterwards, a weight adaptation module dynamically adjusts the guiding weight of individual samples for adaptability and generalization. Extensive experiments conducted on three benchmark datasets demonstrate that our proposed GCE approach outperforms state-of-the-art MSA methods. Our code is available at https://github.com/hka-7/GCEforMSA.
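The grading step can be pictured as ranking modality-pair branches by a held-out task score and splitting them into dominant and supplementary sets. The median cut below is an illustrative assumption; the paper's grading is task-aware and learned.

```python
# Sketch of task-aware grading: split modality-pair branches into
# dominant (guide) and supplementary (guided) sets by validation score.
def grade_branches(scores: dict[str, float]) -> tuple[list[str], list[str]]:
    ordered = sorted(scores, key=scores.get, reverse=True)
    cut = len(ordered) // 2
    return ordered[:cut], ordered[cut:]   # (dominant, supplementary)

scores = {"text-audio": 0.81, "text-video": 0.78, "audio-video": 0.66}
dominant, supplementary = grade_branches(scores)
print(dominant, supplementary)  # dominant branches guide the others
```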
Citations: 0
Adversarial perturbation for RGB-T tracking via intra-modal excavation and cross-modal collusion
IF 15.5 | CAS Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-23 | DOI: 10.1016/j.inffus.2026.104183
Xinyu Xiang, Xuying Wu, Shengxiang Li, Qinglong Yan, Tong Zou, Hao Zhang, Jiayi Ma
Existing adversarial perturbation attacks for visual object trackers mainly focus on the RGB modality, while adversarial perturbations for RGB-T trackers remain unexplored. To address this gap, we propose an Intra-modal excavation and Cross-modal collusion adversarial perturbation attack algorithm (ICAttack) for RGB-T tracking. First, we establish a novel intra-modal adversarial clues excavation (ImAE) paradigm. By leveraging the unique distribution properties of each modality as a prior, we independently extract the attack cues of different modalities from the public noise space. Building upon this, we develop a cross-modal adversarial collusion (CmAC) strategy, which enables implicit and dynamic interaction between the adversarial tokens of the two modalities. This interaction facilitates negotiation and collaboration, achieving a synergistic attack gain for RGB-T trackers that surpasses the effect of a single-modality attack. The above process, from intra-modal excavation to cross-modal collusion, creates a progressive and systematic attack framework for RGB-T trackers. Besides, by introducing the spatial adversarial intensity control module and a precise response disruption loss, we further enhance both the stealthiness and the precision of our adversarial perturbations. The control module reduces attack strength in less critical areas to improve stealth. The disruption loss uses a small mask on the tracker's brightest semantic response region, concentrating the perturbation to interfere precisely with the tracker's target awareness. Extensive evaluations of attack performance on different state-of-the-art victim RGB-T trackers demonstrate the advantages of ICAttack in terms of the specificity and effectiveness of cross-modal attacks. Moreover, we offer a user-friendly interface to promote the practical deployment of adversarial perturbations. Our code is publicly available at https://github.com/Xinyu-Xiang/ICAttack.
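To give a flavor of the intensity-control and response-masking ideas described above, here is a toy FGSM-style step that confines the perturbation to the strongest 10% of a tracker response map. The mask rule, step size, and gradient are illustrative assumptions, not the ICAttack algorithm.

```python
# Toy masked perturbation step: concentrate the attack where the
# tracker's response map is strongest, leaving other regions untouched.
import torch

def masked_fgsm_step(img: torch.Tensor, grad: torch.Tensor,
                     response: torch.Tensor, eps: float = 4 / 255):
    """img, grad: (B, 3, H, W); response: (B, 1, H, W) response map."""
    thresh = response.flatten(1).quantile(0.9, dim=1).view(-1, 1, 1, 1)
    mask = (response >= thresh).float()        # top-10% response region
    return (img + eps * grad.sign() * mask).clamp(0, 1)

img = torch.rand(1, 3, 64, 64)
grad = torch.randn_like(img)                   # stand-in loss gradient
resp = torch.rand(1, 1, 64, 64)
print(masked_fgsm_step(img, grad, resp).shape)  # torch.Size([1, 3, 64, 64])
```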
Citations: 0
PromptMix: LLM-aided prompt learning for generalizing vision-language models
IF 15.5 | CAS Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-23 | DOI: 10.1016/j.inffus.2026.104186
Yongcai Chen, Qinghua Zhang, Xinfa Shi, Lei Zhang
With the development of deep learning techniques, intelligent engineering tasks are entering real-world application. However, performance in real conditions often degrades owing to scarce data or subtle, easily confused patterns. Although vision-language models with prompt learning provide a way to adapt without retraining the backbone, these approaches still suffer from overfitting under low-data regimes or from the poor expressive ability of prompts. To address these challenges, we propose a novel framework, PromptMix, that jointly considers semantic prompt learning, multimodal information fusion, and the alignment between pre-trained and domain-specific data. Specifically, PromptMix integrates three key components: (1) a Modality-Agnostic Shared Representation module to construct a shared latent space that mitigates the distribution discrepancies between pre-trained and target data, (2) an LLM-Aided Prompt Evolution mechanism to semantically enrich and iteratively refine learnable context prompts, and (3) a Cross-Attentive Adapter to enhance multimodal information fusion and robustness under low-sample conditions. Experiments on seven datasets, including six public benchmarks and one custom industrial dataset, demonstrate that PromptMix effectively enhances vision-language model adaptability, improves semantic representations, and achieves robust generalization under both base-to-novel and few-shot learning scenarios, delivering superior performance in engineering applications with limited labeled data.
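PromptMix builds on learnable context prompts; a minimal CoOp-style sketch is shown below, with prompt length, embedding size, and class count chosen arbitrarily for illustration. The LLM-aided evolution and cross-attentive adapter are beyond this sketch.

```python
# Sketch of learnable context prompts prepended to per-class embeddings,
# the CoOp-style mechanism that prompt-learning frameworks build on.
import torch
import torch.nn as nn

class ContextPrompt(nn.Module):
    def __init__(self, n_ctx: int = 4, dim: int = 512, n_classes: int = 10):
        super().__init__()
        self.ctx = nn.Parameter(torch.randn(n_ctx, dim) * 0.02)       # learnable context
        self.cls_emb = nn.Parameter(torch.randn(n_classes, 1, dim) * 0.02)

    def forward(self) -> torch.Tensor:
        """Return per-class prompt sequences: (n_classes, n_ctx + 1, dim)."""
        n_classes = self.cls_emb.shape[0]
        ctx = self.ctx.unsqueeze(0).expand(n_classes, -1, -1)
        return torch.cat([ctx, self.cls_emb], dim=1)

prompts = ContextPrompt()()
print(prompts.shape)  # torch.Size([10, 5, 512])
```

In training, only `ctx` (and here `cls_emb`) would receive gradients while the vision-language backbone stays frozen.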
Citations: 0