
Information Fusion: Latest Articles

Lifting wavelet transform-guided network with histogram attention for liver segmentation in CT scans
IF 15.5 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-16 DOI: 10.1016/j.inffus.2026.104153
Huaxiang Liu, Wei Sun, Youyao Fu, Shiqing Zhang, Jie Jin, Jiangxiong Fang, Binliang Wang
Accurate liver segmentation in computed tomography (CT) scans is crucial for the diagnosis of hepatocellular carcinoma and surgical planning; however, manual delineation is laborious and prone to operator variability. Existing deep learning methods frequently sacrifice precise boundary delineation when expanding receptive fields or fail to leverage frequency-domain cues that encode global shape, while conventional attention mechanisms are less effective in processing low-contrast images. To address these challenges, we introduce LWT-Net, a novel network guided by a trainable lifting wavelet transform, incorporating a frequency-split histogram attention mechanism to enhance liver segmentation. LWT-Net incorporates a trainable lifting wavelet transform within an encoder-decoder framework to hierarchically decompose features into low-frequency components that capture global structure and high-frequency bands that preserve edge and texture details. A complementary inverse lifting stage reconstructs high-resolution features while maintaining spatial consistency. The frequency-spatial fusion module, driven by a histogram-based attention mechanism, performs histogram-guided feature reorganization across global and local bins, while employing self-attention to capture long-range dependencies and prioritize anatomically significant regions. Comprehensive evaluations on the LiTS2017, WORD, and FLARE22 datasets confirm LWT-Net’s superior performance, achieving mean Dice similarity coefficients of 95.96%, 97.15%, and 95.97%.
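The lifting scheme that the abstract builds on can be illustrated with a minimal one-dimensional sketch. It is not the authors' LWT-Net layer: the function names and the scalar coefficients `p` and `u` are assumptions for illustration, and in a trainable version they would be learned operators rather than fixed numbers. With `p = 1.0` and `u = 0.5` the step reduces to the Haar wavelet, and the inverse recovers the input for any choice of `p` and `u`.

```python
from typing import List, Tuple


def lifting_forward(x: List[float], p: float = 1.0, u: float = 0.5) -> Tuple[List[float], List[float]]:
    """One lifting step (split, predict, update) on an even-length signal.

    p and u are illustrative scalars; a trainable transform would learn them.
    """
    if len(x) % 2 != 0:
        raise ValueError("signal length must be even")
    even, odd = x[0::2], x[1::2]
    # Predict: the high-frequency detail is what the even sample fails to predict.
    detail = [o - p * e for e, o in zip(even, odd)]
    # Update: the low-frequency approximation is the even sample corrected by the detail.
    approx = [e + u * d for e, d in zip(even, detail)]
    return approx, detail


def lifting_inverse(approx: List[float], detail: List[float], p: float = 1.0, u: float = 0.5) -> List[float]:
    """Undo the update, then the predict step, then interleave the samples."""
    even = [a - u * d for a, d in zip(approx, detail)]
    odd = [d + p * e for e, d in zip(even, detail)]
    x = [0.0] * (2 * len(even))
    x[0::2] = even
    x[1::2] = odd
    return x


if __name__ == "__main__":
    signal = [2.0, 4.0, 6.0, 6.0, 1.0, 3.0, 5.0, 9.0]
    low, high = lifting_forward(signal)
    print("low-frequency :", low)
    print("high-frequency:", high)
    print("reconstructed :", lifting_inverse(low, high))
```

Because every step is undone exactly by its mirror image, reconstruction does not depend on the particular predict and update coefficients, which is what makes the scheme a natural candidate for learned parameters.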
Information Fusion, Volume 131, Article 104153
Citations: 0
A novel knowledge distillation and hybrid explainability approach for phenology stage classification from multi-source time series
IF 15.5 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-16 DOI: 10.1016/j.inffus.2026.104158
Naeem Ullah, Andrés Manuel Chacón-Maldonado, Francisco Martínez-Álvarez, Ivanoe De Falco, Giovanna Sannino
Accurate phenological stage classification is crucial for addressing global challenges to food security posed by climate change, water scarcity, and land degradation. It enables precision agriculture by optimizing key interventions such as irrigation, fertilization, and pest control. While deep learning offers powerful tools, existing methods face four key limitations: reliance on narrow features and models, limited long-term forecasting capability, computational inefficiency, and opaque, unvalidated explanations. To overcome these limitations, this paper presents a deep learning framework for phenology classification, utilizing multi-source time series data from satellite imagery, meteorological stations, and field observations. The approach emphasizes temporal consistency, spatial adaptability, computational efficiency, and explainability. A feature engineering pipeline extracts temporal dynamics via lag features, rolling statistics, Fourier transforms, and seasonal encodings. Feature selection combines incremental strategies with classical filter, wrapper, and embedded methods. Deep learning models across multiple paradigms (feedforward, recurrent, convolutional, and attention-based) are benchmarked under multi-horizon forecasting tasks. To reduce model complexity while preserving performance where possible, the framework employs knowledge distillation, transferring predictive knowledge from complex teacher models to compact and deployable student models. For model interpretability, a new Hybrid SHAP-Association Rule Explainability approach is proposed, integrating model-driven and data-driven explanations. Agreement between views is quantified using trust metrics: precision@k, coverage, and Jaccard similarity, with a retraining-based validation mechanism. Experiments on phenology data from Andalusia demonstrate high accuracy, strong generalizability, trustworthy explanations, and resource-efficient phenology monitoring in agricultural systems.
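The agreement metrics named in the abstract can be sketched as follows. The abstract does not give their exact definitions, so the ones below are common textbook forms and should be read as assumptions: a model-driven view supplies a ranked feature list (for example, by mean absolute SHAP value) and a data-driven view supplies the set of features appearing in mined association rules. All feature names are hypothetical.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], reference: Set[str], k: int) -> float:
    """Share of the top-k ranked features that also appear in the reference set."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = list(ranked)[:k]
    return sum(1 for feature in top_k if feature in reference) / k


def coverage(ranked: Sequence[str], reference: Set[str]) -> float:
    """Share of reference features that appear anywhere in the ranked list."""
    if not reference:
        return 1.0  # convention: an empty reference set is trivially covered
    ranked_set = set(ranked)
    return sum(1 for feature in reference if feature in ranked_set) / len(reference)


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Intersection over union of two feature sets."""
    union = a | b
    if not union:
        return 1.0  # convention: two empty sets are identical
    return len(a & b) / len(union)


if __name__ == "__main__":
    # Hypothetical feature names, for illustration only.
    shap_ranked = ["ndvi_lag1", "temp_roll7", "doy_sin", "rain_lag3"]
    rule_features = {"ndvi_lag1", "doy_sin", "humidity"}
    print(precision_at_k(shap_ranked, rule_features, k=3))
    print(coverage(shap_ranked, rule_features))
    print(jaccard(set(shap_ranked[:3]), rule_features))
```

High values on all three suggest that the two explanation views point at the same drivers; the retraining-based validation mentioned in the abstract is a separate step and is not covered by this sketch.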
Information Fusion, Volume 131, Article 104158
Citations: 0
Fusion of quantum computing with smart agriculture: A systematic review of methods, implementation, applications, and challenges
IF 15.5 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-16 DOI: 10.1016/j.inffus.2026.104159
Sumit Kumar, Shashank Sheshar Singh, Gourav Bathla, Swati Sharma, Manisha Panjeta
The growing global population and the severity of environmental issues are driving the agriculture sector to adopt innovative technological advances for sustainable food production. Classical computing approaches frequently struggle with the volume and complexity of agricultural data when performing tasks such as crop yield prediction, disease detection, soil analysis, and weather forecasting. This Systematic Literature Review (SLR) provides an in-depth analysis of the evolving significance of quantum computing in smart agriculture. Quantum algorithms have the potential to reduce computational complexity and create novel data representation methods for high-dimensional challenges by leveraging quantum mechanics principles such as superposition and entanglement. This paper employs a structured research methodology based on eight specific research questions to comprehensively investigate over 100 peer-reviewed studies on quantum computing and smart agriculture published between 2012 and 2025. It demonstrates the effectiveness of Quantum Machine Learning (QML), quantum optimization, and hybrid quantum-classical models in various agricultural applications. The survey examines real-world implementations and compares existing quantum initiatives to classical benchmarks for the classification and prediction tasks. The presented work identifies challenges and limitations of current quantum approaches. The paper outlines directions for future work, including the accessibility of quantum hardware and the development of domain-specific algorithms. To the best of our knowledge, this is the first research question-driven SLR that provides an in-depth analysis of how quantum computing can be applied in agricultural applications.
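For readers unfamiliar with the two principles the abstract names, superposition and entanglement, a tiny state-vector sketch may help. It is a generic textbook example, not taken from the review: a two-qubit register is stored as four amplitudes ordered |00>, |01>, |10>, |11>, and the helper names are chosen here for illustration.

```python
import math
from typing import List

State = List[float]  # amplitudes for |00>, |01>, |10>, |11> (real-valued in this sketch)


def hadamard_on_first(state: State) -> State:
    """Apply a Hadamard gate to the first qubit, creating a superposition."""
    s = 1.0 / math.sqrt(2.0)
    a00, a01, a10, a11 = state
    return [s * (a00 + a10), s * (a01 + a11), s * (a00 - a10), s * (a01 - a11)]


def cnot_first_controls_second(state: State) -> State:
    """Flip the second qubit when the first is 1, i.e. swap |10> and |11>."""
    a00, a01, a10, a11 = state
    return [a00, a01, a11, a10]


def probabilities(state: State) -> List[float]:
    """Measurement probabilities are the squared magnitudes of the amplitudes."""
    return [abs(a) ** 2 for a in state]


if __name__ == "__main__":
    start = [1.0, 0.0, 0.0, 0.0]          # |00>
    superposed = hadamard_on_first(start)  # (|00> + |10>) / sqrt(2)
    bell = cnot_first_controls_second(superposed)  # (|00> + |11>) / sqrt(2)
    print(probabilities(bell))
```

In the final state only the outcomes 00 and 11 carry probability, so measuring one qubit fixes the other; that correlation is the entanglement that quantum machine learning and optimization methods try to exploit.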
Information Fusion, Volume 131, Article 104159
Citations: 0
Negative can be positive: A stable and noise-resistant complementary contrastive learning for cross-modal matching
IF 15.5 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-16 DOI: 10.1016/j.inffus.2026.104156
Fangming Zhong, Xinyu He, Haiquan Yu, Xiu Liu, Suhua Zhang
Cross-modal matching with noisy correspondence has drawn considerable interest recently, because mismatched pairs are inevitably introduced when collecting data from the Internet. Training on such noisy data often leads to severe performance degradation, as conventional methods tend to overfit rapidly to mismatched pairs. Most existing methods focus on predicting more reliable soft correspondence, generating higher weights for the pairs that are more likely to be correct. However, two limitations remain: (1) they ignore the informative signals embedded in the negative pairs, and (2) they are unstable because of their sensitivity to the noise ratio. To address these issues, we explicitly take the negatives into account and propose a stable and noise-resistant complementary learning method, named Dual Contrastive Learning (DCL), for cross-modal matching with noisy correspondence. DCL leverages both positive and negative pairs to improve robustness. With complementary contrastive learning, the negative pairs also contribute positively to model optimization. Specifically, to fully explore the potential of mismatched data, we first partition the training data into clean and noisy subsets based on the memorization effect of deep neural networks. Then, we employ vanilla contrastive learning for positive matched pairs in the clean subset. For negative pairs, including the noisy subset, complementary contrastive learning is adopted. In doing so, whatever the noise ratio, the proposed method robustly balances positive and negative information. Extensive experiments indicate that DCL significantly outperforms the state-of-the-art methods and exhibits remarkable stability with an extremely low variance of R@1. Specifically, the R@1 scores of our DCL are 7% and 9.1% higher than NPC on image-to-text and text-to-image, respectively. The source code is released at https://github.com/hxy2969/dcl.
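The two-part objective described above can be sketched roughly as follows. This is not the released DCL code: the loss forms are assumptions (an InfoNCE-style term for clean positives and a complementary-label term of the form -log(1 - p) for negatives), the fixed loss threshold is a simple stand-in for whatever partitioning the authors use, and nominal pairs flagged as noisy are simply left out of the positive term.

```python
import math
from typing import List, Sequence, Tuple


def softmax(row: Sequence[float], tau: float) -> List[float]:
    """Temperature-scaled softmax, shifted by the row maximum for stability."""
    m = max(row)
    exps = [math.exp((v - m) / tau) for v in row]
    z = sum(exps)
    return [e / z for e in exps]


def split_clean_noisy(per_pair_loss: Sequence[float], threshold: float) -> Tuple[List[int], List[int]]:
    """Small-loss heuristic (memorization effect): low-loss pairs are treated as clean."""
    clean = [i for i, loss in enumerate(per_pair_loss) if loss <= threshold]
    noisy = [i for i, loss in enumerate(per_pair_loss) if loss > threshold]
    return clean, noisy


def dual_contrastive_loss(sim: Sequence[Sequence[float]], clean: Sequence[int],
                          tau: float = 0.1, eps: float = 1e-12) -> float:
    """sim[i][j] is the similarity of image i and text j; (i, i) is the nominal pair."""
    clean_set = set(clean)
    pos_terms: List[float] = []
    neg_terms: List[float] = []
    for i, row in enumerate(sim):
        p = softmax(row, tau)
        for j in range(len(row)):
            if i == j:
                if i in clean_set:
                    # Vanilla contrastive term: pull the clean matched pair together.
                    pos_terms.append(-math.log(max(p[j], eps)))
            else:
                # Complementary term: learn that image i does NOT match text j.
                neg_terms.append(-math.log(max(1.0 - p[j], eps)))
    pos = sum(pos_terms) / len(pos_terms) if pos_terms else 0.0
    neg = sum(neg_terms) / len(neg_terms) if neg_terms else 0.0
    return pos + neg


if __name__ == "__main__":
    sim = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.2], [0.0, 0.2, 0.9]]
    clean, noisy = split_clean_noisy([0.1, 2.0, 0.2], threshold=1.0)
    print(clean, noisy, dual_contrastive_loss(sim, clean))
```

The negative term stays informative even when a nominal pair is wrong, since "this image does not match that text" is almost always a correct statement; this is one plausible reading of why negatives can stabilize training under high noise ratios.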
Information Fusion, Volume 130, Article 104156
Citations: 0
MulMoSenT: Multimodal sentiment analysis for a low-resource language using textual-visual cross-attention and fusion
IF 15.5 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-15 DOI: 10.1016/j.inffus.2026.104129
Sadia Afroze, Md. Rajib Hossain, Mohammed Moshiul Hoque, Nazmul Siddique
The widespread availability of the Internet and the growing use of smart devices have fueled the rapid expansion of multimodal (image-text) sentiment analysis (MSA), a burgeoning research field. This growth is driven by the massive volume of image-text data generated by these technologies. However, MSA faces significant challenges, notably the misalignment between images and text, where an image may carry multiple interpretations or contradict its paired text. In addition, short textual content often lacks sufficient context, complicating sentiment prediction. These issues are particularly acute in low-resource languages, where annotated image-text corpora are scarce, and Vision-Language Models (VLMs) and Large Language Models (LLMs) exhibit limited performance. This research introduces MulMoSenT, a multimodal image-text sentiment analysis system tailored to tackle these challenges for low-resource languages. The development of MulMoSenT unfolds across four key phases: corpus development, baseline model evaluation and selection, hyperparameter adaptation, and model fine-tuning and inference. The proposed MulMoSenT model achieves a peak accuracy of 84.90%, surpassing all baseline models. It delivers a 37.83% improvement over VLMs, a 35.28% gain over image-only models, and a 0.71% enhancement over text-only models. Both the dataset and the solution are publicly accessible at: https://github.com/sadia-afroze/MulMoSenT.
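The textual-visual cross-attention named in the title can be illustrated with a minimal sketch. It is not the MulMoSenT architecture: learned query, key, and value projections are omitted (identity projections are assumed), the fusion step is plain concatenation, and all names and toy vectors are hypothetical. Text token embeddings act as queries over image patch embeddings.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over one row of attention scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def cross_attention(text: Matrix, image: Matrix) -> Tuple[Matrix, Matrix]:
    """Scaled dot-product attention with text tokens as queries and image patches as keys/values."""
    d = len(text[0])
    attended: Matrix = []
    weights: Matrix = []
    for query in text:
        scores = [sum(q * k for q, k in zip(query, patch)) / math.sqrt(d) for patch in image]
        w = softmax(scores)
        # Visual context for this token: attention-weighted sum of patch embeddings.
        context = [sum(w[p] * image[p][c] for p in range(len(image))) for c in range(d)]
        attended.append(context)
        weights.append(w)
    return attended, weights


def fuse(text: Matrix, attended: Matrix) -> Matrix:
    """Concatenate each text token with its attended visual context."""
    return [token + context for token, context in zip(text, attended)]


if __name__ == "__main__":
    text_tokens = [[1.0, 0.0], [0.0, 1.0]]                # 2 tokens, dimension 2
    image_patches = [[2.0, 0.0], [0.0, 2.0], [0.5, 0.5]]  # 3 patches, dimension 2
    context, attn = cross_attention(text_tokens, image_patches)
    print(attn)
    print(fuse(text_tokens, context))
```

Each token's attention row shows which patches it relies on, which is one way to inspect cases where the image and its caption disagree.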
Information Fusion, Volume 131, Article 104129
Citations: 0
ExInCOACH: Strategic exploration meets interactive tutoring for context-aware game onboarding
IF 15.5 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-14 DOI: 10.1016/j.inffus.2026.104151
Rui Hua, Zhaoyu Huang, Jinhao Lu, Yakun Li, Na Zhao
Traditional game tutorials often fail to deliver real-time contextual guidance, providing static instructions disconnected from dynamic gameplay states. This limitation stems from their inability to interpret evolving game environments and generate high-quality decisions during live player interactions. We present ExInCOACH, a hybrid framework that synergizes exploratory reinforcement learning (RL) with interactive large language models (LLMs) to enable state-aware adaptive tutoring. Our framework first employs deep RL to discover strategic patterns via self-play, constructing a Q-function. During player onboarding, LLMs map the Q-values of currently legal actions and their usage conditions into natural language rule explanations and strategic advice by analyzing live game states and player decisions.
Evaluations in Dou Di Zhu (a turn-based card game) reveal that learners using ExInCOACH experienced intuitive strategy internalization: all participants reported grasping advanced tactics faster than through rule-based tutorials, while most players highly valued the real-time contextual feedback. A comparative study demonstrated that players trained with ExInCOACH achieved a 70% win rate (14 wins/20 games) against those onboarded via traditional methods, as they benefited from adaptive guidance that evolved with their skill progression. To further validate the framework’s generalizability, evaluations were also conducted in StarCraft II, a high-complexity real-time strategy (RTS) game. In 2v2 cooperative battles, teams trained with ExInCOACH achieved a 66.7% win rate against teams assisted by Vision LLMs (VLLMs) and an impressive 100% win rate against teams relying on traditional static game wikis for learning. Cognitive load assessments indicated that ExInCOACH significantly reduced players’ mental burden and frustration in complex scenarios involving real-time decision-making and multi-unit collaboration, while also outperforming traditional methods in information absorption efficiency and tactical adaptability. This work proposes a game tutorial design paradigm based on RL model exploration and LLM rule interpretation, making AI-generated strategies accessible through natural language interaction tailored to individual learning contexts.
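The hand-off from the Q-function to the language model can be sketched as a prompt-construction step. This is an illustration under assumptions rather than the authors' pipeline: the action names, the prompt wording, and both helper functions are hypothetical, and the sketch stops at building the prompt text without calling any LLM.

```python
from typing import Dict, List, Tuple


def rank_legal_actions(q_values: Dict[str, float], legal: List[str],
                       top_k: int = 3) -> List[Tuple[str, float]]:
    """Keep only currently legal actions and sort them by Q-value, best first."""
    scored = [(action, q_values[action]) for action in legal if action in q_values]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def build_tutor_prompt(state_summary: str, player_action: str,
                       ranked: List[Tuple[str, float]]) -> str:
    """Turn the ranked actions into a natural-language request for an explanation."""
    lines = [
        "You are a game tutor. Explain the advice in plain language.",
        f"Current state: {state_summary}",
        f"The player chose: {player_action}",
        "Legal actions ranked by estimated value (higher is better):",
    ]
    for rank, (action, q) in enumerate(ranked, start=1):
        lines.append(f"{rank}. {action} (Q = {q:+.2f})")
    lines.append("Compare the player's choice with the top option and explain the rule involved.")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical Q-values for a card-game turn; 'play_rocket' is not legal this turn.
    q = {"pass": -0.2, "play_pair_7": 0.4, "play_bomb": 0.1, "play_rocket": 0.9}
    legal_now = ["pass", "play_pair_7", "play_bomb"]
    ranked = rank_legal_actions(q, legal_now, top_k=2)
    print(build_tutor_prompt("Landlord has 5 cards left", "pass", ranked))
```

Filtering by legality before ranking keeps the tutor from recommending a high-value move that the rules do not currently allow.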
Information Fusion, Volume 131, Article 104151
Citations: 0
GeoCraft: A Diffusion Model-based 3D Reconstruction Method driven by image and point cloud fusion
IF 15.5 | Zone 1 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-13 | DOI: 10.1016/j.inffus.2026.104149
Weixuan Ma, Yamin Li, Chujin Liu, Hao Zhang, Jie Li, Kansong Chen, Weixuan Gao
With the rapid development of technologies like virtual reality (VR), autonomous driving, and digital twins, the demand for high-precision and realistic multimodal 3D reconstruction has surged. This technology has become a core research focus in computer vision and graphics due to its ability to integrate multi-source data, such as 2D images and point clouds. However, existing methods face challenges such as geometric inconsistency in single-view reconstruction, poor point cloud-to-mesh conversion, and insufficient multimodal feature fusion, limiting their practical application. To address these issues, this paper proposes GeoCraft, a multimodal 3D reconstruction method that generates high-precision 3D models from 2D images through three collaborative stages: Diff2DPoint, Point2DMesh, and Vision3DGen. Specifically, Diff2DPoint generates an initial point cloud with geometric alignment using a diffusion model and projection feature fusion; Point2DMesh converts the point cloud into a high-quality mesh using an autoregressive decoder-only Transformer and Direct Preference Optimization (DPO); Vision3DGen creates high-fidelity 3D objects through multimodal feature alignment. Experiments on the Google Scanned Objects (GSO) and Pix3D datasets show that GeoCraft excels in key metrics. On the GSO dataset, its CMMD is 2.810 and FIDCLIP is 26.420; on Pix3D, CMMD is 3.020 and FIDCLIP is 27.030. GeoCraft significantly outperforms existing 3D reconstruction methods and also demonstrates advantages in computational efficiency, effectively solving key challenges in 3D reconstruction. The code is available at https://github.com/weixuanma/GeoCraft.
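The abstract names Direct Preference Optimization (DPO) as part of Point2DMesh but does not say how it is applied. As a rough illustration only, the sketch below computes the standard DPO loss for a single preference pair from sequence log-probabilities. The function name, the argument names, and the default `beta=0.1` are assumptions made for this example and are not taken from the GeoCraft code.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    All names here are hypothetical; the inputs are sequence log-probabilities
    under the trained policy and under a frozen reference model.
    """
    # How much more the policy prefers the chosen sample than the reference does.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    z = beta * margin
    # -log(sigmoid(z)) written in a numerically stable form.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


# A policy identical to the reference gives log(2); preferring the chosen
# sample more strongly than the reference does lowers the loss.
baseline = dpo_loss(0.0, 0.0, 0.0, 0.0)
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

In a mesh-generation setting the "chosen" and "rejected" samples would presumably be better and worse meshes for the same input, but that pairing is an assumption here.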
Information Fusion, Volume 131, Article 104149
Citations: 0
GC-Fed: Gradient centralized federated learning with partial client participation
IF 15.5 | Zone 1 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-13 | DOI: 10.1016/j.inffus.2026.104148
Jungwon Seo, Ferhat Ozgur Catak, Chunming Rong, Kibeom Hong, Minhoe Kim
Federated Learning (FL) enables privacy-preserving multi-source information fusion (MSIF) but suffers from client drift in highly heterogeneous data settings. Many existing approaches mitigate drift by providing clients with common reference points, typically derived from past information, to align objectives or gradient directions. However, under severe partial participation, such history-dependent references may become unreliable, as the set of client data distributions participating in each round can vary drastically. To overcome this limitation, we propose a method that mitigates client drift without relying on past information by constraining the update space through Gradient Centralization (GC). Specifically, we introduce Local GC and Global GC, which apply GC at the local and global update stages, respectively, and further present GC-Fed, a hybrid formulation that generalizes both. Theoretical analysis and extensive experiments on benchmark FL tasks demonstrate that GC-Fed effectively alleviates client drift and achieves up to 20% accuracy improvement under data-heterogeneous and partial-participation conditions.
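To make the idea concrete, here is a minimal sketch of Gradient Centralization in its usual form: the mean of each output unit's gradient slice is subtracted so that every slice sums to zero. The two-client setup, the equally weighted averaging, and the helper names `centralize` and `average` are assumptions made for illustration. The abstract does not spell out how GC-Fed combines the local and global variants, so this is not the authors' algorithm.

```python
from statistics import fmean


def centralize(grad):
    """Subtract each row's mean so every row of the result sums to zero.

    `grad` is a 2-D gradient given as a list of rows, one row per output unit.
    """
    centered = []
    for row in grad:
        mu = fmean(row)
        centered.append([g - mu for g in row])
    return centered


def average(updates):
    """Element-wise, equally weighted mean of same-shaped client updates."""
    n = len(updates)
    rows, cols = len(updates[0]), len(updates[0][0])
    return [[sum(u[i][j] for u in updates) / n for j in range(cols)]
            for i in range(rows)]


# Toy gradients from two hypothetical clients (2 output units, 3 inputs each).
client_grads = [
    [[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]],
    [[0.0, 2.0, 4.0], [1.0, 3.0, 5.0]],
]

# Local GC (assumed form): each client centralizes its own gradient.
local_gc = [centralize(g) for g in client_grads]

# Global GC (assumed form): the server centralizes the aggregated update.
global_gc = centralize(average(client_grads))
```

Because the projection needs only the current gradient, it uses no information from earlier rounds, which is the property the abstract emphasizes.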
Information Fusion, Volume 131, Article 104148
Citations: 0
SymUnet-DynCFC: Multimodal MRI fusion for robust cartilage segmentation and clinically confirmed moderate-to-severe KOA diagnosis
IF 15.5 | Zone 1 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-12 | DOI: 10.1016/j.inffus.2026.104145
Li Li, Jianbing Ma, Beiji Zou, Hao Xu, Shenghui Liao, Wenyi Xiong, Liqiang Zhi
Knee osteoarthritis (KOA) is a globally prevalent degenerative joint disorder. A central challenge in its automated diagnosis is the efficient fusion of multimodal MRI data. This fusion aims to enhance the accuracy and generalizability of clinical cartilage segmentation, while simultaneously minimizing healthcare resource consumption. Therefore, this study introduces dynamic confidence fuzzy control (DynCFC) within the symmetric U-Net architecture (SymUnet), referred to as SymUnet-DynCFC, which is designed to enhance the accuracy and robustness of cartilage segmentation. First, the SymUnet architecture is developed, with separate inputs from the T1W and T2W modalities to facilitate comprehensive segmentation evaluation. Second, the DynCFC mechanism is implemented to compute the optimal weighting for each modality, enabling the fusion and optimization of multimodal features. Finally, the performance of the proposed SymUnet-DynCFC method is evaluated on clinical datasets from a multi-campus hospital system. Experimental results show that SymUnet-DynCFC achieves better segmentation performance than the baselines, with mean Dice, IoU, and HD95 values of 87.96%, 79.93%, and 1.29, respectively. In particular, SymUnet-DynCFC exhibits improved robustness compared to the baseline methods. This may facilitate automated cartilage segmentation in clinical workflows and could support the assessment of moderate-to-severe KOA by detecting outlier metrics.
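For readers unfamiliar with the overlap metrics reported above, the following sketch computes Dice and IoU for one pair of binary masks. The flat 0/1 list representation, the function name `dice_and_iou`, and the convention that two empty masks score 1.0 are assumptions for this example. The abstract does not describe the authors' evaluation code, and HD95 is not covered here.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two equal-length binary masks of 0/1 values."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    inter = sum(p & t for p, t in zip(pred, target))  # overlapping foreground
    total = sum(pred) + sum(target)
    if total == 0:
        return 1.0, 1.0  # assumed convention: two empty masks agree perfectly
    union = total - inter
    return 2 * inter / total, inter / union


# Toy example: one overlapping pixel out of two predicted and two true pixels.
pred = [1, 1, 0, 0]
target = [1, 0, 1, 0]
dice, iou = dice_and_iou(pred, target)
```

For a single mask pair the two scores are tied by Dice = 2·IoU / (1 + IoU). That identity does not generally hold for averages over many cases.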
Information Fusion, Volume 130, Article 104145
Citations: 0
Data science: a natural ecosystem
IF 15.5 | Zone 1 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-12 | DOI: 10.1016/j.inffus.2025.104113
Emilio Porcu, Roy El Moukari, Laurent Najman, Francisco Herrera, Horst Simon
This manuscript provides a systemic and data-centric view of what we term essential data science, as a natural ecosystem with challenges and missions stemming from the fusion of the data universe with its multiple combinations of the 5D complexities (data structure, domain, cardinality, causality, and ethics) with the phases of the data life cycle. Data agents perform tasks driven by specific goals. The data scientist is an abstract entity that comes from the logical organization of data agents with their actions. Data scientists face challenges that are defined according to the missions. We define specific discipline-induced data science, which in turn allows for the definition of pan-data science, a natural ecosystem that integrates specific disciplines with the essential data science. We semantically split the essential data science into computational and foundational. By formalizing this ecosystemic view, we contribute a general-purpose, fusion-oriented architecture for integrating heterogeneous knowledge, agents, and workflows, relevant to a wide range of disciplines and high-impact applications.
Information Fusion, Volume 130, Article 104113
Citations: 0