
Knowledge-Based Systems: Latest Articles

GEOMR: Integrating image geographic features and human reasoning knowledge for image geolocalization
IF 7.6 | Zone 1, Computer Science | Q1 Computer Science, Artificial Intelligence | Pub Date: 2026-03-25 | Epub Date: 2026-01-23 | DOI: 10.1016/j.knosys.2026.115391
Jian Fang , Siyi Qian , Shaohui Liu
Worldwide image geolocalization aims to accurately predict the geographic location where a given image was captured. Due to the vast scale of the Earth and the uneven distribution of geographic features, this task remains highly challenging. Traditional methods exhibit clear limitations when handling global-scale data. To address these challenges, we propose GEOMR, an effective and adaptive framework that integrates image geographic features and human reasoning knowledge to enhance global geolocalization accuracy. GEOMR consists of two modules. The first module extracts geographic features from images by jointly learning multimodal features. The second module involves training a multimodal large language model in a two-phase process to enhance its geolocalization reasoning capabilities. The first phase learns human geolocalization reasoning knowledge, enabling the model to utilize geographic cues present in images effectively. The second phase focuses on learning how to use reference information to infer the correct geographic coordinates. Extensive experiments conducted on the IM2GPS3K, YFCC4K, and YFCC26K datasets demonstrate that GEOMR significantly outperforms state-of-the-art methods.
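The abstract does not say how geolocalization accuracy is scored. A common convention in this area is to report the fraction of predictions that fall within a fixed great-circle distance of the true location. The following is a minimal sketch under that assumption; the function names, the mean Earth radius, and the example threshold are illustrative and are not taken from the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (an approximation)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against tiny floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def accuracy_within(preds, truths, threshold_km):
    """Fraction of predicted (lat, lon) pairs within threshold_km of the truth."""
    hits = sum(
        haversine_km(p[0], p[1], t[0], t[1]) <= threshold_km
        for p, t in zip(preds, truths)
    )
    return hits / len(preds)


if __name__ == "__main__":
    preds = [(48.8566, 2.3522), (0.0, 0.0)]
    truths = [(48.8584, 2.2945), (10.0, 10.0)]
    print(accuracy_within(preds, truths, threshold_km=25.0))
```

Reporting the same score at several thresholds (for example, street, city, and country scale) makes it possible to compare coarse and fine localization quality in a single table.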
Knowledge-Based Systems, Volume 337, Article 115391
Citations: 0
From co-occurrence to coherence: Quantum-informed representation learning for knowledge graph completion
IF 7.6 | Zone 1, Computer Science | Q1 Computer Science, Artificial Intelligence | Pub Date: 2026-03-25 | Epub Date: 2026-01-24 | DOI: 10.1016/j.knosys.2026.115408
Mankun Zhao , Bingtao Xu , Jiujiang Guo , Jian Yu , Tianyi Xu , Mei Yu
Knowledge graph completion (KGC) aims to infer missing facts by learning latent semantic patterns from observed triples. While existing methods learn superficial semantic co-occurrence through classical probabilistic frameworks, they struggle to capture non-classical semantic properties such as entanglements that govern intrinsic correlations between semantics. These entanglements, critical for disambiguating contextual semantics, cannot be represented in classical probabilistic spaces which lack mathematical tools to represent quantum-like properties. We address this gap with QIKGC, a quantum-informed KGC framework that (i) embeds entity semantics into Hilbert space to explicitly model entanglement, (ii) leverages matrix product states to approximate high-dimensional semantic structures with polynomial complexity, and (iii) treats relations as quantum measurements followed by tomography-based scoring to obtain context-specific entity representations. This is, to our knowledge, the first KGC model that unifies semantic entanglement modeling with trainable quantum operators while remaining efficient on classical hardware. Extensive experiments on four benchmarks demonstrate clear quantitative gains, for example increasing MRR from 0.511 to 0.537 on WN18RR and from 0.904 to 0.926 on Kinship over the best baselines.
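The gains above are reported in mean reciprocal rank (MRR). As a reminder of what that number measures, here is a minimal sketch of the standard MRR computation for link prediction. The helper names and the toy scores are illustrative; the paper's own evaluation protocol (for example, filtered versus raw ranking) is not described in the abstract.

```python
def rank_of_target(scores, target):
    """1-based rank of `target`: one plus the number of candidates scored strictly higher."""
    target_score = scores[target]
    return 1 + sum(1 for s in scores.values() if s > target_score)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test queries; ranks are 1-based."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


if __name__ == "__main__":
    # Toy candidate scores for one query; "b" is the correct entity.
    scores = {"a": 0.9, "b": 0.7, "c": 0.2}
    print(rank_of_target(scores, "b"))
    print(mean_reciprocal_rank([1, 2, 4]))
```

Because each query contributes 1/rank, MRR is dominated by how often the correct entity lands in the first few positions, so a change of a few hundredths reflects a visible shift at the top of the ranking.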
Knowledge-Based Systems, Volume 337, Article 115408
Citations: 0
FedDCA: Stable and unified Wasserstein adaptation to federated concept drift
IF 7.6 | Zone 1, Computer Science | Q1 Computer Science, Artificial Intelligence | Pub Date: 2026-03-25 | Epub Date: 2026-01-22 | DOI: 10.1016/j.knosys.2026.115342
Liyu Fang , Wu Wen , Xiaolin Zheng
Federated Learning (FL) with concept drift faces three fundamental challenges. First, existing methods lack a drift-aware client representation that can directly reflect changes in data distributions. Second, clustering with drifting clients often causes collaborative instability by contaminating the structure of client groups. Third, many approaches suffer from a methodological disconnect between drift detection and adaptation.
To address these challenges, we propose FedDCA, a stable and unified framework for federated concept drift adaptation. FedDCA introduces a Label Profile (LP), a compact distributional representation that captures each client’s current data concept and enables principled drift-aware similarity measurement. Based on LPs, FedDCA employs Drift-Aware Anchor Clustering, which performs Variational Wasserstein Clustering exclusively on stable clients to form robust anchor centroids, thereby preserving collaborative stability. Drifting clients are then assigned to the nearest anchor, allowing rapid adaptation without destabilizing the overall system. By unifying drift detection and clustering adaptation within the same Wasserstein metric space, FedDCA provides a consistent and effective response to dynamic environments. Extensive experiments demonstrate that FedDCA significantly outperforms state-of-the-art methods in both accuracy and adaptation speed under various concept drift scenarios.
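The step of assigning a drifting client to its nearest anchor can be illustrated with a toy example. This is only a sketch under strong assumptions: the label profile is approximated by a normalized label histogram, class indices are treated as an ordered one-dimensional support so that the 1-D Wasserstein distance applies, and anchors are given rather than learned. None of these choices is confirmed by the abstract, and the function names are hypothetical.

```python
def label_histogram(labels, num_classes):
    """Normalized label counts: a stand-in (assumption) for a client's label profile."""
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    total = len(labels)
    return [c / total for c in counts]


def wasserstein_1d(p, q):
    """W1 distance between two histograms on the support 0..K-1 with unit spacing.

    For 1-D distributions, W1 equals the sum of absolute CDF differences.
    """
    cdf_gap = 0.0
    dist = 0.0
    for pi, qi in zip(p, q):
        cdf_gap += pi - qi
        dist += abs(cdf_gap)
    return dist


def nearest_anchor(profile, anchors):
    """Index of the anchor centroid closest to `profile` in W1 distance."""
    return min(range(len(anchors)), key=lambda k: wasserstein_1d(profile, anchors[k]))


if __name__ == "__main__":
    anchors = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # centroids from stable clients (toy values)
    drifting = label_histogram([0, 1, 0, 1], num_classes=3)
    print(nearest_anchor(drifting, anchors))
```

Using one distance for both detecting a change in a client's profile and choosing its cluster is what keeps the two decisions consistent with each other, which is the point the abstract makes about a shared Wasserstein metric space.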
Knowledge-Based Systems, Volume 337, Article 115342
Citations: 0
AGPL-KEM: Attribute-guided prompt learning with knowledge experts mixture for few-shot remote sensing image classification
IF 7.6 | Zone 1, Computer Science | Q1 Computer Science, Artificial Intelligence | Pub Date: 2026-03-25 | Epub Date: 2026-01-20 | DOI: 10.1016/j.knosys.2026.115375
Chunlei Wu , Congzheng Zhu , Qinfu Xu , Xu Liu , Yongzhen Zhang , Leiquan Wang , Jie Wu
Large-scale vision-language models (VLMs) have shown significant success in various computer vision tasks. However, adapting VLMs to remote sensing (RS) tasks remains challenging due to the distinct characteristics of RS imagery, such as spectral heterogeneity, fine-grained textures, and complex structural layouts. Existing methods attempt to encode diverse RS attributes into a unified latent space, but this implicit encoding strategy often leads to attribute conflation, undermining generalization under domain shifts. To address these limitations, we propose Attribute-Guided Prompt Learning with Knowledge Experts Mixture (AGPL-KEM), a prompt learning framework that explicitly disentangles RS semantics through structured domain knowledge. Specifically, AGPL-KEM introduces a Knowledge Experts Mixture module to partition the latent space into attribute-specific subspaces, thereby enhancing the model’s ability to capture and separate key RS attributes. To promote attribute-specific learning and reduce inter-expert redundancy, we design an Attribute-Guided Dual-Loss mechanism comprising an Attribute-Guided Semantic Alignment Loss for expert-attribute consistency and an Expert Semantic Orthogonality Loss that reduces semantic redundancy among experts through orthogonality constraints. Comprehensive experiments conducted on four remote sensing benchmark datasets (PatternNet, RSICD, RESISC45, and MLRSNet) demonstrate that AGPL-KEM achieves state-of-the-art performance, validating its effectiveness and robustness. Codes are available at https://github.com/4wlb/AGPL-KEM.
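The abstract states only that the Expert Semantic Orthogonality Loss uses orthogonality constraints; it does not give the formula. One common way to penalize redundancy among a set of vectors is to sum the squared cosine similarities over all distinct pairs, which is zero exactly when the vectors are mutually orthogonal. The sketch below shows that generic penalty, not the authors' loss; the function name and toy vectors are illustrative.

```python
import math


def orthogonality_penalty(experts):
    """Sum of squared pairwise cosine similarities between expert vectors.

    Returns 0.0 when all vectors are mutually orthogonal; grows as they align.
    Assumes every vector is non-zero.
    """
    normed = []
    for v in experts:
        norm = math.sqrt(sum(x * x for x in v))
        normed.append([x / norm for x in v])

    penalty = 0.0
    for i in range(len(normed)):
        for j in range(i + 1, len(normed)):
            cos = sum(a * b for a, b in zip(normed[i], normed[j]))
            penalty += cos * cos
    return penalty


if __name__ == "__main__":
    print(orthogonality_penalty([[1.0, 0.0], [0.0, 1.0]]))  # orthogonal experts
    print(orthogonality_penalty([[1.0, 0.0], [1.0, 1.0]]))  # partially redundant experts
```

Normalizing before taking inner products makes the penalty depend only on direction, so an expert cannot reduce it simply by shrinking its embedding.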
Knowledge-Based Systems, Volume 337, Article 115375
Citations: 0
Motion data segmentation using robust subspace clustering with noise suppression
IF 7.6 | Zone 1, Computer Science | Q1 Computer Science, Artificial Intelligence | Pub Date: 2026-03-25 | Epub Date: 2026-01-26 | DOI: 10.1016/j.knosys.2026.115386
Qian Wang , Hong Song , Yungang Hao , Yunzhi Luo , Jingfan Fan , Jian Yang
Motion segmentation is a fundamental step in numerous applications. Many motion segmentation techniques have been introduced, among which subspace clustering-based methods stand out, particularly because they are unsupervised. However, these methods often struggle to handle nonlinear data corrupted by hybrid noise. In this study, we propose a novel robust subspace clustering method designed to address the complexities inherent in motion segmentation tasks. We term it Robust Subspace Clustering with Noise Suppression (RSCNS); it integrates hybrid noise reconstruction with a representation of data relationships. Specifically, we propose a hybrid noise modeling method that combines correntropy and the Cauchy function to suppress noise and outlier contamination. To restore the corrupted data, we treat the motion trajectory feature matrix as an approximately low-rank matrix and design a truncated weighted nuclear norm regularization constraint. Meanwhile, a block diagonal regularizer (BDR) is incorporated into our model to ensure that motion trajectory features from the same moving object are clustered together. Experimental evaluations on various video datasets demonstrate that RSCNS can effectively handle motion segmentation tasks not only in visible-light video but also in invisible-light (infrared) video.
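To see why correntropy and the Cauchy function help against outliers, it is useful to compare their loss curves with the ordinary squared error. The sketch below uses one common form of each loss (the correntropy-induced, or Welsch, loss and a log-based Cauchy loss); the scale parameters `sigma` and `c` are illustrative, and the exact combination used in RSCNS is not given in the abstract.

```python
import math


def squared_loss(residual):
    """Ordinary least-squares loss: grows without bound, so outliers dominate."""
    return residual ** 2


def cauchy_loss(residual, c=1.0):
    """Cauchy loss (one common form): grows only logarithmically for large residuals."""
    return math.log(1.0 + (residual / c) ** 2)


def correntropy_loss(residual, sigma=1.0):
    """Correntropy-induced (Welsch) loss: bounded above by 1, so outliers saturate."""
    return 1.0 - math.exp(-(residual ** 2) / (2.0 * sigma ** 2))


if __name__ == "__main__":
    for r in (0.1, 1.0, 10.0, 100.0):
        print(r, squared_loss(r), cauchy_loss(r), correntropy_loss(r))
```

Near zero all three losses behave roughly quadratically, but for large residuals the squared loss explodes while the other two flatten out, which limits the influence a single corrupted trajectory can have on the fit.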
Knowledge-Based Systems, Volume 337, Article 115386
Citations: 0
DKC: Data-driven and knowledge-guided causal discovery with application to healthcare data
IF 7.6 | Zone 1, Computer Science | Q1 Computer Science, Artificial Intelligence | Pub Date: 2026-03-25 | Epub Date: 2026-01-22 | DOI: 10.1016/j.knosys.2026.115384
Uzma Hasan, Md Osman Gani
Efficient causal discovery is essential for constructing reliable causal graphs that provide actionable insights in domains where randomized experiments are infeasible. This study introduces DKC, a novel causal discovery algorithm that utilizes both observational data and prior knowledge to enable reliable learning of causal graphs that support decision-making in complex domains such as healthcare. Traditional causal discovery methods often rely exclusively on observational data, which reduces their effectiveness when datasets are noisy, limited in size, or involve intricate causal relationships. Moreover, existing approaches seldom incorporate prior knowledge in a flexible manner, limiting their applicability in real-world scenarios. DKC addresses these challenges by efficiently incorporating causal priors into the discovery process through a tailored scoring criterion that supports both hard and soft constraints. The framework operates in three stages: (i) estimating a topological ordering of the variables, (ii) ranking candidate edges according to likelihood, and (iii) performing a constrained causal search using the proposed score to balance model fit, complexity, and prior knowledge. We establish theoretical guarantees demonstrating that the score is statistically consistent, converging to the true causal structure as the sample size grows. Extensive experiments on synthetic datasets of varying scales, as well as real-world healthcare data, confirm that DKC outperforms state-of-the-art baselines in terms of structural accuracy and robustness. By harmonizing data-driven insights with prior knowledge, DKC provides a trustworthy foundation for causal inference across diverse fields. Its application to a clinical problem highlights its potential to guide critical decision-making, while its general framework ensures broad utility in any domain requiring reliable, knowledge-informed causal reasoning.
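The idea of a score that balances model fit, complexity, and prior knowledge under hard and soft constraints can be sketched generically. The following is not DKC's scoring criterion, which the abstract does not specify. It assumes a BIC-style penalty of half a log-sample-size per edge, treats hard constraints as forbidden or required edges, and treats soft constraints as prior probabilities that an edge exists; all names and parameters are hypothetical.

```python
import math


def constrained_score(edges, log_likelihood, n_samples,
                      forbidden=(), required=(), soft_priors=None, prior_weight=1.0):
    """Score a candidate graph: fit minus complexity plus a soft prior term.

    edges        -- iterable of (parent, child) pairs in the candidate graph
    forbidden    -- hard constraint: edges that must not appear
    required     -- hard constraint: edges that must appear
    soft_priors  -- dict mapping an edge to a belief in (0, 1) that it exists
    """
    edge_set = set(edges)
    # Hard constraints: any violation rules the graph out entirely.
    if edge_set & set(forbidden) or not set(required) <= edge_set:
        return -math.inf

    # BIC-style complexity term, counting one parameter per edge (an assumption).
    complexity = 0.5 * len(edge_set) * math.log(n_samples)

    # Soft constraints: reward agreement with prior beliefs, penalize disagreement.
    prior = 0.0
    for edge, belief in (soft_priors or {}).items():
        prior += math.log(belief) if edge in edge_set else math.log(1.0 - belief)

    return log_likelihood - complexity + prior_weight * prior


if __name__ == "__main__":
    graph = [("smoking", "cancer")]
    print(constrained_score(graph, -100.0, 100, soft_priors={("smoking", "cancer"): 0.9}))
    print(constrained_score(graph, -100.0, 100, forbidden=[("smoking", "cancer")]))
```

In a search over candidate graphs, hard constraints prune the space outright while soft constraints only shift the score, so strong evidence in the data can still override an uncertain prior.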
Knowledge-Based Systems, Volume 337, Article 115384
Citations: 0
Data privacy preserved student career prediction with deep learning and blockchain based mechanism
IF 7.6 | Zone 1, Computer Science | Q1 Computer Science, Artificial Intelligence | Pub Date: 2026-03-25 | Epub Date: 2026-01-15 | DOI: 10.1016/j.knosys.2026.115352
Mansi Aggarwal , Vaibhav Vyas
In recent years, student success has become a primary strategic objective for most higher education institutions. Because of rising operational costs and budget cuts, educational institutions are paying more attention to meeting student enrollment goals in their plans without compromising the quality and rigour of education. Existing research in big data analytics and machine learning (ML) depends heavily on student data to predict student information. However, these existing models still have problems in classifying student data and predicting student performance. This paper proposes a deep learning approach combined with blockchain to improve the accuracy of student career prediction while preserving data privacy. The method has three phases: pre-processing, feature selection, and classification. First, in the pre-processing phase, min-max normalization is applied to the data. Next, the feature selection phase uses improved Levy flight assisted fire hawk optimization (Imp-LeFoP) to select the optimal subset of features. The selected features are then passed as input to a classifier, namely the optimized hybrid densely connected convolutional VGG assisted car-studformer (Opt-Studformer). The hyperparameters of the classifier are tuned by the dung beetle optimization (DBO) mechanism. For student data privacy, a blockchain mechanism is integrated to store the data securely. In addition, the blockchain mechanism uses the improved proof-of-stake (Im-PoS) consensus algorithm for retrieval validation. Career prediction performance is evaluated on two datasets: computer science student career prediction and career prediction. The approach achieves accuracies of 98.97% and 98.84% on the two datasets, respectively.
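The pre-processing step named in the abstract, min-max normalization, rescales each feature to the range [0, 1]. A minimal sketch for a single feature column follows; the function name and the choice to map a constant column to zeros are assumptions, since the abstract gives no implementation details.

```python
def min_max_normalize(values):
    """Rescale a list of numbers to [0, 1] via (v - min) / (max - min).

    A constant column has no spread, so it is mapped to all zeros
    (an assumption made here to avoid division by zero).
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


if __name__ == "__main__":
    grades = [2, 4, 6]  # toy feature column
    print(min_max_normalize(grades))
```

In practice the minimum and maximum should be computed on the training split only and then reused for validation and test data, so that no information leaks from held-out samples.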
近年来,学生的成功已成为大多数高等教育机构的主要战略目标。由于运营成本的增加和预算的削减,教育机构越来越注重在不影响教育质量和严谨性的情况下满足学生的入学计划。现有的大数据分析和机器学习(ML)技术的研究进展很大程度上依赖于学生数据来预测学生信息。但是这些现有的模型在分类学生数据和预测他们的表现方面存在一些问题。本文提出利用区块链进行深度学习,以提高利用数据隐私预测学生职业成功的准确性。本文分为预处理、特征选择和分类三个阶段。首先,在预处理阶段,采用最小-最大归一化方法对数据进行归一化。接下来,采用先进的改进Levy飞行辅助火鹰优化技术(Imp-LeFoP)进行特征选择。该方法选择特征的最优子集。进一步,将选择的特征作为输入,传递给分类器模型,即优化的混合密集连接卷积VGG辅助car-studformer (Opt-Studformer)。通过屎壳郎优化机制(DBO)对分类器模型中的超参数进行了适当的调整。在学生数据隐私方面,集成了区块链机制来安全存储数据。此外,区块链机制使用改进的权益证明(Im-PoS)共识算法进行检索验证。利用计算机科学专业学生职业预测和职业预测两个数据集对职业预测的效果进行了评价。因此,两个数据集的准确率分别达到了98.97%和98.84%。
Knowledge-Based Systems, vol. 337, Article 115352.
Citations: 0
IGPC-MSOS: A knowledge-preserving transfer learning framework with dynamic mode-switching for handling concept drift in network intrusion detection systems
IF 7.6 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-03-25 Epub Date: 2026-01-20 DOI: 10.1016/j.knosys.2026.115361
Methaq A. Shyaa, Noor Farizah Ibrahim, Zurinahni Binti Zainol, Rosni Abdullah, Mohammed Anbar, Laith Alzubaidi
The rapid evolution of cyber threats poses significant challenges to Intrusion Detection Systems (IDS), particularly in dynamic environments affected by concept drift, where shifting attack behaviors degrade long-term detection performance. Existing adaptive IDS solutions often remain limited by fragmented drift-handling mechanisms, weak knowledge retention, and insufficient integration of complementary learning strategies, leaving exploitable blind spots. This paper introduces a unified adaptive IDS framework based on a mode-switching architecture that integrates Online Sequential Extreme Learning Machine (OSELM), Feature-Adaptive OSELM (FA-OSELM), and Knowledge-Preserving OSELM (KP-OSELM). The proposed Incremental Genetic Programming Combiner with Mode-Switching Online Sequential (IGPC-MSOS) method dynamically selects the most effective operational mode according to detected drift patterns and real-time performance feedback. Experimental evaluations across five benchmark datasets demonstrate that IGPC-MSOS consistently achieves 96%–100% recall, delivers competitive or superior F1-scores (0.96–0.9995), and reduces inference latency compared with state-of-the-art approaches. These results confirm the strong adaptability, robustness, and real-time suitability of the proposed approach for intrusion detection in evolving and high-throughput network environments.
Knowledge-Based Systems, vol. 337, Article 115361.
Citations: 0
FreqMambaMark: Wavelet-Mamba-driven robust medical image watermarking
IF 7.6 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-03-25 Epub Date: 2026-01-20 DOI: 10.1016/j.knosys.2026.115344
Zhongxiang He, Yuling Chen, Yixian Yang, Zhi Ouyang, Yun Luo, Long Chen
Digital watermarking technology embeds authentication information into medical images to ensure data authenticity, integrity, and traceability. However, subtle textures and grayscale variations in medical images are crucial for diagnosis. Inappropriate watermark embedding may interfere with clinical interpretation, posing potential risks to diagnostic accuracy. Meanwhile, existing methods struggle to balance watermark imperceptibility and robustness against composite attacks.
To address these challenges, we propose FreqMambaMark, a robust watermarking framework for medical images based on the Mamba architecture in the wavelet domain. The framework decomposes image frequency bands using the Haar wavelet, employs an adaptive residual hiding strategy, and utilizes a convolutional neural network (CNN) for fine-grained watermark embedding, achieving high-fidelity results (PSNR: 43.8 dB, SSIM: 0.971). In watermark extraction, the long-range modeling of the Mamba state-space model enhances the recovery ability of low-frequency components, improving watermark robustness under complex distortions by 10.8%. Furthermore, we introduce a dynamic composite attack training paradigm and validate the framework’s generalization ability on the NIH Chest X-ray, Brain-Tumor-MRI, and COVIDx CT-3 datasets, providing an efficient solution for medical image security.
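The fidelity figure quoted above is a peak signal-to-noise ratio (PSNR). As a reading aid, the standard definition, PSNR = 10 · log10(MAX² / MSE), can be sketched in a few lines. This is a generic illustration, not the authors' evaluation code; the function name `psnr`, the nested-list image representation, and the 8-bit peak value of 255 are assumptions.

```python
import math


def psnr(original, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized grayscale images.

    Images are lists of rows of pixel intensities (hypothetical representation).
    `max_value` is the peak intensity; 255 assumes 8-bit images.
    """
    flat_a = [p for row in original for p in row]
    flat_b = [p for row in distorted for p in row]
    if not flat_a or len(flat_a) != len(flat_b):
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(flat_a, flat_b)) / len(flat_a)
    if mse == 0:
        return math.inf  # identical images: no distortion at all
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # Toy 2x2 cover image and a watermarked copy with two pixels changed by 1.
    cover = [[100, 100], [100, 100]]
    marked = [[101, 100], [100, 99]]
    print(round(psnr(cover, marked), 2))
```

A higher PSNR means the watermarked image is closer to the original, which matters here because small texture and grayscale changes can affect diagnosis.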
Knowledge-Based Systems, vol. 337, Article 115344.
Citations: 0
Cross-scale channel attention and feature fusion-aware aggregation for sonar images object detection
IF 7.6 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-03-25 Epub Date: 2026-01-20 DOI: 10.1016/j.knosys.2026.115371
Pengfei Shi, Hanren Wang, Qianqian Zhang, Yuanxue Xin
Feature extraction and feature fusion are crucial for sonar image target detection. In terms of feature extraction, due to device limitations and the complexity of the underwater environment, sonar images often exhibit high levels of noise, which results in high similarity between targets and background, thus affecting feature extraction. In terms of feature fusion, transformer-based models rely on self-attention mechanisms, but this leads to a lack of local prior information. The interference from noise and the similarity between targets and background disrupt the computation of global relationships, confusing noisy features with useful ones, leading to insufficient geometric information and ultimately affecting detection accuracy. To address these issues, we propose an advanced detection framework that combines effective feature extraction and multi-scale feature fusion. We introduce a cross-scale channel attention module that dynamically adjusts channel weights by integrating the advantages of the squeeze-and-excitation (SE) module and the efficient multi-scale attention (EMA) module, capturing multi-scale dependencies, suppressing background noise, and enhancing global feature representation. Moreover, to further improve the effectiveness of feature fusion and better leverage geometric information, we design a CNN-based feature fusion perception aggregation network. This network promotes interaction between low-level geometric details and high-level semantic information through skip connections, enhancing feature representation and improving detection accuracy. Experimental results show that our method outperforms some advanced detection models in terms of detection performance.
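For readers unfamiliar with the squeeze-and-excitation (SE) building block mentioned above, the following minimal sketch shows the basic SE idea only: pool each channel to a scalar, pass the result through a small bottleneck, and rescale the channels by the resulting gates. It is not the paper's cross-scale channel attention module. The function name, the weight shapes, and the omission of biases are all assumptions made for illustration.

```python
import math


def squeeze_excite(feature_maps, w1, w2):
    """Basic SE channel re-weighting on plain Python lists.

    feature_maps: C channels, each an H x W list of rows.
    w1: bottleneck weights, shape (C // r) x C   (hypothetical, no bias).
    w2: expansion weights, shape C x (C // r)    (hypothetical, no bias).
    Returns (rescaled feature maps, per-channel gates).
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_maps
    ]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[g * v for v in row] for row in ch]
        for ch, g in zip(feature_maps, gates)
    ]
    return scaled, gates


if __name__ == "__main__":
    # Two 2x2 channels with toy weights (illustrative values only).
    maps = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
    scaled, gates = squeeze_excite(maps, w1=[[1.0, 1.0]], w2=[[0.0], [1.0]])
    print(gates)
```

In a trained network the gates are learned, so channels dominated by background noise can be suppressed while informative channels are kept.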
Knowledge-Based Systems, vol. 337, Article 115371.
Citations: 0