
Latest Publications in Knowledge-Based Systems

DOREMI: Optimizing long tail predictions in document-level relation extraction
IF 7.6, CAS Tier 1, Computer Science, Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-01-22. DOI: 10.1016/j.knosys.2026.115359
Laura Menotti, Stefano Marchesin, Gianmaria Silvello
Document-Level Relation Extraction (DocRE) presents significant challenges due to its reliance on cross-sentence context and the long-tail distribution of relation types, where many relations have scarce training examples. In this work, we introduce DOcument-level Relation Extraction optiMizing the long taIl (DOREMI), an iterative framework that enhances underrepresented relations through minimal yet targeted manual annotations. Unlike previous approaches that rely on large-scale noisy data or heuristic denoising, DOREMI actively selects the most informative examples to improve training efficiency and robustness. DOREMI can be applied to any existing DocRE model and is effective at mitigating long-tail biases, offering a scalable solution to improve generalization on rare relations.
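The core idea, spending a small annotation budget on the examples that most help rare relations, can be sketched as follows. The entropy-times-rarity acquisition score below is an illustrative assumption, not DOREMI's actual selection criterion:

```python
import math
from collections import Counter

def select_for_annotation(predictions, labels_seen, budget=2):
    """Pick the examples whose manual annotation should help rare relations most.

    predictions: list of (example_id, {relation: probability}) pairs.
    labels_seen: relation labels already observed in training.
    Each example is scored by predictive entropy weighted by the inverse
    frequency of its most likely relation; the top `budget` are returned.
    """
    freq = Counter(labels_seen)
    total = sum(freq.values())

    def score(probs):
        entropy = -sum(p * math.log(p) for p in probs.values() if p > 0)
        top = max(probs, key=probs.get)
        rarity = total / (1 + freq[top])  # boost relations with few examples
        return entropy * rarity

    ranked = sorted(predictions, key=lambda x: score(x[1]), reverse=True)
    return [ex_id for ex_id, _ in ranked[:budget]]
```

An uncertain prediction on a tail relation thus outranks a confident prediction on a head relation, which matches the paper's stated goal of minimal yet targeted annotation.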
Citations: 0
PDPA: A prompt-based dual persona-aware approach for empathetic response generation
IF 7.6, CAS Tier 1, Computer Science, Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-01-22. DOI: 10.1016/j.knosys.2026.115390
Wei Zhang , Changhong Jiang , Ming Xia , Lulu Wang , Zhongtian Hu , Jiashi Lin , Ronghan Li
Maintaining personality consistency is essential for improving the performance of empathetic dialogue systems. However, existing approaches to persona-aware empathetic response generation commonly exhibit two fundamental limitations in persona information extraction: (1) an inherent trade-off between the richness of information and contextual consistency, and (2) a unidirectional extraction strategy that considers only one interlocutor in the dialogue history. To address these limitations, this study proposes a method that utilizes Pre-trained Language Models (PLMs) and Large Language Models (LLMs) to extract dense persona information from all historical utterances of each participant in the training set, based on their participant IDs. Building on this, we introduce PDPA, a prompt-driven framework that jointly models user and agent perspectives. Specifically, a novel prompt template with three special tokens is designed to explicitly distinguish persona information from dialogue history during feature extraction. Furthermore, a persona-aware heterogeneous graph is constructed to enhance the aggregation of discourse structure, personality traits, complete dialogue history, and external knowledge. Finally, to ensure the effective use of refined persona information together with essential contextual details during generation, a dialogue decoder equipped with a dynamic pointer network is employed. Experimental evaluations demonstrate that the proposed model consistently outperforms strong baselines on two datasets derived from the EMPATHETICDIALOGUES benchmark. In particular, compared with its backbone BART, PDPA achieves notable improvements in emotion classification accuracy, with an increase of 4.73% when assisted by LLM-generated persona information and 4.36% when assisted by PLM-generated persona information, highlighting the effectiveness of our approach.
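The prompt template with three special tokens can be sketched as below. The token names (`[USR]`, `[AGT]`, `[CTX]`) and the layout are hypothetical placeholders for illustration; the paper's actual template may differ:

```python
def build_prompt(user_persona, agent_persona, history,
                 usr_tok="[USR]", agt_tok="[AGT]", ctx_tok="[CTX]"):
    """Assemble a persona-aware prompt for the encoder.

    The three special tokens mark the user-persona, agent-persona, and
    dialogue-history segments so that persona information is explicitly
    distinguished from dialogue history during feature extraction.
    """
    parts = [
        usr_tok, " ".join(user_persona),      # dense user persona sentences
        agt_tok, " ".join(agent_persona),     # dense agent persona sentences
        ctx_tok, " </s> ".join(history),      # utterances separated by </s>
    ]
    return " ".join(parts)
```

Because both interlocutors' personas appear in the template, the extraction is bidirectional rather than limited to one speaker.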
Citations: 0
FedDCA: Stable and unified Wasserstein adaptation to federated concept drift
IF 7.6, CAS Tier 1, Computer Science, Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-01-22. DOI: 10.1016/j.knosys.2026.115342
Liyu Fang , Wu Wen , Xiaolin Zheng
Federated Learning (FL) with concept drift faces three fundamental challenges. First, existing methods lack a drift-aware client representation that can directly reflect changes in data distributions. Second, clustering with drifting clients often causes collaborative instability by contaminating the structure of client groups. Third, many approaches suffer from a methodological disconnect between drift detection and adaptation.
To address these challenges, we propose FedDCA, a stable and unified framework for federated concept drift adaptation. FedDCA introduces a Label Profile (LP), a compact distributional representation that captures each client’s current data concept and enables principled drift-aware similarity measurement. Based on LPs, FedDCA employs Drift-Aware Anchor Clustering, which performs Variational Wasserstein Clustering exclusively on stable clients to form robust anchor centroids, thereby preserving collaborative stability. Drifting clients are then assigned to the nearest anchor, allowing rapid adaptation without destabilizing the overall system. By unifying drift detection and clustering adaptation within the same Wasserstein metric space, FedDCA provides a consistent and effective response to dynamic environments. Extensive experiments demonstrate that FedDCA significantly outperforms state-of-the-art methods in both accuracy and adaptation speed under various concept drift scenarios.
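The anchor-assignment step can be sketched with a discrete 1-Wasserstein distance over Label Profiles. This is a minimal sketch: the anchor centroids are given directly here, whereas in FedDCA they would come from Variational Wasserstein Clustering over the stable clients:

```python
def w1_distance(p, q):
    """1-Wasserstein distance between two label profiles (histograms over
    the same ordered class set): the summed absolute CDF difference."""
    gap, cum_p, cum_q = 0.0, 0.0, 0.0
    for a, b in zip(p, q):
        cum_p += a
        cum_q += b
        gap += abs(cum_p - cum_q)
    return gap

def assign_to_anchor(client_profile, anchors):
    """Map a drifting client to the index of the nearest anchor centroid,
    measured in the same Wasserstein space used for drift detection."""
    return min(range(len(anchors)),
               key=lambda i: w1_distance(client_profile, anchors[i]))
```

Because drifting clients are only assigned to existing anchors, they cannot contaminate the anchor structure built from stable clients.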
Citations: 0
DKC: Data-driven and knowledge-guided causal discovery with application to healthcare data
IF 7.6, CAS Tier 1, Computer Science, Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-01-22. DOI: 10.1016/j.knosys.2026.115384
Uzma Hasan, Md Osman Gani
Efficient causal discovery is essential for constructing reliable causal graphs that provide actionable insights in domains where randomized experiments are infeasible. This study introduces DKC, a novel causal discovery algorithm that utilizes both observational data and prior knowledge to enable reliable learning of causal graphs that supports decision-making in complex domains such as healthcare. Traditional causal discovery methods often rely exclusively on observational data, which reduces their effectiveness when datasets are noisy, limited in size, or involve intricate causal relationships. Moreover, existing approaches seldom incorporate prior knowledge in a flexible manner, limiting their applicability in real-world scenarios. DKC addresses these challenges by efficiently incorporating causal priors into the discovery process through a tailored scoring criterion that supports both hard and soft constraints. The framework operates in three stages: (i) estimation of a topological ordering of variables, (ii) ranking candidate edges according to likelihood, and (iii) performing a constrained causal search using the proposed score to balance model fit, complexity, and prior knowledge. We establish theoretical guarantees demonstrating that the score is statistically consistent, converging to the true causal structure as sample size grows. Extensive experiments on synthetic datasets of varying scales, as well as real-world healthcare data, confirm that DKC outperforms state-of-the-art baselines in terms of structural accuracy and robustness. By harmonizing data-driven insights with prior knowledge, DKC provides a trustworthy foundation for causal inference across diverse fields. Its application to a clinical problem highlights its potential to guide critical decision-making, while its general framework ensures broad utility in any domains requiring reliable, knowledge-informed causal reasoning.
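The idea of a scoring criterion that balances fit, complexity, and priors can be sketched as below. The specific combination (BIC-style penalty, hard constraints as negative infinity, soft constraints as a weighted bonus) is an assumption for illustration, not the paper's exact criterion:

```python
import math

def knowledge_guided_score(loglik, n_edges, n_samples, graph_edges,
                           forbidden=frozenset(), encouraged=frozenset(),
                           soft_w=1.0):
    """Score a candidate graph: model fit minus a BIC-style complexity
    penalty, adjusted by prior knowledge. Forbidden edges act as hard
    constraints (score becomes -inf); encouraged edges add a soft bonus.
    """
    if any(e in forbidden for e in graph_edges):
        return float("-inf")                          # hard constraint violated
    complexity = 0.5 * n_edges * math.log(n_samples)  # BIC-style penalty
    prior_bonus = soft_w * sum(e in graph_edges for e in encouraged)
    return loglik - complexity + prior_bonus
```

During the constrained search, graphs violating hard priors are pruned outright, while soft priors merely tilt the ranking, mirroring the hard/soft distinction described above.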
Citations: 0
FedCLIP-Distill: Heterogeneous federated cross-modal knowledge distillation for multi-domain visual recognition
IF 7.6, CAS Tier 1, Computer Science, Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-01-22. DOI: 10.1016/j.knosys.2026.115383
Yuankun Xia, Hui Wang, Yufeng Zhou
Federated learning (FL) for multi-domain visual recognition confronts significant challenges due to heterogeneous data distributions and domain shifts, which severely impair the semantic generalization capability of existing methods. To address these challenges, we propose FedCLIP-Distill, a novel framework that employs dual-domain knowledge distillation (KD) and contrastive relational distillation (CRD) to leverage the powerful visual-language alignment of CLIP in heterogeneous FL environments. Our approach employs a centralized CLIP teacher model to distill robust visual-textual semantics into lightweight client-side student models, thereby enabling effective local domain adaptation. We provide a theoretical convergence analysis proving that our distillation mechanism effectively mitigates domain gaps and facilitates robust convergence under non-IID settings. Extensive experiments on the Office-Caltech10 and DomainNet benchmarks show that FedCLIP-Distill outperforms other methods: it achieves an average cross-domain accuracy of 98.5% on Office-Caltech10 and 80.50% on DomainNet, and under varying heterogeneity settings (e.g., Dirichlet α = 0.5) it exceeds FedCLIP by 9.52%, demonstrating significant improvements in accuracy and generalization under heterogeneous scenarios. The source code is available at https://github.com/Yuankun-Xia/FedCLIP-Distill.
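The teacher-to-student transfer can be sketched with the standard temperature-scaled distillation loss. This is the generic KD objective only; FedCLIP-Distill's dual-domain and contrastive relational terms are not reproduced here:

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically plain softmax over a list of logits."""
    exps = [math.exp(x / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 as in standard knowledge distillation, so that the
    lightweight client-side student mimics the CLIP teacher's outputs.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl
```

The loss is zero when student and teacher agree and grows as their predictive distributions diverge, which is what drives the local adaptation step.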
Citations: 0
Enhancing Heterogeneous Graph Learning with Semantic-Aware Meta-Path Diffusion and Dual Optimization
IF 7.6, CAS Tier 1, Computer Science, Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-01-22. DOI: 10.1016/j.knosys.2026.115385
Guanghua Ding , Rui Tang , Xian Mo
Heterogeneous graph learning aims to extract semantic and structural information from multiple node types, edges, and meta-paths, learning low-dimensional embeddings that preserve core characteristics to support downstream tasks. To address the core challenges of insufficient semantic mining and weak learning synergy in heterogeneous graph learning, this paper proposes a heterogeneous graph learning method integrating Semantic-aware Meta-path perturbation and Collaborative Dual-learning optimization (SMCD). First, the method constructs auxiliary meta-paths based on the original meta-paths, and then designs two augmentation schemes to generate augmented views. For semantic-level augmentation, it performs edge perturbation based on semantic similarity and enhances the semantics of core meta-paths with the semantics of auxiliary meta-paths via a diffusion model. For task-level augmentation, it utilizes a diffusion model and semantic weights to select the top-k semantically relevant nodes for each node in the core meta-path graph, reconstructing the meta-path graph structure. Then, a two-stage attention aggregation graph encoder is adopted to output the final node embeddings. Finally, a self-supervised and supervised (i.e., dual-learning) collaborative optimization strategy that flexibly adapts to label distribution is used to optimize the objective; this not only balances the discriminability and generality of representations but also adapts to scenarios with different degrees of label scarcity. Experimental results on three public datasets illustrate that our proposed method achieves remarkable advantages in both node classification and node clustering tasks. Our datasets and source code are available.1
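The semantic-similarity-based edge perturbation can be sketched as a simple filtering pass. Using cosine similarity with a fixed threshold is a simplifying assumption; in SMCD the perturbation also involves a diffusion model over the meta-path semantics:

```python
def perturb_edges(edges, features, threshold=0.5):
    """Keep only edges whose endpoints are semantically similar.

    edges:    iterable of (u, v) node-id pairs from a meta-path graph.
    features: dict mapping node id to its feature vector.
    Edges between semantically dissimilar endpoints are dropped, which
    yields a perturbed (augmented) view of the graph.
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(y * y for y in b) ** 0.5
        return dot / (na * nb) if na and nb else 0.0

    return [(u, v) for u, v in edges
            if cosine(features[u], features[v]) >= threshold]
```

A lower threshold keeps the view closer to the original graph, while a higher one produces a sparser, more aggressively perturbed view.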
Citations: 0
TransXV2S-NET: A novel hybrid deep learning architecture with dual-contextual graph attention for multi-class skin lesion classification
IF 7.6, CAS Tier 1, Computer Science, Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-01-22. DOI: 10.1016/j.knosys.2026.115407
Adnan Saeed , Khurram Shehzad , Muhammad Ghulam Abbas Malik , Saim Ahmed , Ahmad Taher Azar
Accurate early-stage diagnosis of skin lesions remains challenging for dermatologists due to visual complexity and subtle inter-class differences. Traditional computer-assisted diagnostic tools struggle to capture detailed patterns and contextual relationships, especially under varying imaging conditions. In this study, we introduce TransXV2S-Net, a new hybrid deep-learning model based on multiple branches designed for automated skin lesion classification. These branches extract features from skin lesions at different stages separately and learn complex combinations between them. They comprise an EfficientNetV2S, a Swin Transformer, and a modified Xception architecture as a new feature extraction method, together with a Dual-Contextual Graph Attention Network (DCGAN) proposed to focus the network on discriminative parts of skin lesions. The DCGAN enhances discriminative feature learning through dual-path attention mechanisms and graph-based operations that effectively capture both local textural details and global contextual patterns. The Gray World Standard Deviation (GWSD) preprocessing algorithm improves lesion visibility and removes imaging artifacts. Benchmarking against an 8-class skin cancer dataset confirmed the model's efficacy, yielding 95.26% accuracy, 94.30% recall, and an AUC-ROC of 99.62%. Further validation on the HAM10000 dataset demonstrates exceptional performance with 95% accuracy, confirming the model's robustness and generalization capability.
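The preprocessing step can be sketched with a plain gray-world color-constancy correction. Note this is a stand-in assumption: the paper's GWSD variant additionally involves channel standard deviations, which are not modeled here:

```python
import numpy as np

def gray_world_normalize(img):
    """Gray-world color constancy: rescale each color channel so its mean
    matches the global mean intensity, reducing color casts from varying
    imaging conditions before lesion classification.

    img: H x W x C array of intensities in [0, 255].
    """
    img = img.astype(np.float64)
    channel_means = img.reshape(-1, img.shape[-1]).mean(axis=0)
    global_mean = channel_means.mean()
    gains = global_mean / np.maximum(channel_means, 1e-8)  # per-channel gain
    return np.clip(img * gains, 0.0, 255.0)
```

After correction every channel has the same mean, so systematic illumination tints no longer dominate the lesion's own color cues.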
{"title":"TransXV2S-NET: A novel hybrid deep learning architecture with dual-contextual graph attention for multi-class skin lesion classification","authors":"Adnan Saeed ,&nbsp;Khurram Shehzad ,&nbsp;Muhammad Ghulam Abbas Malik ,&nbsp;Saim Ahmed ,&nbsp;Ahmad Taher Azar","doi":"10.1016/j.knosys.2026.115407","DOIUrl":"10.1016/j.knosys.2026.115407","url":null,"abstract":"<div><div>Accurate early-stage diagnosis of skin lesions remains challenging for dermatologists due to visual complexity and subtle inter-class differences. Traditional computer-assisted diagnostic tools struggle to capture detailed patterns and contextual relationships, especially under varying imaging conditions. In this study, we introduce TransXV2S-Net, a new hybrid deep-learning model based on multiple branches designed for automated skin lesion classification. These branches enable to extract features at different stages from skin lesions separately and learn complex combinations between them. These branches include an EfficientNetV2S, Swin Transformer, and a modified Xception architecture, a new feature extraction method, as well as a Dual-Contextual Graph Attention Network (DCGAN) that is proposed to make the network focus on discriminative parts of skin lesions. A novel Dual-Contextual Graph Attention Network (DCGAN) enhances discriminative feature learning through dual-path attention mechanisms and graph-based operations that effectively capture both local textural details and global contextual patterns. The Gray World Standard Deviation (GWSD) preprocessing algorithm improves lesion visibility and removes imaging artifacts Benchmarking against an 8-class skin cancer dataset confirmed the model's efficacy, yielding 95.26% accuracy, 94.30% recall, and an AUC-ROC of 99.62%. 
Further validation on the HAM10000 dataset demonstrates exceptional performance with 95% accuracy, confirming the model's robustness and generalization capability.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"337 ","pages":"Article 115407"},"PeriodicalIF":7.6,"publicationDate":"2026-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146081155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
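The abstract describes the GWSD preprocessing step only at a high level. As a rough illustration, the classic gray-world color-constancy baseline it builds on can be sketched as follows; the function name is invented here, and the paper's standard-deviation weighting is not reproduced, so treat this as a sketch of the underlying idea rather than the paper's algorithm.

```python
import numpy as np

def gray_world_normalize(img):
    """Gray-world color constancy: scale each RGB channel so that its
    mean matches the global mean intensity, removing a color cast.
    Illustrative baseline only; the paper's GWSD variant also involves
    channel standard deviations, which are not modeled here."""
    img = img.astype(np.float64)
    channel_means = img.reshape(-1, 3).mean(axis=0)   # per-channel mean
    gray_mean = channel_means.mean()                  # target gray level
    gains = gray_mean / np.maximum(channel_means, 1e-8)
    balanced = np.clip(img * gains, 0, 255)
    return balanced.astype(np.uint8)

# An image with a strong blue cast: constant channels (50, 100, 200).
img = np.zeros((4, 4, 3))
img[..., 0], img[..., 1], img[..., 2] = 50, 100, 200
out = gray_world_normalize(img)   # all channels pulled to the same gray level
```

After normalization, the three channel means coincide, which is exactly the gray-world assumption applied in reverse to neutralize the illuminant.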
Citations: 0
OACI: Object-aware contextual integration for image captioning
IF 7.6 CAS Tier 1, Computer Science, Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-22 DOI: 10.1016/j.knosys.2026.115374
Shuhan Xu, Mengya Han, Wei Yu, Zheng He, Xin Zhou, Yong Luo
Image captioning is a fundamental task in visual understanding, aiming to generate textual descriptions for given images. Current image captioning methods are gradually shifting towards a fully end-to-end paradigm, which leverages pre-trained vision models to process images directly and generate captions, eliminating the need for separate object detectors. These methods typically rely on global features and neglect the precise perception of local ones. The lack of fine-grained focus on objects can leave prototype features contaminated by surrounding noise, which in turn degrades the generation of object-related captions. To address this issue, we propose a novel method termed object-aware context integration (OACI), which captures the salient prototypes of individual objects and understands their relationships by leveraging the global context of the entire scene. Specifically, we propose an object-aware prototype learning (OAPL) module that focuses on regions containing objects to enhance object perception and selects the most confident regions for learning object prototypes. Moreover, a class affinity constraint (CAC) is designed to facilitate the learning of these prototypes. To model the relationships between objects, we further propose an object-context integration (OCI) module that integrates global context with local object prototypes, enhancing the understanding of image content and improving the generated captions. We conduct extensive experiments on the popular MSCOCO, Flickr8k and Flickr30k datasets, and the results demonstrate that integrating global context with local object details significantly improves caption quality, validating the effectiveness of the proposed OACI method.
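The idea of pooling only the most confident object regions into a prototype can be sketched minimally. The function name, the top-k selection rule, and the softmax weighting below are illustrative assumptions, not the paper's exact OAPL formulation.

```python
import numpy as np

def object_prototype(features, confidences, k=3):
    """Pool the k most confident region features into one object
    prototype, softmax-weighted by confidence. Illustrative sketch
    only; OAPL's actual selection and weighting may differ."""
    idx = np.argsort(confidences)[-k:]              # k most confident regions
    logits = confidences[idx] - confidences[idx].max()
    weights = np.exp(logits)
    weights /= weights.sum()                        # softmax over selected regions
    return (features[idx] * weights[:, None]).sum(axis=0)

# 5 candidate regions with 4-D features; regions 1 and 3 are confident
# detections of the same object and share the same feature vector.
features = np.zeros((5, 4))
features[1] = 1.0
features[3] = 1.0
conf = np.array([0.1, 0.9, 0.2, 0.8, 0.05])
proto = object_prototype(features, conf, k=2)
```

Because the prototype is a convex combination of the selected features, agreeing regions reproduce their shared feature, while low-confidence background regions are excluded rather than averaged in.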
Knowledge-Based Systems, Volume 337, Article 115374
Citations: 0
Cross-domain time-frequency Mamba: A more effective model for long-term time series forecasting
IF 7.6 CAS Tier 1, Computer Science, Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-21 DOI: 10.1016/j.knosys.2026.115341
Yuhang Duan, Lin Lin, Jinyuan Liu, Qing Zhang, Xin Fan
Long-term time series forecasting (LTSF) is crucial in domains such as smart energy systems and the industrial Internet of Things. Existing methods face intertwined challenges in LTSF. Single-domain modeling often fails to capture both local fluctuations and global trends, resulting in incomplete temporal representations. While attention-based models effectively capture long-range dependencies, their quadratic computational complexity limits their efficiency and scalability. Moreover, cross-scale conflicts frequently arise in long-term forecasting: short-term patterns may interfere with long-term trends, degrading prediction accuracy. To address these issues, we propose cross-domain time-frequency Mamba (CDTF-Mamba), which synergistically models time series in both the time and frequency domains. CDTF-Mamba's time-domain pyramid Mamba component disentangles multiscale patterns, while its frequency-domain decomposition Mamba component stabilizes state evolution and mitigates nonstationarity. We perform extensive experiments on 13 widely used benchmark datasets. Experimental results demonstrate that CDTF-Mamba achieves superior accuracy while maintaining high efficiency and strong scalability compared with state-of-the-art methods.
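A minimal sketch of the time/frequency split the abstract alludes to: decomposing a series into a smooth low-frequency component and a residual via the FFT. The bin-truncation rule used here is an assumption for illustration, not CDTF-Mamba's actual frequency-domain decomposition.

```python
import numpy as np

def time_frequency_split(x, keep=3):
    """Split a 1-D series into a smooth component (only the lowest
    `keep` frequency bins retained) and a residual. Illustrative of
    dual-domain modeling; the real model's decomposition differs."""
    spec = np.fft.rfft(x)          # real-input FFT
    low = spec.copy()
    low[keep:] = 0                 # zero out all high-frequency bins
    trend = np.fft.irfft(low, n=len(x))
    return trend, x - trend        # reconstruction is lossless by design

# A slow 2-cycle wave plus a fast 13-cycle ripple over 64 samples.
t = np.linspace(0, 1, 64, endpoint=False)
x = np.sin(2 * np.pi * 2 * t) + 0.1 * np.sin(2 * np.pi * 13 * t)
trend, resid = time_frequency_split(x, keep=4)
```

Because the two frequencies fall in distinct FFT bins, the split recovers the slow wave as `trend` and the ripple as `resid`, and the two always sum back to the original series.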
Knowledge-Based Systems, Volume 337, Article 115341
Citations: 0
AdaptTrack: Perception field adaptation with contrastive attention for robust visual tracking
IF 7.6 CAS Tier 1, Computer Science, Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-21 DOI: 10.1016/j.knosys.2026.115369
Yongjun Wang, Xiaohui Hao
While transformer-based methods have advanced visual object tracking, existing approaches often struggle with complex scenarios due to their reliance on fixed perception fields, limited discriminative capabilities, and insufficient predictive modeling. Current solutions utilizing attention mechanisms and feature learning techniques have made progress but face inherent limitations in adapting to dynamic scenes and maintaining robust target discrimination. We propose AdaptTrack, an innovative Transformer-based tracking framework that systematically addresses three critical limitations in existing approaches: suboptimal perception field adaptation for capturing target-specific information, insufficient target-background discrimination in cluttered environments, and inadequate predictive modeling during challenging scenarios. The framework introduces three key technical components: (1) an Adaptive Perception Field Guidance Network that dynamically optimizes feature extraction through scene-aware field configuration, (2) a Contrastive-Guided Contextual Attention mechanism that enhances discrimination through structured contrast learning, and (3) a Predictive State Transition Network that improves robustness via probabilistic state modeling. Through these innovations, our approach effectively addresses the limitations of current methods through dynamic field adaptation, explicit contrast modeling, and robust state prediction. Extensive evaluations demonstrate state-of-the-art performance on seven benchmarks (77.3% AO on GOT-10k, 73.3% AUC on LaSOT, 85.4% AUC on TrackingNet) while maintaining real-time efficiency at 32.6 FPS.
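The target-background discrimination idea can be illustrated by scoring candidate regions against a target template in embedding space. The function name, the cosine-similarity scoring, and the temperature parameter below are hypothetical, not AdaptTrack's actual contrastive attention mechanism.

```python
import numpy as np

def contrastive_scores(template, candidates, temperature=0.1):
    """Softmax-normalized cosine-similarity scores between a target
    template embedding and candidate region embeddings. Sketch only;
    the paper's Contrastive-Guided Contextual Attention differs."""
    t = template / np.linalg.norm(template)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    logits = c @ t / temperature        # sharpened similarities
    logits -= logits.max()              # numerical stability
    p = np.exp(logits)
    return p / p.sum()

# One candidate aligned with the target, one orthogonal, one opposite.
template = np.array([1.0, 0.0])
candidates = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
p = contrastive_scores(template, candidates)
```

The low temperature sharpens the distribution so that nearly all probability mass lands on the target-like candidate, which is the discriminative behavior contrastive learning is meant to encourage.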
Knowledge-Based Systems, Volume 337, Article 115369
Citations: 0