
Latest Publications in Knowledge-Based Systems

CoGMoE: Sparse and specialized framework for multi-agent collaborative perception via graph mixture-of-experts
IF 7.6 · CAS Zone 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-03-25 · Epub Date: 2026-01-14 · DOI: 10.1016/j.knosys.2026.115329
Xingpeng Li , Enwen Hu , Siyuan Jin , Baoding Zhou , Jingrong Liu
Multi-agent collaborative perception significantly improves autonomous driving safety by sharing complementary information to overcome individual limitations caused by occlusions. A primary goal is to navigate the critical trade-off between perception performance and communication bandwidth. However, existing methods struggle to achieve this balance, treating all information equally without considering each agent’s specific situation. To address this issue, this study proposes CoGMoE, a novel collaborative perception method that models V2V communication as a structured, hierarchical reasoning process. Specifically, CoGMoE provides three distinct advantages: i) it selects a sparse set of semantically salient keypoints from each vehicle, significantly reducing communication overhead while preserving important information; ii) it constructs a hierarchical communication graph that establishes direct alignment links between the corresponding position areas of different vehicles, explicitly separating them from the internal links used for context reasoning; and iii) it uses a graph mixture-of-experts (GraphMoE) architecture governed by multi-round expert deliberation to dynamically assign experts to each link type, achieving superior robustness through iterative feature refinement. Extensive experiments on both simulated and real-world datasets demonstrate that our proposed CoGMoE outperforms state-of-the-art collaborative perception methods in the trade-off between detection accuracy and communication bandwidth.
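As a concrete illustration of link-type-specific expert routing, the sketch below shows a minimal graph mixture-of-experts layer in PyTorch that gates each edge feature through a sparse top-k mixture, with a separate gating head per link type (alignment vs. internal). This is a hedged sketch under our own assumptions, not the authors' implementation; the module names, dimensions, and top-k routing scheme are illustrative.

```python
# Minimal sketch (assumption-based, not the CoGMoE release) of per-link-type
# sparse expert routing in a graph mixture-of-experts layer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinkTypeMoE(nn.Module):
    def __init__(self, dim, num_experts=4, num_link_types=2, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
             for _ in range(num_experts)])
        # one gating head per link type (e.g., 0 = alignment, 1 = internal)
        self.gates = nn.ModuleList(
            [nn.Linear(dim, num_experts) for _ in range(num_link_types)])
        self.top_k = top_k

    def forward(self, edge_feat, link_type):
        # edge_feat: (E, dim) edge features; link_type: (E,) integer type ids
        out = torch.zeros_like(edge_feat)
        for t, gate in enumerate(self.gates):
            mask = link_type == t
            if not mask.any():
                continue
            x = edge_feat[mask]
            weights, idx = gate(x).topk(self.top_k, dim=-1)  # sparse routing
            weights = F.softmax(weights, dim=-1)
            y = torch.zeros_like(x)
            for k in range(self.top_k):
                for j, expert in enumerate(self.experts):
                    sel = idx[:, k] == j
                    if sel.any():
                        y[sel] += weights[sel, k:k + 1] * expert(x[sel])
            out[mask] = y
        return out
```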
Citations: 0
Multimodal summarization via coarse-and-fine granularity synergy and region counterfactual reasoning filter
IF 7.6 · CAS Zone 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-03-25 · Epub Date: 2026-01-24 · DOI: 10.1016/j.knosys.2026.115356
Rulong Liu , Qing He , Yuji Wang , Nisuo Du , Zhihao Yang
Multimodal Summarization (MS) generates high-quality summaries by integrating textual and visual information. However, existing MS research faces several challenges, including (1) ignoring fine-grained key information between the visual and textual modalities and its interaction with coarse-grained information, (2) cross-modal semantic inconsistency, which hinders the alignment and fusion of visual and textual feature spaces, and (3) ignoring the inherent heterogeneity of an image when filtering visual information, which causes excessive filtering or excessive retention. To address these issues, we propose the Coarse-and-Fine Granularity Synergy and Region Counterfactual Reasoning Filter (CFCR) for MS. Specifically, we design Coarse-and-Fine Granularity Synergy (CFS) to capture both global (coarse-grained) and important detailed (fine-grained) information in the text and image modalities. Based on this, we design Dual-granularity Contrastive Learning (DCL) for mapping coarse-grained and fine-grained visual features into the text semantic space, thereby reducing the semantic inconsistency caused by modality differences at dual granularity levels and facilitating cross-modal alignment. To address the issue of excessive filtering or excessive retention in visual information filtering, we design a Region Counterfactual Reasoning Filter (RCF) that employs counterfactual reasoning to determine the validity of image regions and generate category labels. These labels are then used to train an Image Region Selector to select regions beneficial for summarization. Extensive experiments on the representative MMSS and MSMO datasets show that CFCR outperforms multiple strong baselines, particularly in terms of selecting and focusing on critical details, demonstrating its effectiveness in MS.
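To make the dual-granularity alignment concrete, the sketch below applies a standard symmetric InfoNCE objective at both coarse and fine granularity, mapping visual features toward the text space. It is a plausible, assumption-based reading of DCL rather than the paper's exact loss; the weighting lam and the temperature are illustrative.

```python
# Assumption-based sketch of a dual-granularity contrastive loss (not the
# published DCL): symmetric InfoNCE at coarse and fine granularity.
import torch
import torch.nn.functional as F

def info_nce(visual, text, temperature=0.07):
    """Symmetric InfoNCE: matched (visual, text) pairs share a batch index."""
    v = F.normalize(visual, dim=-1)
    t = F.normalize(text, dim=-1)
    logits = v @ t.T / temperature                 # (B, B) similarity matrix
    labels = torch.arange(v.size(0), device=v.device)
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.T, labels))

def dual_granularity_loss(v_coarse, t_coarse, v_fine, t_fine, lam=0.5):
    # align visual features to the text space at both granularities
    return lam * info_nce(v_coarse, t_coarse) + (1 - lam) * info_nce(v_fine, t_fine)
```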
Citations: 0
Dual prompts guided cross-domain transformer for unified day-night image dehazing
IF 7.6 · CAS Zone 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-03-25 · Epub Date: 2026-01-20 · DOI: 10.1016/j.knosys.2026.115362
Jianlei Liu , Jiaming Niu , Xiang Chen , Yuting Pang , Shilong Wang
Although considerable progress has been made in image dehazing, most existing methods are constrained to a single degradation type or specific haze pattern. However, in real-world environments, haze manifests in diverse forms owing to variations in illumination, day-night transitions, and other coupled degradation factors. To address these challenges, we formulate a new task, unified day-night image dehazing (UDND), which aims to restore haze-degraded images across daytime and nighttime conditions within a single unified framework. For this task, we propose UDNDformer, a cross-domain Transformer guided by dual-prompt learning, which integrates both hard prompt learning (HPL) and soft prompt learning (SPL). The HPL module reconstructs the scene before encoding transferable haze representations in a frozen form, ensuring consistent degradation modeling across domains. By contrast, the SPL module employs learnable tensors that interact with encoded features to adaptively capture temporal haze variations and dynamically modulate restoration during decoding for condition-aware guidance. This dual-prompt design enables UDNDformer to achieve adaptive haze perception and flexible degradation modeling under diverse illumination conditions, thereby markedly enhancing restoration quality in unified day-night scenarios. Extensive experimentation demonstrates that UDNDformer consistently outperforms state-of-the-art methods across multiple day-night benchmarks and yields notable improvements in downstream vision tasks, validating its effectiveness and strong generalizability to real-world applications.
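One plausible, assumption-based reading of the SPL module is sketched below: a small bank of learnable prompt tokens queries the encoded features via cross-attention and emits a per-channel modulation of the decoder input. The design (cross-attention plus sigmoid gating) is our illustration, not the published architecture.

```python
# Hypothetical sketch of soft prompt learning (not the UDNDformer code):
# learnable prompts attend over encoder features and gate them channel-wise.
import torch
import torch.nn as nn

class SoftPromptModulator(nn.Module):
    def __init__(self, dim, num_prompts=8, heads=4):
        super().__init__()
        self.prompts = nn.Parameter(torch.randn(num_prompts, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.to_scale = nn.Linear(dim, dim)

    def forward(self, feat):
        # feat: (B, N, dim) flattened spatial features from the encoder
        B = feat.size(0)
        p = self.prompts.unsqueeze(0).expand(B, -1, -1)        # (B, P, dim)
        ctx, _ = self.attn(p, feat, feat)                      # prompts query features
        scale = torch.sigmoid(self.to_scale(ctx.mean(dim=1)))  # (B, dim) gate
        return feat * scale.unsqueeze(1)                       # condition-aware modulation
```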
Citations: 0
PoseDefCycleGAN: Identity-preserving face frontalization with deformable convolutions and pose-aware supervision
IF 7.6 · CAS Zone 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-03-25 · Epub Date: 2026-01-27 · DOI: 10.1016/j.knosys.2026.115358
Shakeel Muhammad Ibrahim , Shujaat Khan , Young-Woong Ko , Jeong-Gun Lee
Face recognition systems have achieved impressive accuracy in controlled environments but continue to face challenges under extreme pose variations. To address this limitation, we propose a novel face frontalization framework, PoseDefCycleGAN, that combines the strengths of CycleGAN, deformable convolution, and pose-guided supervision. Our method leverages deformable convolution in the final layer of the generator to dynamically adapt the receptive field, enabling better reconstruction of complex facial geometries. Additionally, we incorporate a lightweight pose classification network to enforce pose-aware regularization, encouraging the generation of semantically consistent frontal images. The proposed model is trained using unpaired data and optimized with a combination of adversarial, cycle consistency, identity-preserving, and pose regularization losses. Extensive experiments on the MultiPIE, AFW, and LFW datasets demonstrate that the method improves both visual fidelity and face recognition, particularly at extreme yaw angles: on MultiPIE we reduce FID to 15.90 (from 18.32 with CycleGAN) and achieve 98.9% rank-1 accuracy at ±90°; on LFW we obtain 90.20% accuracy with LPIPS = 0.3052. Quantitative evaluations further validate the contribution of deformable convolutions and pose supervision. Our work presents a robust solution for pose-invariant face recognition and establishes a strong benchmark for identity-preserving face frontalization. The model implementation is available on the author’s GitHub page: https://github.com/Shak97/PoseDefCycleGAN.
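To illustrate the two key ingredients, the sketch below pairs a deformable final generator layer, built on torchvision.ops.DeformConv2d with offsets predicted from the incoming features, with a pose-classification term added to the usual CycleGAN objectives. Layer sizes and loss weights are assumptions rather than the authors' settings.

```python
# Assumption-based sketch (not the released PoseDefCycleGAN): a deformable
# output head for the generator plus a pose-aware total loss.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformableOutputHead(nn.Module):
    """Final generator layer: 3x3 deformable conv whose sampling offsets
    (2 coords per kernel position) are predicted from the features."""
    def __init__(self, in_ch, out_ch=3):
        super().__init__()
        self.offset = nn.Conv2d(in_ch, 2 * 3 * 3, kernel_size=3, padding=1)
        self.deform = DeformConv2d(in_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, x):
        return torch.tanh(self.deform(x, self.offset(x)))

def total_loss(adv, cyc, idt, pose_logits, pose_target,
               lam_cyc=10.0, lam_idt=5.0, lam_pose=1.0):
    # pose regularization: generated frontal images should classify as frontal
    pose = nn.functional.cross_entropy(pose_logits, pose_target)
    return adv + lam_cyc * cyc + lam_idt * idt + lam_pose * pose
```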
Citations: 0
Affective computing in the era of large language models: A survey from the NLP perspective
IF 7.6 · CAS Zone 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-03-25 · Epub Date: 2026-01-25 · DOI: 10.1016/j.knosys.2026.115411
Yiqun Zhang , Xiaocui Yang , Xingle Xu , Zeran Gao , Yijie Huang , Shiyi Mu , Shi Feng , Daling Wang , Yifei Zhang , Kaisong Song , Ge Yu
Affective Computing (AC) integrates computer science, psychology, and cognitive science to enable machines to recognize, interpret, and simulate human emotions across domains such as social media, finance, healthcare, and education. AC commonly centers on two task families: Affective Understanding (AU) and Affective Generation (AG). While fine-tuned pre-trained language models (PLMs) have achieved solid AU performance, they often generalize poorly across tasks and remain limited for AG, especially in producing diverse, emotionally appropriate responses. The advent of Large Language Models (LLMs) (e.g., ChatGPT and LLaMA) has catalyzed a paradigm shift by offering in-context learning, broader world knowledge, and stronger sequence generation. This survey presents a Natural Language Processing (NLP)-oriented overview of AC in the LLM era. We (i) consolidate traditional AC tasks and preliminary LLM-based studies; and (ii) review adaptation techniques that improve AU/AG, including Instruction Tuning (full and parameter-efficient methods), Prompt Engineering (zero/few-shot, chain-of-thought, and agent-based prompting), and Reinforcement Learning (RL). For the latter, we summarize RL from human preferences, verifiable/programmatic rewards, and model feedback, which provide preference- or rule-grounded optimization signals that can help steer AU/AG toward empathy, safety, and planning, achieving finer-grained or multi-objective control. To assess progress, we compile benchmarks and evaluation practices for both AU and AG. We also discuss open challenges, from ethics, data quality, and safety to robust evaluation and resource efficiency, and outline research directions. We hope this survey clarifies the landscape and offers practical guidance for building affect-aware, reliable, and responsible LLM systems.
Citations: 0
Multi-contrast feature cross entanglement network for joint MR image reconstruction and super-resolution
IF 7.6 · CAS Zone 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-03-25 · Epub Date: 2026-01-19 · DOI: 10.1016/j.knosys.2026.115368
Guoqing Ge , Weisheng Li , Yucheng Shu , Xiaoyu Qiao
Reconstruction and super-resolution (SR) provide effective solutions for accelerating multi-contrast magnetic resonance (MR) imaging by leveraging auxiliary contrast information to restore the target contrast from an undersampled counterpart. Although recent advances have explored the joint optimization of reconstruction and SR, most existing frameworks still adopt shallow concatenation or independent decoding branches, thereby failing to fully exploit the inherent complementarity and hierarchical correlations between the two tasks. Additionally, auxiliary contrast information is typically integrated in an isotropic and coarse-grained manner, neglecting directional and structure-specific dependencies across anatomical regions, thus weakening its ability to provide discriminative guidance for target contrast reconstruction. To address these limitations, we propose a multi-contrast feature cross entanglement network (MFCE-Net) that facilitates comprehensive feature interaction across modalities and tasks. In detail, we first introduce a multi-branch feature guidance module to facilitate multi-scale and direction-aware feature transfer across modalities. Furthermore, within the designed top-down architecture, we incorporate an attention mechanism, realized through a feature representation enhancement module, that allows the SR branch to capture global structures while preserving fine textures. Finally, we design a feature entanglement interaction (FEI) module that employs a cross-weighting mechanism across spatial and channel dimensions to facilitate deep feature sharing and mutual reinforcement between the reconstruction and SR tasks. Extensive experiments against various advanced multi-contrast MR imaging methods are conducted on fastMRI, BraTS2019, and clinical in-house datasets, and the results demonstrate the superiority of our model. The code is released at https://github.com/coolggq/MFCE-Net.
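One way to read the FEI module's cross-weighting is as mutual gating, where each branch derives channel and spatial attention maps from the other branch's features. The sketch below is an assumption-based illustration of that idea, not the published module.

```python
# Hypothetical sketch of cross-weighted feature entanglement between the
# reconstruction and SR branches (not the released MFCE-Net FEI module).
import torch
import torch.nn as nn

class FeatureEntanglementInteraction(nn.Module):
    def __init__(self, ch):
        super().__init__()
        # channel gates: squeeze the *other* branch, then gate this one
        self.ch_a = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                  nn.Conv2d(ch, ch, 1), nn.Sigmoid())
        self.ch_b = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                  nn.Conv2d(ch, ch, 1), nn.Sigmoid())
        # spatial gates derived from the other branch's feature map
        self.sp_a = nn.Sequential(nn.Conv2d(ch, 1, 7, padding=3), nn.Sigmoid())
        self.sp_b = nn.Sequential(nn.Conv2d(ch, 1, 7, padding=3), nn.Sigmoid())

    def forward(self, f_rec, f_sr):
        # channel cross-weighting: SR gates reconstruction channels, and vice versa
        f_rec2 = f_rec * self.ch_a(f_sr)
        f_sr2 = f_sr * self.ch_b(f_rec)
        # spatial cross-weighting
        f_rec2 = f_rec2 * self.sp_a(f_sr)
        f_sr2 = f_sr2 * self.sp_b(f_rec)
        return f_rec2, f_sr2
```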
Citations: 0
Mutual masked image consistency and feature adversarial training for semi-supervised medical image segmentation
IF 7.6 · CAS Zone 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-03-25 · Epub Date: 2026-01-23 · DOI: 10.1016/j.knosys.2026.115349
Wei Li , Linye Ma , Wenyi Zhao , Huihua Yang
Semi-supervised medical image segmentation (SSMIS) aims to alleviate the burden of extensive pixel/voxel-wise annotations by effectively leveraging unlabeled data. While prevalent approaches relying on pseudo-labeling or consistency regularization have shown promise, they are often prone to confirmation bias due to limited feature diversity. Furthermore, existing mixed sampling strategies utilized to expand the training scale frequently generate synthetic data that deviates from real-world distributions, potentially misleading the learning process. To address these challenges, we introduce a novel framework called Mutual Masked Image Consistency and Feature Adversarial Training (MCFAT-Net). Our approach enhances model diversity through a multi-perspective strategy, fostering global-local consistency to improve generalization. Specifically, MCFAT-Net comprises a shared encoder and dual classifiers that leverage Mutual Feature Adversarial Training to inject perturbations, ensuring sub-network divergence and decision boundary smoothness. Moreover, we integrate a dual-level data augmentation strategy: Cross-Set CutMix operating at the inter-sample level to capture global dataset structures, and Mutual Masked Image Consistency operating at the intra-sample level to refine fine-grained local representations. This combination enables the simultaneous capture of pairwise structures across the entire dataset and individual part-object relationships. Extensive experiments on three public datasets demonstrate that MCFAT-Net achieves superior performance compared to state-of-the-art methods.
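The inter-sample augmentation can be illustrated with standard CutMix geometry applied across sets: a random box from an unlabeled image is pasted into a labeled one, and the returned mask lets the training loss mix supervision accordingly. This is a minimal sketch under our own assumptions; the paper's box sampling and mask handling may differ.

```python
# Assumption-based sketch of Cross-Set CutMix (not the MCFAT-Net code):
# standard CutMix box geometry applied between labeled and unlabeled batches.
import torch

def cross_set_cutmix(x_labeled, x_unlabeled, alpha=1.0):
    """Paste a random box from an unlabeled image into a labeled one;
    returns the mixed batch and a binary mask (1 = unlabeled region)."""
    B, _, H, W = x_labeled.shape
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    cut_h, cut_w = int(H * (1 - lam) ** 0.5), int(W * (1 - lam) ** 0.5)
    cy, cx = torch.randint(H, (1,)).item(), torch.randint(W, (1,)).item()
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, H)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, W)
    mask = torch.zeros(B, 1, H, W, device=x_labeled.device)
    mask[:, :, y1:y2, x1:x2] = 1.0
    mixed = x_labeled * (1 - mask) + x_unlabeled * mask
    return mixed, mask
```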
Citations: 0
FedCLIP-Distill: Heterogeneous federated cross-modal knowledge distillation for multi-domain visual recognition
IF 7.6 · CAS Zone 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-03-25 · Epub Date: 2026-01-22 · DOI: 10.1016/j.knosys.2026.115383
Yuankun Xia, Hui Wang, Yufeng Zhou
Federated learning (FL) for multi-domain visual recognition confronts significant challenges due to heterogeneous data distributions and domain shifts, which severely impair the semantic generalization capability of existing methods. To address these challenges, we propose FedCLIP-Distill, a novel framework that employs dual-domain knowledge distillation (KD) and contrastive relational distillation (CRD) to leverage the powerful visual-language alignment of CLIP in heterogeneous FL environments. Our approach employs a centralized CLIP teacher model to distill robust visual-textual semantics into lightweight client-side student models, thereby enabling effective local domain adaptation. We provide a theoretical convergence analysis proving that our distillation mechanism effectively mitigates domain gaps and facilitates robust convergence under non-IID settings. Extensive experiments on the Office-Caltech10 and DomainNet benchmarks show that FedCLIP-Distill outperforms other methods: it achieves an average cross-domain accuracy of 98.5% on Office-Caltech10 and 80.50% on DomainNet, and it remains superior under varying degrees of heterogeneity (e.g., with Dirichlet α = 0.5 it is 9.52% higher than FedCLIP), demonstrating significant improvements in accuracy and generalization under heterogeneous scenarios. The source code is available at https://github.com/Yuankun-Xia/FedCLIP-Distill.
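The client-side objective is presumably a variant of temperature-scaled knowledge distillation; the sketch below shows that generic form, where teacher_logits would come from CLIP image-text similarities over the client's classes. The temperature and mixing weight are illustrative assumptions, not the paper's hyperparameters.

```python
# Generic temperature-scaled KD sketch (assumption: one plausible form of the
# FedCLIP-Distill client loss, not the authors' exact objective).
import torch
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Soften both logit sets with temperature T, match them with KL
    divergence, and mix with the ordinary supervised cross-entropy."""
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                  F.softmax(teacher_logits / T, dim=-1),
                  reduction="batchmean") * (T * T)  # T^2 rescales gradients
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce
```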
Citations: 0
Temporal householder transformation embedding for temporal knowledge graph completion
IF 7.6 · CAS Zone 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-03-25 · Epub Date: 2026-01-24 · DOI: 10.1016/j.knosys.2026.115406
Zhiyu Xu , Kai Lin , Pengpeng Qiu , Tong Shen , Fu Zhang
Knowledge Graph Embedding (KGE) has been widely used to address the incompleteness of Knowledge Graph (KG) by predicting missing facts. Temporal Knowledge Graph Embedding (TKGE) extends KGE by incorporating temporal information into fact representations. However, most existing research focuses on static graphs and ignores the temporal dynamics of facts in TKG, which poses significant challenges for link prediction. Furthermore, current TKGE models still struggle with effectively capturing and representing crucial relation patterns, including symmetry, antisymmetry, inversion, composition, and temporal, along with complex relation mapping properties like 1-to-N, N-to-1, and N-to-N. To overcome these challenges, we propose a Temporal Householder Transformation Embedding model called TeHTE, which fuses temporal information with Householder transformation to capture both static and temporal features within TKG effectively. In the static module, TeHTE constructs static entity embeddings by reflecting the head entity through a transfer matrix and represents each relation with a pair of vectors to capture relational semantics. In the temporal module, TeHTE integrates temporal information into the entity representation through the time transfer matrix and shared time window, thereby enhancing its ability to capture temporal features. To further enhance modeling capacity, TeHTE learns a set of Householder transformations parameterized by relations to obtain structural embeddings for entities. Moreover, we theoretically demonstrate the ability of TeHTE to model various relation patterns and mapping properties. Experimental results on four benchmark datasets indicate that TeHTE substantially surpasses most existing TKGE approaches on temporal link prediction tasks. Ablation studies further validate the contribution of each component within the TeHTE framework.
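The algebra that makes Householder transformations attractive for structural embeddings is easy to check numerically: each reflection H = I - 2vv^T/||v||^2 is orthogonal (hence norm-preserving and invertible), and a product of reflections remains orthogonal, which lets relation-parameterized reflections act as well-behaved entity transforms. The following minimal NumPy check is our own illustration, not TeHTE code.

```python
# Numerical check of Householder reflection properties (illustration only).
import numpy as np

def householder(v):
    """H = I - 2 v v^T / ||v||^2: an orthogonal reflection across the
    hyperplane with normal v (so H @ v == -v and H @ H == I)."""
    v = v / np.linalg.norm(v)
    return np.eye(len(v)) - 2.0 * np.outer(v, v)

rng = np.random.default_rng(0)
v = rng.normal(size=4)
H = householder(v)
assert np.allclose(H @ v, -v)           # reflects its own normal vector
assert np.allclose(H @ H, np.eye(4))    # involution => orthogonal, norm-preserving

# a relation-specific transform as a product of several reflections
Hs = [householder(rng.normal(size=4)) for _ in range(3)]
R = Hs[0] @ Hs[1] @ Hs[2]
assert np.allclose(R @ R.T, np.eye(4))  # the composition stays orthogonal
```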
Citations: 0
SCSNNet: Siamese convolutional spiking neural network for childhood medulloblastoma detection using microscopic images
IF 7.6 · CAS Zone 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-03-25 · Epub Date: 2026-01-16 · DOI: 10.1016/j.knosys.2026.115357
Ramesh Kumar Ramaswamy , Aruna Rajendiran , J. Jude Moses Anto Devakanth , Santhosh Kumar Balan
Childhood medulloblastoma (CMB) is a highly dangerous brain tumor that predominantly affects children and carries a notable mortality rate. Traditionally, the standard approach for diagnosis has involved histopathology. However, histopathology is complex, time-consuming, and demands specialized expertise, which increases the risk of misdiagnosis. Thus, a new model named Siamese Convolutional Spiking Neural Network (SCSNNet) is implemented to reduce misdiagnosis. Microscopic images sourced from the IEEE Data Port serve as the input, and denoising is performed with the Wiener filter. Then, the denoised image is segmented by the EffiSegNet technique. The system then derives essential characteristics from the input images, employing the Location Directional Number (LDN) combined with Haar wavelet analysis and histogram-based descriptors. These extracted features are then forwarded to the classification stage, where the proposed SCSNNet framework operates. SCSNNet integrates the strengths of a Siamese Convolutional Neural Network (SCNN) with those of a Deep Spiking Neural Network (DSNN), enabling robust identification of childhood medulloblastoma. The model distinguishes between healthy tissue and CMB cases, achieving strong performance with an accuracy of 92.28%, a True Positive Rate (TPR) of 93.21%, and a True Negative Rate (TNR) of 91.48% when evaluated on k-group 9.
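For reference, the Wiener-filter denoising step can be approximated with SciPy's adaptive implementation, scipy.signal.wiener; the per-channel wrapper below is a minimal sketch, and the window size is an assumption rather than the paper's setting.

```python
# Minimal sketch of Wiener-filter preprocessing (assumed window size; not the
# paper's pipeline). scipy.signal.wiener applies an adaptive local filter.
import numpy as np
from scipy.signal import wiener

def denoise_microscopy(img, window=5):
    """Adaptive Wiener denoising, applied per channel.
    img: float array in [0, 1], shape (H, W) or (H, W, C)."""
    img = img.astype(np.float64)
    if img.ndim == 2:
        return wiener(img, mysize=window)
    return np.stack([wiener(img[..., c], mysize=window)
                     for c in range(img.shape[-1])], axis=-1)
```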
Citations: 0