Neurocomputing: Latest Publications
Video tampering detection with forgery trace-aware swin transformer
IF 6.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-07 | DOI: 10.1016/j.neucom.2026.132995
Zhentao Hu , Shengjia Zhang , Fuyi Liu
The proliferation of sophisticated video editing tools has escalated video forgery into a severe security crisis. While existing forensic methods achieve preliminary results in pixel-level tampering localization on conventional datasets, they often struggle with scenarios involving tampering with multiple small objects, a prevalent and challenging setting in real-world forensic tasks that prior studies have largely overlooked. To address this core challenge, we propose a novel universal video forensic vision network (VFVNet). VFVNet employs a dual-stream architecture: a multi-view feature extractor (MVFE) captures diverse low-to-mid-level cues (e.g., texture, artifacts) from multiple perspectives, while a contextual semantic feature extractor (CSFE) models higher-level semantic context and spatial relationships to detect unnatural placements or variations in forensic traces. Fused features from both streams then enter a Swin deep attention module (SDAM), which explores latent feature correlations across scales and locations. SDAM's deep attention refines the representation, amplifying tampering-relevant cues while suppressing background noise. Finally, attention-guided features combined with the initially extracted features enable precise discrimination between authentic and manipulated regions for pixel-level localization. Extensive experiments on three public and self-built datasets demonstrate that VFVNet achieves state-of-the-art performance in pixel-level localization. On average, it delivers relative improvements of 13.6% in F1-score and 14.9% in MCC over the prior best method, while sustaining superior AUC.
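As a concrete illustration of the dual-stream pattern the abstract describes, here is a minimal PyTorch sketch: two parallel extractors stand in for MVFE and CSFE, their fused output is refined by self-attention in place of the SDAM, and a 1x1 head emits per-pixel tamper logits. Every layer choice, name, and shape is an illustrative assumption, not the authors' implementation.

```python
import torch
import torch.nn as nn

class DualStreamFusion(nn.Module):
    """Illustrative dual-stream design: two extractors are fused and
    refined by self-attention, then decoded to a pixel-level mask."""
    def __init__(self, dim=96):
        super().__init__()
        self.mvfe = nn.Conv2d(3, dim, kernel_size=3, padding=1)   # low/mid-level cues
        self.csfe = nn.Conv2d(3, dim, kernel_size=7, padding=3)   # semantic context
        self.fuse = nn.Conv2d(2 * dim, dim, kernel_size=1)
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.head = nn.Conv2d(dim, 1, kernel_size=1)              # per-pixel logit

    def forward(self, x):                        # x: (B, 3, H, W)
        f = self.fuse(torch.cat([self.mvfe(x), self.csfe(x)], dim=1))
        b, c, h, w = f.shape
        tokens = f.flatten(2).transpose(1, 2)    # (B, H*W, C)
        refined, _ = self.attn(tokens, tokens, tokens)
        # Combine attention-guided and initially extracted features.
        f = f + refined.transpose(1, 2).reshape(b, c, h, w)
        return self.head(f)                      # (B, 1, H, W) tamper logits

print(DualStreamFusion()(torch.randn(2, 3, 32, 32)).shape)  # torch.Size([2, 1, 32, 32])
```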
Citations: 0
CBERR: Community-based effective resistance graph rewiring
IF 6.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-07 | DOI: 10.1016/j.neucom.2026.133016
Xiaomian Xu , Yuda Zhang , Jing Li , Zhiyi Li , Zhiguo Zhang
GNNs are the state of the art for graph-structured data, yet their performance is limited by inherent topological flaws such as over-smoothing, over-squashing, and under-reaching. Graph rewiring has emerged as a promising remedy, but existing methods often overemphasize connectivity, overlook community complexity, and incur high computational costs. To overcome these issues, we propose CBERR, a novel rewiring algorithm that synergistically optimizes global and local graph properties via a divide-and-conquer approach. CBERR comprises two complementary modules: (1) an inter-community feature alignment module that enhances global feature-structure consistency by adjusting connections based on node similarity, and (2) an intra-community effective resistance minimization module that strengthens local connectivity and alleviates over-squashing via greedy edge updates. Theoretical analysis reveals intrinsic relationships among community structure, spectral gap, and effective resistance, justifying CBERR's design. Extensive experiments show that CBERR outperforms existing rewiring methods (e.g., DIGL, FoSR, BORF, GTR, ComFy) on tasks including node classification. By operating at the subgraph level, CBERR significantly reduces computation and scales to large graphs. This work advances graph structure learning theory and offers a practical tool for enhancing GNN performance. Future work may incorporate overlapping community detection and non-greedy optimization for broader adaptability.
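The effective-resistance machinery at the heart of the intra-community module is standard: for a graph Laplacian L with pseudoinverse L+, the effective resistance between nodes u and v is R(u, v) = L+[u,u] + L+[v,v] - 2 L+[u,v]. A small sketch of a greedy rewiring step built on that identity follows; the selection rule (add the non-edge with the highest resistance) is a simplified stand-in for CBERR's actual update, and the code assumes NumPy and NetworkX.

```python
import numpy as np
import networkx as nx

def effective_resistance(G):
    """Pairwise effective resistances via the Laplacian pseudoinverse:
    R(u, v) = L+[u, u] + L+[v, v] - 2 * L+[u, v]."""
    L = nx.laplacian_matrix(G).toarray().astype(float)
    Lp = np.linalg.pinv(L)
    d = np.diag(Lp)
    return d[:, None] + d[None, :] - 2 * Lp

def greedy_rewire(G, k=1):
    """Greedily add the non-edge with the highest effective resistance,
    a simplified proxy for relieving over-squashing inside a community."""
    G = G.copy()
    for _ in range(k):
        R = effective_resistance(G)
        _, u, v = max((R[u, v], u, v) for u, v in nx.non_edges(G))
        G.add_edge(u, v)
    return G

G = nx.path_graph(8)                 # a graph that badly over-squashes
print(sorted(greedy_rewire(G, k=2).edges()))
```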
Citations: 0
Assessment of spatio-temporal predictors in the presence of missing and heterogeneous data
IF 6.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-07 | DOI: 10.1016/j.neucom.2026.132963
Daniele Zambon , Cesare Alippi
Deep learning methods achieve remarkable predictive performance in modeling complex, large-scale data. However, assessing the quality of the derived models has become increasingly challenging, as classical statistical assumptions may no longer apply. These difficulties are particularly pronounced for spatio-temporal data, which exhibit dependencies across both space and time and are often characterized by nonlinear dynamics, time variance, and missing observations, hence calling for new accuracy assessment methodologies. This paper introduces a residual correlation analysis framework for assessing the optimality of spatio-temporal relational-enabled neural predictive models, notably in settings with incomplete and heterogeneous data. The framework leverages the principle that residual correlation indicates information not captured by the model, enabling the identification and localization of regions in space and time where predictive performance can be improved. A strength of the proposed approach is that it operates under minimal assumptions, allowing for robust evaluation of deep learning models applied to multivariate time series, even in the presence of missing and heterogeneous data. In detail, the methodology constructs tailored spatio-temporal graphs to encode sparse spatial and temporal dependencies and employs asymptotically distribution-free summary statistics to detect time intervals and spatial regions where the model underperforms. The effectiveness of the proposed framework is demonstrated through experiments on both synthetic and real-world datasets using state-of-the-art predictive models.
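The underlying idea is that a well-specified predictor leaves residuals that are uncorrelated across space and time, so any remaining correlation localizes where the model underperforms. A toy NumPy sketch of such residual statistics follows; the specific statistics and synthetic data are illustrative only, not the paper's distribution-free tests.

```python
import numpy as np

def residual_correlation_map(residuals, adjacency):
    """residuals: (T, N) prediction errors for N sensors over T steps;
    adjacency: (N, N) binary spatial graph. Returns per-node lag-1
    temporal autocorrelation and mean correlation with neighbours."""
    r = residuals - residuals.mean(axis=0, keepdims=True)
    r = r / (r.std(axis=0, keepdims=True) + 1e-8)
    temporal = (r[1:] * r[:-1]).mean(axis=0)              # lag-1 autocorrelation
    neigh_mean = (adjacency @ r.T).T / (adjacency.sum(axis=1) + 1e-8)
    spatial = (r * neigh_mean).mean(axis=0)               # neighbour correlation
    return temporal, spatial

rng = np.random.default_rng(0)
res = rng.normal(size=(200, 5))
res[:, 2] += np.sin(np.linspace(0, 20, 200))   # node 2: unmodelled dynamics
A = np.ones((5, 5)) - np.eye(5)
t, s = residual_correlation_map(res, A)
print(np.round(t, 2))   # node 2 stands out with high autocorrelation
```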
Citations: 0
Enhancing shape bias for object detection
IF 6.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-06 | DOI: 10.1016/j.neucom.2026.132931
Jiwen Tang , Gu Wang , Ruida Zhang , Xiangyang Ji
Convolutional Neural Networks (CNNs) are widely used for object detection, yet recent studies have shown that they rely more on texture than on shape for object recognition, a phenomenon known as texture bias. This bias makes them vulnerable to image corruptions, domain shifts, and adversarial perturbations, posing significant challenges for real-world deployment, especially in safety-critical and industrial applications. Despite its significance, texture bias in object detection remains largely underexplored. To address this gap, we first conduct a comprehensive analysis of texture bias across multiple widely used CNN-based detection architectures, demonstrating the widespread presence and detrimental impact of this issue. Motivated by these findings, we propose a simple yet effective method, TexDrop, to increase shape bias in CNNs and thereby improve their accuracy and robustness. Specifically, TexDrop randomly drops the texture and color of training images through straightforward edge detection, forcing models to learn to detect objects based on their shape and thus increasing shape bias. Unlike prior approaches that require architectural modifications, extensive additional training data, or complex regularization schemes, TexDrop is model-agnostic, easy to integrate into existing training pipelines, and incurs negligible computational overhead. Extensive experiments on Pascal VOC, COCO, and various corrupted COCO datasets demonstrate that TexDrop not only improves detection performance across multiple architectures but also consistently enhances robustness against various image corruptions and texture variations. Our study provides empirical insights into texture dependence in object detectors and contributes a practical solution for developing more robust and reliable object detection systems in real-world applications.
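TexDrop's core operation, replacing texture and color with an edge map, is straightforward to reproduce with standard tools. The sketch below uses OpenCV's Canny detector; the drop probability and thresholds are illustrative guesses rather than the paper's settings.

```python
import cv2
import numpy as np

def texdrop(image, p=0.5, low=100, high=200):
    """With probability p, replace an RGB training image by its Canny
    edge map (replicated to 3 channels), discarding texture and color
    so the model must rely on shape."""
    if np.random.rand() >= p:
        return image
    gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
    edges = cv2.Canny(gray, low, high)            # uint8 edge map in {0, 255}
    return np.repeat(edges[..., None], 3, axis=-1)

img = (np.random.rand(64, 64, 3) * 255).astype(np.uint8)
aug = texdrop(img, p=1.0)
print(aug.shape, aug.dtype)   # (64, 64, 3) uint8
```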
Citations: 0
GeomFlow: Geometry-aware adaptive diffusion model via Hessian information
IF 6.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-06 | DOI: 10.1016/j.neucom.2026.132934
Haozhuo Cao , Xiaorui Wang , Liyang Yu , Wangcai Ding
Score-based generative models have achieved remarkable fidelity by employing stochastic differential equations (SDEs) to progressively transform data into noise. However, standard approaches suffer from a fundamental geometric mismatch: they apply spatially uniform diffusion dynamics across data manifolds that exhibit highly heterogeneous curvature. This "one-size-fits-all" strategy leads to inefficient sampling: over-computing in flat regions while under-resolving complex high-frequency details. To address this, we propose GeomFlow, a novel geometry-aware adaptive diffusion model. GeomFlow synergizes two key mechanisms: a global learnable noise schedule that optimizes the macroscopic noise progression, and a geometric complexity estimator that uses a robust stochastic approximation of the Hessian trace to actively modulate local diffusion strength. Theoretically, we prove that our geometry-aware reverse process is equivalent to Riemannian preconditioned Langevin dynamics, enabling accelerated convergence and better escape from saddle points. Extensive experiments show that GeomFlow achieves highly competitive performance on CIFAR-10 (FID 2.14) and CelebA-HQ, demonstrating superior structural understanding in conditional generation and image inpainting tasks, with significant improvements in preserving semantic consistency and recovering missing texture details.
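The geometric complexity estimator rests on Hutchinson's stochastic trace estimator: for a scalar field f, tr(∇²f(x)) ≈ E_v[vᵀ∇²f(x)v] over random probe vectors v, which needs only Hessian-vector products. A minimal PyTorch sketch follows; how GeomFlow maps this trace to a local diffusion strength is not reproduced here.

```python
import torch

def hutchinson_hessian_trace(f, x, n_samples=8):
    """Estimate tr(d^2 f / dx^2) at x for scalar-valued f using
    Hutchinson's estimator with Rademacher probe vectors."""
    x = x.detach().requires_grad_(True)
    grad = torch.autograd.grad(f(x), x, create_graph=True)[0]
    est = x.new_zeros(())
    for _ in range(n_samples):
        v = (torch.randint(0, 2, x.shape) * 2 - 1).to(x.dtype)  # +/-1 probes
        # Hessian-vector product H @ v via a second backward pass.
        hvp = torch.autograd.grad(grad, x, grad_outputs=v, retain_graph=True)[0]
        est = est + (hvp * v).sum()
    return (est / n_samples).item()

# Sanity check: f(x) = sum(x_i^2) has Hessian 2I, so the trace is 2 * dim.
x = torch.randn(10)
print(hutchinson_hessian_trace(lambda t: (t ** 2).sum(), x))  # 20.0
```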
Citations: 0
CGEM: A cognitive-guided network for human-aligned entity matching
IF 6.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-06 | DOI: 10.1016/j.neucom.2026.132950
Xin Liu, Xiaojun Li, Junping Yao, Yanfei Liu, Qinggang Fan, Haifeng Sun, Chengrong Dong
Deep learning has advanced entity matching (EM), yet limited interpretability is particularly problematic for real-world deployment in decision-support settings, highlighting the need for models that align with human reasoning while retaining strong performance. Existing approaches improve interpretability but rarely reflect how humans make decisions. We propose Cognitive-Guided Entity Matching (CGEM), a human-aligned framework that reconceptualizes EM as a cognitive process rather than a purely technical task. CGEM is grounded in established theories: it introduces complexity-guided gating inspired by Cognitive Load Theory, builds holistic semantic representations based on Frame Semantics, and employs core-attribute reasoning following Cue Validity Theory to ensure that diagnostic features govern final decisions. CGEM thus explicitly models complexity, contextuality, and diagnosticity, which remain underexplored in EM research. Experiments on DeepMatcher benchmarks show that CGEM delivers its strongest improvements on the Amazon–Google, Abt–Buy, iTunes–Amazon, and Walmart–Amazon datasets, yielding gains of up to 9.34% over DITTO (2023) and 5.51% over AttendEM (2024), and further exceeds large language model (LLM)-based EM methods on multiple benchmarks. To the best of our knowledge, CGEM is the first EM framework grounded in cognitive decision-making theories, advancing entity matching with human-aligned reasoning, strong predictive performance, and improved interpretability.
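As a loose illustration of what a complexity-guided gate could look like (the features, thresholds, and routing rule below are invented for exposition and are not CGEM's), one might score a record pair by the agreement of its attribute-level similarities and route conflicting evidence to a heavier reasoning path:

```python
import numpy as np

def complexity_gate(attr_sims, low=0.25, high=0.75):
    """attr_sims: per-attribute similarity scores in [0, 1] for one
    record pair. Confident pairs (uniformly high or low similarity)
    take a cheap path; conflicting evidence defers to deep reasoning."""
    attr_sims = np.asarray(attr_sims, dtype=float)
    mean, spread = attr_sims.mean(), attr_sims.std()
    if spread < 0.15 and (mean > high or mean < low):
        return "fast_path", bool(mean > high)   # easy: threshold decision
    return "deliberate_path", None              # hard: defer to full model

print(complexity_gate([0.95, 0.90, 0.88]))     # ('fast_path', True)
print(complexity_gate([0.95, 0.10, 0.60]))     # ('deliberate_path', None)
```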
Citations: 0
TabNSA: Native sparse attention for efficient tabular data learning
IF 6.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-05 | DOI: 10.1016/j.neucom.2026.132928
Ali Eslamian , Qiang Cheng
Tabular data poses unique challenges for deep learning due to its heterogeneous feature types, lack of spatial structure, and often limited sample sizes. We propose TabNSA, a novel deep learning framework that integrates Native Sparse Attention (NSA) with a TabMixer backbone to efficiently model tabular data. TabNSA tackles computational and representational challenges by dynamically focusing on relevant feature subsets per instance. The NSA module employs a hierarchical sparse attention mechanism, including token compression, selective preservation, and localized sliding windows, to significantly reduce the quadratic complexity of standard attention operations while addressing feature heterogeneity. Complementing this, the TabMixer backbone captures complex, non-linear dependencies through parallel multilayer perceptron (MLP) branches with independent parameters. These modules are synergistically combined via element-wise summation and mean pooling, enabling TabNSA to model both global context and fine-grained interactions. Extensive experiments across supervised and transfer learning settings show that TabNSA consistently outperforms state-of-the-art deep learning models. Furthermore, by augmenting TabNSA with a fine-tuned large language model (LLM), we enable it to effectively address Few-Shot Learning challenges through language-guided generalization on diverse tabular benchmarks. Code available on: https://github.com/aseslamian/TabNSA.
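Each NSA branch (compressed global tokens, a selectively preserved subset, and a local sliding window) can be viewed as a sparsity pattern on the attention matrix. The sketch below builds such a boolean mask in PyTorch; branch sizes are illustrative, and the real NSA fuses branch outputs rather than OR-ing a single mask.

```python
import torch

def nsa_style_mask(n_tokens, window=4, n_global=2, keep=None):
    """Boolean (n_tokens, n_tokens) mask, True = attention allowed.
    Combines a sliding local window, a few always-visible 'compressed'
    global tokens, and an optional selectively preserved token set."""
    idx = torch.arange(n_tokens)
    mask = (idx[:, None] - idx[None, :]).abs() < window   # local window
    mask[:, :n_global] = True                             # everyone sees globals
    mask[:n_global, :] = True                             # globals see everyone
    if keep is not None:
        mask[:, keep] = True                              # preserved tokens stay visible
    return mask

m = nsa_style_mask(12, window=3, n_global=2, keep=torch.tensor([7, 9]))
print(m.float().mean())  # fraction of pairs attended; shrinks as n_tokens grows
# Typical use: scores.masked_fill(~m, float('-inf')) before the softmax.
```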
Citations: 0
Tuning metaheuristic parameters with the use of Large Language Models
IF 6.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-05 | DOI: 10.1016/j.neucom.2026.132976
Alicja Martinek , Ewelina Bartuzi-Trokielewicz , Szymon Łukasik , Amir H. Gandomi
Since exploding in popularity, Large Language Models (LLMs) have had an evident impact on almost every aspect of life. This study examines whether LLMs can be used to tune metaheuristic algorithms through the selection of their parameters. To verify this hypothesis, ten instances each of three well-known combinatorial optimization problems (Graph Coloring, Job-Shop Scheduling, and Traveling Salesman) were solved using heuristic optimizers guided by LLMs, including the genetic algorithm, ant colony optimization, particle swarm optimization, and simulated annealing. Parameter values were generated by prompting several state-of-the-art LLMs with problem complexity descriptors and the set of tunable parameters. A two-stage procedure was employed: an initial run based on general problem characteristics, followed by a feedback run that used performance metrics such as average fitness, variance, and convergence behavior. Default settings from the Python-based Mealpy library served as the baseline for comparison.
Results, aggregated over 900 optimizer runs, show that LLMs are capable of proposing parameter configurations that outperform defaults in terms of final objective value and convergence speed. This effect is particularly pronounced in simulated annealing and Traveling Salesman problem settings. The findings suggest that LLMs possess a high degree of generalization and contextual understanding in the domain of optimization and can serve as practical assistants in heuristic algorithm design and tuning.
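In outline, the pipeline is: describe the problem and its tunable parameters in a prompt, parse the LLM's reply as a parameter dictionary, and hand it to the optimizer. The sketch below uses a hypothetical query_llm stand-in for any chat-completion API, and the Mealpy calls follow our reading of its 3.x interface; treat both as assumptions rather than the paper's exact setup.

```python
import json
from mealpy import FloatVar, GA   # baseline library named in the paper

def query_llm(prompt: str) -> str:
    """Hypothetical stand-in for any chat-completion API call."""
    raise NotImplementedError

def llm_tuned_ga(obj_func, dim, complexity_note):
    prompt = (
        f"Problem: continuous minimisation in {dim} dimensions. "
        f"Complexity: {complexity_note}. "
        'Reply only with JSON for a genetic algorithm: '
        '{"epoch": int, "pop_size": int, "pc": float, "pm": float}.'
    )
    params = json.loads(query_llm(prompt))      # e.g. {"epoch": 200, "pop_size": 80, ...}
    problem = {
        "obj_func": obj_func,                   # maps a solution vector to a fitness
        "bounds": FloatVar(lb=[-10.0] * dim, ub=[10.0] * dim),
        "minmax": "min",
    }
    return GA.BaseGA(**params).solve(problem)   # best agent found

# A feedback run would append this run's fitness statistics to the next prompt.
```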
Citations: 0
WAVE++: Capturing within-task variance for continual relation extraction with adaptive prompting
IF 6.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-05 | DOI: 10.1016/j.neucom.2026.132915
Bao-Ngoc Dao , Minh Le , Quang Nguyen , Luyen Ngo Dinh , Nam Le Hai, Linh Ngo Van
Memory-based approaches have shown strong performance in Continual Relation Extraction (CRE). However, storing examples from previous tasks increases memory usage and raises privacy concerns. Recently, prompt-based methods have emerged as a promising alternative, as they do not rely on storing past samples. Despite this progress, current prompt-based techniques face several core challenges in CRE, particularly in accurately identifying task identities and mitigating catastrophic forgetting. Existing prompt selection strategies often suffer from inaccuracies, lack robust mechanisms to prevent forgetting in shared parameters, and struggle to handle both cross-task and within-task variations. In this paper, we propose WAVE++, a novel approach inspired by the connection between prefix-tuning and mixture of experts. Specifically, we introduce task-specific prompt pools that enhance flexibility and adaptability across diverse tasks while avoiding boundary-spanning risks; this design more effectively captures both within-task and cross-task variations. To further refine relation classification, we incorporate label descriptions that provide richer, more global context, enabling the model to better distinguish among different relations. We also propose a training-free mechanism to improve task prediction during inference. Moreover, we integrate a generative model to consolidate prior knowledge within the shared parameters, thereby removing the need for explicit data storage. Extensive experiments demonstrate that WAVE++ outperforms state-of-the-art prompt-based and rehearsal-based methods, offering a more robust solution for continual relation extraction. Our code is publicly available at https://github.com/PiDinosauR2804/WAVE-CRE-PLUS-PLUS.
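The task-specific prompt pools can be pictured as query-key matching: embed the input, select the closest prompt keys within a task's pool, and prepend the matched prompt vectors as prefix tokens. The PyTorch sketch below uses cosine similarity and top-k selection in the style of L2P-like methods; all sizes and the selection rule are illustrative assumptions, not WAVE++'s implementation.

```python
import torch
import torch.nn.functional as F

class PromptPool(torch.nn.Module):
    """One pool per task: learnable keys select which prompt vectors
    are prepended to the encoder input for a given example."""
    def __init__(self, n_tasks=5, pool_size=10, prompt_len=4, dim=768, top_k=2):
        super().__init__()
        self.keys = torch.nn.Parameter(torch.randn(n_tasks, pool_size, dim))
        self.prompts = torch.nn.Parameter(
            torch.randn(n_tasks, pool_size, prompt_len, dim))
        self.top_k = top_k

    def forward(self, query, task_id):           # query: (B, dim)
        sims = F.cosine_similarity(               # (B, pool_size)
            query[:, None, :], self.keys[task_id][None], dim=-1)
        idx = sims.topk(self.top_k, dim=-1).indices          # (B, top_k)
        picked = self.prompts[task_id][idx]       # (B, top_k, prompt_len, dim)
        return picked.flatten(1, 2)               # (B, top_k*prompt_len, dim)

pool = PromptPool()
prefix = pool(torch.randn(3, 768), task_id=2)
print(prefix.shape)   # torch.Size([3, 8, 768]) -> prepend to the token sequence
```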
Citations: 0
The double-edged sword: A critical review of foundational medical datasets for AI benchmarks, biases, and the future of equitable healthcare
IF 6.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-05 | DOI: 10.1016/j.neucom.2026.132919
Rabie A. Ramadan , Nadim K.M. Madi , Sallam O.F. Khairy , Kamal Aldin Yousif , Muataz Salam Al-Daweri , Alrajhi Waleed Khalid
The advancement of Artificial Intelligence (AI) has revolutionized medical diagnostics and treatment, with large-scale public datasets fueling research in this field. This systematic review therefore provides a comprehensive analysis of 13 foundational medical datasets, evaluating their characteristics, performance metrics, and inherent biases across medical imaging, electronic health records, and genomics. The published literature is systematically reviewed to categorize these datasets, with a focus on performance metrics for common machine learning tasks. Additionally, this research documents evidence of systemic bias and limitations that affect model generalizability and clinical equity. Our analysis reveals compelling evidence that significant limitations temper the field's remarkable algorithmic progress: AI models frequently suffer dramatic accuracy drops when tested beyond their training distribution, with the Area Under the Curve consistently declining from 0.95 to 0.63. The research also identified consistent patterns of systemic bias that threaten the equitable application of healthcare, stemming from unrepresentative sampling, subjective annotation practices, label noise, and Natural Language Processing-derived ground-truth labels. Our findings demonstrate the urgent need for a paradigm shift in the development of medical AI applications: the AI and medical communities must prioritize generating diverse datasets and mitigating systematic bias. This study provides evidence-based recommendations and a technical toolkit to address these challenges and reduce health disparities.
Citations: 0