
Latest Publications in Neurocomputing

Dynamic patch selection and dual-granularity alignment for cross-modal retrieval
IF 6.5 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-09 | DOI: 10.1016/j.neucom.2026.132999
Zhenghui Luo, Min Meng, Jigang Wu
Cross-modal retrieval aims to establish semantic associations between heterogeneous modalities, among which image-text retrieval is a key application scenario that seeks to achieve efficient semantic alignment between images and texts. Existing approaches often rely on fixed patch selection strategies for fine-grained alignment. However, such static strategies struggle to adapt to complex scene variations. Moreover, fine-grained alignment methods tend to fall into local optima by overemphasizing local feature details while neglecting global semantic context. Such limitations significantly hinder both retrieval accuracy and generalization performance. To address these challenges, we propose a Dynamic Patch Selection and Dual-Granularity Alignment (DPSDGA) framework that jointly enhances global semantic consistency and local feature interactions for robust cross-modal alignment. Specifically, we introduce a dynamic sparse module that adaptively adjusts the number of retained visual patches based on scene complexity, effectively filtering redundant information while preserving critical semantic features. Furthermore, we design a dual-granularity alignment mechanism, which combines global contrastive learning with local fine-grained alignment to enhance semantic consistency across modalities. Extensive experiments on two benchmark datasets, Flickr30k and MS-COCO, demonstrate that our method significantly outperforms existing approaches in image-text retrieval.
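For illustration only, the following is a minimal PyTorch sketch of how a dual-granularity objective of this kind could be written: a global contrastive (InfoNCE-style) term over pooled image and text embeddings plus a local term that matches each text token to its most similar retained visual patch. The tensor shapes, the temperature tau, and the weighting factor lam are assumptions for this sketch, not the authors' implementation.

import torch
import torch.nn.functional as F

def dual_granularity_loss(img_global, txt_global, img_patches, txt_tokens,
                          tau=0.07, lam=0.5):
    """Illustrative dual-granularity alignment loss (assumed formulation).

    img_global:  (B, D)    global image embeddings
    txt_global:  (B, D)    global text embeddings
    img_patches: (B, P, D) retained visual patch embeddings
    txt_tokens:  (B, T, D) word/token embeddings
    """
    # Global granularity: symmetric InfoNCE over the batch.
    ig = F.normalize(img_global, dim=-1)
    tg = F.normalize(txt_global, dim=-1)
    logits = ig @ tg.t() / tau                       # (B, B) similarity matrix
    labels = torch.arange(ig.size(0), device=ig.device)
    global_loss = 0.5 * (F.cross_entropy(logits, labels) +
                         F.cross_entropy(logits.t(), labels))

    # Local granularity: each token attends to its best-matching patch.
    ip = F.normalize(img_patches, dim=-1)
    tt = F.normalize(txt_tokens, dim=-1)
    sim = torch.einsum('bpd,btd->bpt', ip, tt)       # (B, P, T) patch-token sims
    local_score = sim.max(dim=1).values.mean(dim=1)  # best patch per token, averaged
    local_loss = (1.0 - local_score).mean()          # pull matched pairs together

    return global_loss + lam * local_loss

In a DPSDGA-style pipeline, img_patches would be the subset of patches retained by the dynamic sparse module rather than the full patch grid.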
Citations: 0
Reconstruction error-based anomaly detection with few outlying examples
IF 6.5 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-09 | DOI: 10.1016/j.neucom.2026.133002
Fabrizio Angiulli, Fabio Fassetti, Luca Ferragina
Reconstruction error-based neural architectures constitute a classical deep learning approach to anomaly detection that has shown strong performance. The approach consists of training an Autoencoder to reconstruct a set of examples deemed to represent normality and then flagging as anomalies those data points that exhibit a sufficiently large reconstruction error. Unfortunately, these architectures often become able to reconstruct the anomalies in the data well, too. This phenomenon is more evident when there are anomalies in the training set. In particular, when these anomalies are labeled, a setting called semi-supervised, the best way to train Autoencoders is to ignore the anomalies and minimize the reconstruction error on normal data.
When a sufficiently large and representative set of anomalous examples is available, the problem essentially shifts toward a classification task, where standard supervised strategies can be applied effectively. In this work, instead, we focus on the more challenging scenario in which only a limited number of anomalous examples is available, and these examples are not sufficiently representative of the wide variability that anomalies may exhibit.
We propose AE-SAD, a novel reconstruction error-based architecture that explicitly leverages labeled anomalies to guide the model. Our method introduces a new loss formulation that forces anomalies to be reconstructed according to a transformation function, effectively pushing them outside the description of normal data. This strategy increases the separation between the reconstruction errors of normal and anomalous samples, thereby improving the detection of both seen and unseen anomalies.
Extensive experiments demonstrate that AE-SAD consistently outperforms both standard Autoencoders and the most competitive deep learning techniques for semi-supervised anomaly detection, achieving state-of-the-art results. In particular, our method proves superior across a diverse set of benchmarks, including vectorial data, high-dimensional datasets, and image domains. Moreover, AE-SAD maintains its advantage even in challenging scenarios where the training data are polluted by anomalies that are incorrectly labeled as normal, further highlighting its robustness and practical applicability.
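As a rough illustration of the loss described above, the sketch below reconstructs normal samples toward themselves and labeled anomalies toward a transformed target F(x). The choice F(x) = 1 - x for inputs scaled to [0, 1] is an assumption made here for concreteness and may differ from the transformation adopted in the paper.

import torch
import torch.nn.functional as F

def ae_sad_style_loss(x, x_hat, y):
    """Sketch of a semi-supervised reconstruction loss in the spirit of AE-SAD.

    x:     (B, D) inputs scaled to [0, 1]
    x_hat: (B, D) autoencoder reconstructions
    y:     (B,)   labels, 0 = normal, 1 = labeled anomaly
    """
    # Normal samples keep their own reconstruction target; labeled anomalies
    # are pushed toward the (assumed) transformation F(x) = 1 - x.
    target = torch.where(y.unsqueeze(1) == 1, 1.0 - x, x)
    return F.mse_loss(x_hat, target)

With such a target, a well-trained Autoencoder drives the reconstruction error of anomalies up rather than down, which is the separation effect the abstract describes.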
Citations: 0
Diffbias: Harnessing diffusion models’ prediction bias for adversarial patch defense
IF 6.5 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-09 | DOI: 10.1016/j.neucom.2026.133009
Xudong Ye, Qi Zhang, Yapeng Wang, Xu Yang, Zuobin Ying, Jingzhang Sun, Qi Zhong, Xia Du
Adversarial patches pose a significant and real threat to deep neural networks, capable of inducing misclassification in realistic physical scenarios. Developing reliable and robust defense methods against these attacks is a critical need, yet current research remains unsatisfactory. In this paper, we propose a novel framework that exploits the fact that unnatural perturbations introduced by adversarial patches can produce prediction biases significantly different from those of clean images during denoising. In the localization stage, our method focuses on the critical denoising steps through an adaptive temporal sampling strategy and introduces an energy metric that fuses kinetic and potential energy to quantify the degree of anomaly in the denoised trajectory. Furthermore, by combining this with an adaptive similarity weighting mechanism and a striding trajectory consistency analysis, our method effectively suppresses interference from background noise, achieving accurate localization of the patch region. In the restoration phase, the same diffusion model is applied to the patch region to restore the original visual content and integrity. This two-stage architecture shares a unified diffusion model, enabling the localization and inpainting processes to enhance the overall defense performance through information complementarity. Extensive experiments on the INRIA, COCO2017, and APRICOT datasets show that our approach achieves state-of-the-art detection performance under both digital and physical attack types without compromising the recognition accuracy of clean images.
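The abstract does not state the energy metric explicitly, so the NumPy sketch below is only one plausible, assumed reading of a kinetic-plus-potential score over a denoising trajectory: the kinetic term measures how strongly the denoised estimate changes between adjacent selected timesteps, and the potential term measures how far the final estimate drifts from the observed image. The weighting alpha and both definitions are illustrative assumptions, not the paper's formulas.

import numpy as np

def trajectory_energy_score(trajectory, x_obs, alpha=0.5):
    """Illustrative per-pixel energy score over a denoising trajectory.

    trajectory: (S, H, W, C) denoised estimates at S selected timesteps
    x_obs:      (H, W, C)    observed (possibly patched) image
    Returns an (H, W) anomaly map.
    """
    # "Kinetic" term: squared change between consecutive denoising steps.
    kinetic = np.mean(np.sum(np.diff(trajectory, axis=0) ** 2, axis=-1), axis=0)
    # "Potential" term: squared deviation of the final estimate from the input.
    potential = np.sum((trajectory[-1] - x_obs) ** 2, axis=-1)
    return alpha * kinetic + (1.0 - alpha) * potential

Regions covered by an adversarial patch would be expected to score higher on such a map than clean background, which is the prediction-bias cue the method builds on.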
Citations: 0
CDadam: Central difference adam algorithm for physics-informed neural networks
IF 6.5 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-09 | DOI: 10.1016/j.neucom.2026.132969
Mengjia Zhao, Yuqiu Shen, Majid Ahmed Khan, Yuanzheng Lou, Fangdan Dai, Jiacheng Weng, Jianhong Wang
In deep learning, the accuracy of gradient estimation directly affects the convergence behavior of optimizers and the final performance of models. As a representative adaptive optimizer, Adam excels at handling sparse gradients, but its reliance on first-order gradient approximations makes it vulnerable to stochastic noise and one-sided estimation errors. These issues may slow down convergence or distort parameter updates. To address these limitations, we propose the central difference Adam algorithm (CDadam), which integrates central differences into Adam’s gradient computation process. We provide a theoretical analysis, and numerical simulations show that CDadam not only converges quickly but also attains high accuracy with global convergence ability. The CDadam algorithm is then applied to Physics-Informed Neural Networks (PINNs) for solving multiple partial differential equations. The results reveal that CDadam achieves higher accuracy and robustness than four other mainstream optimizers, demonstrating its effectiveness. The code for CDadam is available at https://github.com/LYZ-NTU/CDadam-algorithm/tree/main.
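The released code linked above is the reference implementation; purely to make the idea concrete, here is a small NumPy sketch of an Adam-style step whose gradient estimate comes from per-coordinate central differences, which is one literal reading of "integrating central differences into Adam's gradient computation". The finite-difference step h, the coordinate-wise loop, and the hyperparameters are illustrative assumptions and do not reproduce the released code.

import numpy as np

def central_difference_grad(loss_fn, theta, h=1e-4):
    """Estimate the gradient of loss_fn at theta with central differences."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e.flat[i] = h
        # (L(theta + h e_i) - L(theta - h e_i)) / (2h)
        grad.flat[i] = (loss_fn(theta + e) - loss_fn(theta - e)) / (2.0 * h)
    return grad

def cdadam_like_step(loss_fn, theta, m, v, t, lr=1e-3,
                     beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam-style update driven by a central-difference gradient."""
    g = central_difference_grad(loss_fn, theta)
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g
    m_hat = m / (1 - beta1 ** t)          # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)          # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

Unlike a one-sided forward difference, the central difference cancels the leading error term of the Taylor expansion, which matches the one-sided-estimation-error concern raised in the abstract.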
Citations: 0
ConsistEAE: Enhancing low-resource event argument extraction with linguistically consistent demonstrations
IF 6.5 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-09 | DOI: 10.1016/j.neucom.2026.133005
Yikai Guo, Xuemeng Tian, Bin Ge, Yuting Yang, Yao He, Wenjun Ke, Junjie Hu, Yanyang Li, Haoran Luo
In the low-resource event argument extraction (EAE) task, the scarcity of labeled data restricts the accurate identification of event arguments. Although in-context learning (ICL) has shown promising performance, it fails to ensure fine-grained semantic and syntactic consistency between the selected demonstrations and the target event texts. To address this, we propose ConsistEAE, a method that identifies demonstrations by integrating weighted measures of semantic and syntactic consistency. For semantic consistency, we propose a global-local interactive representation learning approach to capture fine-grained semantic information. For syntactic consistency, we introduce a syntactic alignment approach that constructs syntactic dependency trees and assesses the syntactic consistency between event texts using tree edit distance. Experimental results show that ConsistEAE outperforms existing state-of-the-art baselines on both ACE2005-EN and ACE2005-EN+ datasets, with improvements of 1.63% in Arg-I and 2.15% in Arg-C on the ACE2005-EN dataset, along with 1.27% in Arg-I and 2.29% in Arg-C on the ACE2005-EN+ dataset.
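A toy sketch of scoring a candidate demonstration by a weighted mix of semantic and syntactic consistency follows. The sentence encoder, the hypothetical tree_edit_distance helper, the distance-to-similarity mapping, and the weight alpha are all placeholders; the paper's global-local interactive representation learning is not reproduced here.

import numpy as np

def consistency_score(query_emb, demo_emb, query_tree, demo_tree,
                      tree_edit_distance, alpha=0.6):
    """Score a candidate demonstration against a target event text.

    query_emb, demo_emb: 1-D sentence embeddings from any encoder.
    query_tree, demo_tree: syntactic dependency trees of the two texts.
    tree_edit_distance: hypothetical helper returning a non-negative distance.
    """
    # Semantic consistency: cosine similarity of the sentence embeddings.
    sem = float(np.dot(query_emb, demo_emb) /
                (np.linalg.norm(query_emb) * np.linalg.norm(demo_emb) + 1e-12))
    # Syntactic consistency: map the tree edit distance into (0, 1].
    syn = 1.0 / (1.0 + tree_edit_distance(query_tree, demo_tree))
    return alpha * sem + (1.0 - alpha) * syn

Candidates would then be ranked by this score and the top few used as in-context demonstrations for the target event text.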
Citations: 0
SPEED: Structured kernel block pruning with filter groups for efficient and elastic SW-HW co-design in FPGA-based CNN accelerators
IF 6.5 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-07 | DOI: 10.1016/j.neucom.2026.132958
Kwanghyun Koo, Sunwoong Kim, Hyun Kim
On-device AI has received increasing attention due to its ability to provide personalized performance, reduce server load, and address privacy concerns. In this context, efforts have been made to deploy deep learning models on power-efficient hardware platforms, such as field-programmable gate arrays (FPGAs). Specifically, various pruning techniques have been devised to improve performance and reduce energy consumption. However, prior pruning methods fail to achieve balanced hardware utilization, which limits actual performance gains. This paper proposes SPEED, a hardware-aware structured pruning framework integrated into FPGA-based convolutional neural network (CNN) accelerators. SPEED introduces a novel processing unit (PU)-aware kernel block pruning technique for balanced computation across a PU array. Additionally, it proposes an adaptive kernel merging technique to minimize information loss during pruning. Experiments on ResNet18, ResNet50, and YOLACT using ImageNet and Pascal VOC2012 datasets show that SPEED achieves comparable accuracy to software-based pruning methods while achieving higher throughput and lower latency, validated on two types of processing elements. Specifically, for ResNet18, SPEED removes 57.9% of parameters and 44.6% of FLOPs with only a 0.91% drop in Top-1 accuracy, and for ResNet50, it removes 73.2% of parameters and 66.0% of FLOPs with a 1.20% drop in Top-1 accuracy. FPGA benchmarking results show that SPEED efficiently converts reductions in floating-point operations into actual speedups, with little increase in hardware resource usage. When deployed on an FPGA board, SPEED improves FPS by 42.2% and enhances power efficiency by 42.7% compared to the baseline. Case studies in CNN classification and instance segmentation models demonstrate the effectiveness of SPEED as a practical pruning solution for FPGA-based CNN accelerators.
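To illustrate the balancing idea only (not the paper's pruning criterion or hardware mapping), the PyTorch sketch below splits a convolution's output filters into groups sized to an assumed processing-unit (PU) array and zeroes the same fraction of lowest-magnitude k x k kernel blocks in every group, so each group retains an identical amount of work.

import torch

def pu_balanced_kernel_prune(weight, pu_group_size=8, keep_ratio=0.5):
    """Zero low-magnitude kernel blocks with equal sparsity per PU group.

    weight: conv weight of shape (out_ch, in_ch, k, k).
    Each group of pu_group_size output filters keeps the same number of
    kernel blocks, keeping the per-group workload balanced.
    """
    out_ch, in_ch, k, _ = weight.shape
    pruned = weight.clone()
    norms = weight.abs().sum(dim=(2, 3))             # (out_ch, in_ch) block scores
    for g in range(0, out_ch, pu_group_size):
        group = norms[g:g + pu_group_size]           # scores for one PU group
        n_keep = max(1, int(keep_ratio * group.numel()))
        thresh = group.flatten().topk(n_keep).values.min()
        mask = (group >= thresh).float()             # keep the largest blocks
        pruned[g:g + pu_group_size] *= mask.unsqueeze(-1).unsqueeze(-1)
    return pruned

Keeping the sparsity uniform across groups is what prevents some processing units from idling while others stay fully loaded, which is the utilization imbalance the abstract points to.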
Citations: 0
CaPGNN: Optimizing parallel graph neural network training with joint caching and resource-aware graph partitioning
IF 6.5 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-07 | DOI: 10.1016/j.neucom.2026.132978
Xianfeng Song, Yi Zou, Zheng Shi
Graph-structured data is ubiquitous in the real world, and Graph Neural Networks (GNNs) have become increasingly popular in various fields due to their ability to process such irregular data directly. However, as data scales grow, GNNs become inefficient. Although parallel training offers performance improvements, increased communication costs often offset these advantages. To address this issue, this paper introduces CaPGNN, a novel parallel full-batch GNN training framework for a single server with multiple GPUs. Firstly, given that the number of remote vertices in a partition is often greater than or equal to the number of local vertices and that many duplicate vertices may exist, we propose a joint adaptive caching algorithm that leverages both CPU and GPU memory, integrating lightweight cache update and prefetch techniques to effectively reduce redundant communication costs. Furthermore, taking into account the varying computational and communication capabilities among GPUs, we propose a communication- and computation-aware heuristic graph partitioning algorithm inspired by graph sparsification. Additionally, we implement a pipeline to overlap computation and communication. Extensive experiments show that CaPGNN improves training efficiency by up to 18.98x and reduces communication costs by up to 99%, with minimal accuracy loss and, in some cases, even accuracy improvement. Finally, we extend CaPGNN to multi-machine multi-GPU environments. The code is available at https://github.com/songxf1024/CaPGNN.
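A simplified sketch of the caching intuition: features of remote (boundary) vertices that are requested repeatedly are kept locally so that cross-device transfers are not repeated. The single-level, frequency-based policy below is an assumption; the paper's joint scheme uses both CPU and GPU memory with lightweight cache update and prefetch, which this sketch does not model.

from collections import Counter

class RemoteFeatureCache:
    """Toy frequency-based cache for remote vertex features."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = {}            # vertex id -> cached feature
        self.hits = Counter()      # access frequency per vertex id

    def get(self, vid, fetch_remote):
        """Return a cached feature, or fetch it and possibly cache it."""
        self.hits[vid] += 1
        if vid in self.store:
            return self.store[vid]
        feat = fetch_remote(vid)   # stands in for a cross-GPU/host transfer
        if len(self.store) < self.capacity:
            self.store[vid] = feat
        else:
            # Evict the least frequently requested entry if the new vertex
            # is requested more often than it.
            coldest = min(self.store, key=lambda v: self.hits[v])
            if self.hits[vid] > self.hits[coldest]:
                del self.store[coldest]
                self.store[vid] = feat
        return feat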
Citations: 0
Spiking neural P systems with brain-derived neurotrophic factor
IF 6.5 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-07 | DOI: 10.1016/j.neucom.2026.132990
Yue Zhao, Zhenjiao Lin, Feng Qi
Spiking Neural P systems (SN P systems), as a member of the third generation of artificial neural network models, have been widely studied in recent years due to their high scalability, high parallelism, and low energy consumption. However, the original SN P systems have limitations in nonlinear learning and dynamic plasticity. We introduce brain-derived neurotrophic factor (BDNF) into SN P systems for the first time, integrate BDNF and its signaling pathways, and enhance spike control and synaptic plasticity through four innovative rules, thereby simulating the role of BDNF in neuronal adaptation. We demonstrate that the system requires only 25 neurons to perform universal computation, fewer than the number of neurons used by existing SN P variants. Additionally, experimental evaluations on function approximation and image classification tasks confirm that the model achieves state-of-the-art performance. In the function fitting task, the system structure is further simplified through visual training and pruning strategies while maintaining efficient nonlinear computing power and high interpretability, fully demonstrating how the BDNF-SN P systems operate in actual tasks. In image classification, both comparative and ablation studies across MNIST and four MedMNIST datasets confirm the superiority of the 5B-BDNF-SN P family, with the 5B-Gram-SN P variant achieving up to 99.56% accuracy on MNIST and 97.87% AUC on PathMNIST. Furthermore, the model demonstrates high inference efficiency with fewer parameters, validating its effectiveness and adaptability in both general and medical image classification tasks.
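To make the underlying computational model concrete, here is a minimal, heavily simplified simulation of a spiking-rule step in an SN P-style neuron: a rule whose condition matches the current spike count consumes c spikes and sends p spikes along every outgoing synapse. Real SN P systems apply rules synchronously and nondeterministically across all neurons, and the BDNF-related rules introduced in the paper are not modeled here.

class SNPNeuron:
    """Very small SN P-style neuron: an integer spike store plus firing rules.

    Each rule is (condition, consume, produce): the rule is applicable when
    condition(spikes) is True; firing removes `consume` spikes and sends
    `produce` spikes to every downstream neuron.
    """

    def __init__(self, spikes, rules):
        self.spikes = spikes
        self.rules = rules
        self.out = []              # downstream SNPNeuron objects

    def step(self):
        """Apply the first applicable rule (a simplification of nondeterminism)."""
        for condition, consume, produce in self.rules:
            if condition(self.spikes) and self.spikes >= consume:
                self.spikes -= consume
                for neuron in self.out:
                    neuron.spikes += produce
                return True
        return False

# A neuron that fires one spike whenever it holds an even, positive spike count.
n1 = SNPNeuron(spikes=4, rules=[(lambda s: s > 0 and s % 2 == 0, 2, 1)])
n2 = SNPNeuron(spikes=0, rules=[])
n1.out.append(n2)
n1.step()   # n1 now holds 2 spikes, n2 holds 1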
Citations: 0
Explainable artificial intelligence with Boolean rule-aware predictions in ridge regression models
IF 6.5 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-07 | DOI: 10.1016/j.neucom.2026.132991
Seyed Amir Malekpour, Hamid Pezeshk
Recent artificial intelligence (AI) systems, including deep neural networks (DNNs), have become increasingly complex and less interpretable. We propose a model named Regression-Based Boolean Rule Inference, RBBR, that is understandable to humans. By transforming input features into multiple conjunctions, RBBR fits a ridge regression model to the conjunctions and target variable data and derives the Boolean rule set from conjunctions with a positive weight sign in the model. Moreover, for high-dimensional datasets, a strategy is presented to derive Boolean sub-rules from regression sub-models fitted to specific feature subsets. The Bayesian Information Criterion (BIC) is employed to rank the fitted models and associated Boolean rules, striking a balance between interpretability and accuracy. Additionally, a Bayesian framework is proposed for predicting the target class of new datapoints based on top-ranked Boolean rules selected by BIC. By considering the combinatorial interactions among input features, RBBR offers a robust feature selection strategy, surpassing decision trees. Experiments conducted on datasets with low sample sizes reveal that RBBR exhibits data efficiency. Our approach for Boolean rule inference from regression models is compatible with the learning structure of black-box models like DNNs, enabling the interpretation of parameter sets or neurons using Boolean rules.
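A compact sketch of the described pipeline on binary inputs: expand the features with pairwise conjunctions, fit a ridge regression, read off the conjunctions that receive a positive weight as candidate Boolean rules, and score the fitted model with a BIC computed under an assumed Gaussian error model. The conjunction order, the ridge penalty, and this particular BIC form are illustrative choices, not necessarily those of the paper.

import numpy as np
from itertools import combinations
from sklearn.linear_model import Ridge

def fit_boolean_rules(X, y, alpha=1.0):
    """X: (n, d) binary feature matrix; y: (n,) target. Returns (rules, BIC)."""
    n, d = X.shape
    # Build all pairwise conjunctions x_i AND x_j as extra binary columns.
    pairs = list(combinations(range(d), 2))
    conj = (np.column_stack([X[:, i] * X[:, j] for i, j in pairs])
            if pairs else np.empty((n, 0)))
    Z = np.hstack([X, conj])

    model = Ridge(alpha=alpha).fit(Z, y)
    # Conjunctions (and single features) with a positive weight sign form
    # the candidate rule set.
    names = [f"x{i}" for i in range(d)] + [f"x{i} AND x{j}" for i, j in pairs]
    rules = [name for name, w in zip(names, model.coef_) if w > 0]

    # BIC under a Gaussian error model: n * log(RSS / n) + k * log(n).
    rss = float(np.sum((y - model.predict(Z)) ** 2))
    k = Z.shape[1] + 1
    bic = n * np.log(rss / n + 1e-12) + k * np.log(n)
    return rules, bic

For high-dimensional data, the same routine would be run on regression sub-models fitted to specific feature subsets and the resulting sub-rules ranked by BIC, as the abstract describes.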
Citations: 0
MPFNet: Mamba-driven progressive fusion network for RAW-RGB collaborative demoiréing
IF 6.5 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-07 | DOI: 10.1016/j.neucom.2026.133019
Lieqiang Yang, Li Yu, Wang Zhang, Chengyan Deng, Jianqin Liu
With the development of smartphones and display technologies, screen-captured images have become an indispensable means of recording information. However, moiré patterns, generated due to the aliasing effect between the Color Filter Array (CFA) and screen display pixels, severely degrade image quality. Existing demoiré methods suffer from issues such as significant loss of original information in RGB images, limited receptive field range, and high computational complexity, leading to incomplete removal of moiré patterns. To address these limitations, we propose a Mamba-Driven Progressive Fusion Network (MPFNet) for RAW-RGB Collaborative Demoiréing. The MPFNet fully leverages RAW data (which retains richer original information) and RGB data (which provides guidance during RAW-to-RGB conversion), while harnessing the global receptive field attention enabled by Mamba’s linear computational complexity, thereby achieving low-color-difference moiré removal. The MPFNet adopts a two-stage architecture: In the first stage, a Simple Demoiré Block (SDB) performs shallow demoiréing on RAW data while extracting multi-scale RAW features. In the second stage, the dual-path adaptive feature fusion (DAFF) module is used to progressively fuse multi-scale RAW and RGB features, and then the DemoiréMamba Block (DMB) is used to achieve deep moiré removal and accurate color restoration. Extensive experiments on TMM22, RAWVDemoiré and FHDMI datasets demonstrate that MPFNet achieves state-of-the-art performance in both quantitative metrics and qualitative visual comparisons, while maintaining relatively low FLOPs. For instance, MPFNet achieves a PSNR of 28.86 dB on the TMM22 dataset, which is 0.51 dB higher than previous methods, and it also has lower GFLOPs.
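As a rough PyTorch sketch of a gated dual-path fusion block in the spirit of the DAFF module, the code below concatenates RAW-branch and RGB-branch feature maps, lets a learned per-pixel gate mix the two paths, and refines the result with a convolution. The layer sizes and the sigmoid gating form are assumptions, not the published module.

import torch
import torch.nn as nn

class DualPathFusion(nn.Module):
    """Illustrative gated fusion of RAW-branch and RGB-branch feature maps."""

    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.refine = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, feat_raw, feat_rgb):
        # Per-pixel, per-channel mixing weight between the two paths.
        g = self.gate(torch.cat([feat_raw, feat_rgb], dim=1))
        fused = g * feat_raw + (1.0 - g) * feat_rgb
        return self.refine(fused)

# Usage: fuse two (B, C, H, W) feature maps of matching shape.
# fusion = DualPathFusion(channels=64)
# out = fusion(raw_feats, rgb_feats)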
Citations: 0