Transformers are widely used for their ability to capture data relations in sequence processing, with great success on a wide range of static tasks. However, the computational and memory footprint of their main component, the Scaled Dot-product Attention, is commonly overlooked. This makes their adoption infeasible in applications involving stream data processing with constraints on response latency and on computational and memory resources. Prior work has lowered the computational cost of Transformers by using low-rank approximations, sparsity in attention, and efficient formulations for Continual Inference. In this paper, we introduce a new formulation of the Scaled Dot-product Attention based on the Nyström approximation that is suitable for Continual Inference. In experiments on Online Audio Classification and Online Action Detection tasks, the proposed Continual Scaled Dot-product Attention lowers the number of operations by up to three orders of magnitude compared to the original Transformers while retaining the predictive performance of competing models.
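The core idea can be illustrated with a minimal NumPy sketch of Nyström-approximated attention (not the paper's continual formulation): three small maps over m landmark points stand in for the full n × n attention map. The segment-mean landmark choice and all dimensions below are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Standard Scaled Dot-product Attention: the n x n map is quadratic in n.
    d = Q.shape[-1]
    return softmax(Q @ K.T / np.sqrt(d)) @ V

def nystrom_attention(Q, K, V, m=8):
    # Nystrom approximation with m landmarks (segment means, an illustrative
    # choice): maps of size n x m, m x m, and m x n replace the n x n map.
    d, n = Q.shape[-1], Q.shape[0]
    segs = np.array_split(np.arange(n), m)
    Q_l = np.stack([Q[s].mean(axis=0) for s in segs])  # m landmark queries
    K_l = np.stack([K[s].mean(axis=0) for s in segs])  # m landmark keys
    F = softmax(Q @ K_l.T / np.sqrt(d))                # n x m
    A = softmax(Q_l @ K_l.T / np.sqrt(d))              # m x m
    B = softmax(Q_l @ K.T / np.sqrt(d))                # m x n
    return F @ np.linalg.pinv(A) @ (B @ V)

rng = np.random.default_rng(0)
n, d = 64, 16
Q, K, V = rng.normal(size=(3, n, d))
exact = attention(Q, K, V)
approx = nystrom_attention(Q, K, V, m=8)
```

With m fixed, the approximation's cost grows linearly in the sequence length n instead of quadratically.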
Title: Continual low-rank scaled dot-product attention
Authors: Ginés Carreto Picón, Illia Oleksiienko, Lukas Hedegaard, Arian Bakhtiarnia, Alexandros Iosifidis
Pub Date: 2025-12-24; DOI: 10.1016/j.neunet.2025.108517
Neural Networks, vol. 197, Article 108517
Pub Date: 2025-12-24; DOI: 10.1016/j.neunet.2025.108515
Yiman Hu, Yixiong Zou, Xiaosen Wang, Yuhua Li, Kun He, Ruixuan Li
Few-Shot Learning (FSL) enables models to learn from just a few examples of new classes by leveraging knowledge from base classes. While FSL has made significant strides, its vulnerability to adversarial attacks, especially with limited data, has been overlooked. To address this, adversarial training is often used to build more robust models. However, we found that this approach can lead the model to memorize adversarial noise, which harms its ability to generalize. Hard labels exacerbate this issue by pushing the model toward perfect accuracy on adversarial examples, while also making it less robust to small changes in weights. To solve these problems, we propose Alleviation of Noise Memorization (ANM), a method that includes Adaptive Label Smoothing for more flexible supervision and Robust Weight Learning to enhance model stability. Our extensive experiments show that ANM effectively reduces noise memorization and improves generalization, outperforming existing baselines.
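A sketch of label smoothing with a per-example smoothing factor, the kind of flexible supervision the abstract contrasts with hard labels. The confidence-based rule for choosing alpha below is our assumption, not the paper's exact mechanism.

```python
import numpy as np

def smoothed_cross_entropy(logits, targets, alpha):
    # Cross-entropy against smoothed targets: 1 - alpha on the true class,
    # alpha spread uniformly over the other classes; alpha is per-example.
    n, c = logits.shape
    logp = logits - logits.max(axis=1, keepdims=True)
    logp = logp - np.log(np.exp(logp).sum(axis=1, keepdims=True))
    soft = np.broadcast_to((alpha / (c - 1))[:, None], (n, c)).copy()
    soft[np.arange(n), targets] = 1.0 - alpha
    return float(-(soft * logp).sum(axis=1).mean())

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 10))
targets = np.array([1, 3, 5, 7])
# assumed adaptive rule (illustrative): smooth more when confidence is high,
# discouraging the model from fitting adversarial noise perfectly
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
alpha = 0.1 * probs[np.arange(4), targets]
loss_hard = smoothed_cross_entropy(logits, targets, np.zeros(4))  # hard labels
loss_soft = smoothed_cross_entropy(logits, targets, alpha)
```

Setting alpha to zero recovers ordinary cross-entropy with hard labels.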
Title: Alleviating noise memorization for adversarially robust few-shot learning
Neural Networks, vol. 197, Article 108515
Pub Date: 2025-12-24; DOI: 10.1016/j.neunet.2025.108498
Xiaoning Li, Min Guo, Qiancheng Yu, Kaiguang Wang, Cai Dai, Zhiqiang Li
Current ensemble regression learning (ERL) faces two primary limitations: first, traditional fusion strategies overlook the intrinsic complexity of the ensemble’s group decision-making (GDM) process; second, the pursuit of diversity exacerbates ensemble instability. These dual constraints have collectively caused ERL research to stagnate at the applied level, impeding further theoretical breakthroughs. In response, this paper proposes a novel ERL framework with decision balance (DBERL), designed to overcome these limitations through a research paradigm that integrates GDM with decision-balanced structures. Specifically, DBERL models the GDM process in traditional ERL as a decision-balanced network (DBN), clarifying both individual-level and group-level decision-making paradigms. Within this network, individuals are adaptively clustered based on task characteristics, thereby forming both narrowly and broadly balanced structures. A hierarchical balanced attention mechanism (HBA) is introduced to aggregate the decision influences of individuals within these structures. Finally, a phased feedback mechanism is incorporated to further promote consensus within the ensemble. The performance of DBERL, including the effectiveness of its internal modules, was rigorously validated across nine diverse datasets, encompassing various application domains, data volumes, and feature dimensions. The results indicate that among 45 evaluation metrics across all datasets, DBERL ranked first in 80% of comparisons against 11 baseline models, in 82.2% of comparisons against 12 ensemble strategies, and in 64% of comparisons against two other balance structures. Based on evaluation results across six dimensions, including fitting capability, correlation, interpretability, stability, data sensitivity, and generalization ability, DBERL achieved the top rank in statistical testing.
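As a toy illustration of giving ensemble members unequal decision influence, the sketch below weights members by a softmax over their negative validation errors. This fixed rule is our assumption; DBERL's hierarchical balanced attention is learned and far richer.

```python
import numpy as np

def balanced_fusion(preds, val_errors, tau=1.0):
    # Weight each member by softmax(-validation error / tau), so accurate
    # members carry more decision influence in the fused regression output.
    # Illustrative stand-in only, not DBERL's HBA mechanism.
    w = np.exp(-np.asarray(val_errors, dtype=float) / tau)
    w = w / w.sum()
    return w @ np.asarray(preds, dtype=float), w

preds = [np.array([1.0, 2.0]),   # member 1
         np.array([1.2, 1.8]),   # member 2
         np.array([5.0, 5.0])]   # poorly calibrated member
fused, w = balanced_fusion(preds, val_errors=[0.1, 0.2, 2.0])
```

The outlier member's influence is down-weighted rather than discarded, keeping the ensemble's decision balanced.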
Title: An enhancing framework with an emphasis on decision balance in ensemble regression
Neural Networks, vol. 197, Article 108498
Pub Date: 2025-12-24; DOI: 10.1016/j.neunet.2025.108497
Junbin Fang, Yixuan Shen, Yujing Sun, Canjian Jiang, You Jiang, Hezhong Pan, Siu-Ming Yiu, Zoe L. Jiang
Traffic sign recognition systems are crucial for autonomous driving safety. However, their susceptibility to adversarial attacks poses severe risks, potentially leading to catastrophic accidents. The purpose of adversarial attack research is to identify vulnerabilities in these systems, thereby improving understanding of and response to such security threats. Unlike prior adversarial attacks, which are typically invasive, conspicuous, and impractical, our proposed attack (LIMA) operates non-invasively while remaining stealthy to human observers. Specifically, we exploit high-speed modulation of LED illumination and the rolling shutter mechanism of CMOS sensors to create imperceptible perturbations. By adjusting the LED flicker frequency, we effectively conduct denial-of-service and evasion attacks. Extensive evaluations in both simulations and real-world scenarios confirm LIMA's effectiveness, with a 100% success rate across most distance-angle combinations and a 69.67% success rate even against defense models.
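The rolling-shutter mechanism the abstract exploits can be sketched numerically: rows are exposed sequentially, so a flickering LED imprints horizontal bands. All parameter values below (row readout time, flicker frequency, duty cycle) are illustrative assumptions, not the paper's measured settings.

```python
import numpy as np

def rolling_shutter_rows(f_led, n_rows=480, t_row=50e-6, duty=0.5):
    # A rolling-shutter sensor exposes rows sequentially, t_row seconds apart.
    # An LED flickering at f_led Hz therefore imprints horizontal bands with
    # a spatial period of 1 / (f_led * t_row) rows, invisible to the human
    # eye (which integrates the fast flicker) but visible to the camera.
    t = np.arange(n_rows) * t_row          # start time of each row readout
    phase = (t * f_led) % 1.0
    return (phase < duty).astype(float)    # 1.0 = LED on while the row reads

rows = rolling_shutter_rows(f_led=1000.0)  # band period: 20 rows
```

Changing `f_led` changes the band spacing, which is the knob the attack tunes.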
Title: LIMA: Towards building a non-invasive and stealthy real-world adversarial attack model for traffic sign recognition systems
Neural Networks, vol. 197, Article 108497
Pub Date: 2025-12-24; DOI: 10.1016/j.neunet.2025.108523
Shanshan Du, Hanli Wang
As a crucial subfield of intelligent human-machine interaction, visual dialog involves answering multi-turn questions based on visual content and dialog history, presenting significant technical challenges. Although recent works have made steady progress in visual dialog, several issues remain to be addressed. First, there are bias issues in fine-grained multimodal modeling, including information asymmetry and representation inconsistency, which lead to incomplete information understanding and decision-making biases during question answering. Second, previous visual dialog models relying on external knowledge suffer from poor knowledge quality and insufficient knowledge diversity, which introduce noise into the model and undermine the accuracy and coherence of the question responses. In this work, a novel semantic consistency visual dialog model enhanced by external knowledge (SCVD+) is proposed to cope with these challenges. Specifically, fine-grained structured visual and textual scene graphs are constructed to mitigate the issue of information asymmetry; they prioritize linguistic and visual elements equally, ensuring a comprehensive capture of object relationships in images and word associations in the dialog history. Furthermore, beneficial external knowledge sourced from a commonsense knowledge base is integrated to alleviate the representation inconsistency in multimodal scene graphs and to improve the model's interpretability. Finally, implicit clues are derived from pre-trained large models and integrated with explicit information from scene graphs using a proposed dual-level knowledge fusion and reasoning strategy, which ensures the diversity of external knowledge and enhances the model's reasoning capability in complex scenarios. Experimental results demonstrate the effectiveness of our method on the public datasets VisDial v0.9, VisDial v1.0, and OpenVisDial 2.0.
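Scene graphs of the kind the abstract describes are often represented as (subject, relation, object) triples. The toy entities and the union-based fusion below are illustrative assumptions, far simpler than SCVD+'s dual-level fusion and reasoning strategy.

```python
# Minimal triple-based scene graphs standing in for the paper's structures.
visual_sg = {("man", "holds", "umbrella"), ("umbrella", "above", "dog")}
text_sg = {("man", "holds", "umbrella"), ("dog", "is", "wet")}  # from dialog
knowledge = {("umbrella", "used_for", "rain protection")}  # commonsense triple

def fuse(*graphs):
    # Union the triple sets; facts duplicated across modalities collapse,
    # while external knowledge adds relations neither modality contains.
    fused = set()
    for g in graphs:
        fused |= g
    return fused

fused = fuse(visual_sg, text_sg, knowledge)
```

The shared triple ("man", "holds", "umbrella") appears once in the fused graph, which is the consistency property the paper builds on.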
Title: Visual dialog with semantic consistency: An external knowledge-driven approach
Neural Networks, vol. 197, Article 108523
Pub Date: 2025-12-23; DOI: 10.1016/j.neunet.2025.108506
Chunxiao Fan, Jintao Li, Zhongqian Zhang, Fu Li, Bo Wang
To reduce the storage and computational complexity of neural network models, various model compression techniques have been proposed in recent years, including pruning and quantization. However, due to the lack of interconnection among different types of methods, it is difficult to effectively integrate the advantages of these diverse techniques. This paper proposes a novel two-phase collaborative training framework for joint pruning and quantization to achieve synergistic optimization of multiple compression techniques. The framework combines pruning and quantization operations and consists of two phases: a collaborative constraint pre-compression phase and a post-training compression refinement phase. In the collaborative constraint pre-compression phase, a novel unified constraint loss function is designed to keep weights close to quantization values, and sparse regularization is utilized to automatically learn the network structure for pruning. This effectively combines pruning and quantization operations, avoiding the potential negative impacts of implementing them separately. By calculating the difference between the current parameter values and the target quantization values, quantization errors are reduced through iterative optimization during training, moving the parameters closer to the selected 2^n values. The pruned network has a regular structure, and quantization to 2^n values makes it highly suitable for hardware implementation, since multiplication can be realized with a shifter. In the post-training compression refinement phase, joint compression operations including channel pruning and low-bit quantization are completed. Experimental results on benchmark datasets such as MNIST, CIFAR-10, and CIFAR-100 show that the framework produces more compact network parameters while maintaining considerable accuracy, demonstrating excellent effectiveness in terms of compression ratio and accuracy. The proposed framework integrates the complementary aspects of quantization and pruning, and effectively minimizes the possible adverse interactions between them.
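Quantizing weights to powers of two, as the abstract motivates for shifter-based hardware, can be sketched as rounding in the log domain. The specific weights and the squared-gap measure below are illustrative; the paper's unified constraint loss is more elaborate.

```python
import numpy as np

def quantize_pow2(w, eps=1e-12):
    # Round each weight to the nearest signed power of two (nearest in the
    # log domain), so multiplication reduces to a bit shift in hardware.
    sign = np.sign(w)
    mag = np.maximum(np.abs(w), eps)  # guard log2 against zeros
    return sign * 2.0 ** np.round(np.log2(mag))

w = np.array([0.30, -0.9, 1.7, 0.06])
q = quantize_pow2(w)                # -> [0.25, -1.0, 2.0, 0.0625]
gap = float(np.mean((w - q) ** 2))  # the kind of gap a constraint loss shrinks
```

During training, penalizing `gap` pulls weights toward their 2^n targets, so the final hard quantization step loses little accuracy.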
Title: Two-phase collaborative model compression training for joint pruning and quantization
Neural Networks, vol. 197, Article 108506
Pub Date: 2025-12-23; DOI: 10.1016/j.neunet.2025.108510
Chengwan You, Wenxu Shi, Guibin Hu, Bochuan Zheng
Low-light image enhancement aims to improve the brightness, contrast, and structural details of degraded images, thus improving image quality and supporting visual perception tasks. However, existing methods often suffer from inaccurate illumination adjustment, amplified noise, and structural loss. To address these issues, we propose a Lightweight Illumination Iterative Adjustment Network (LIIA-Net) that jointly processes images in both the frequency and spatial domains. First, a linear cross-attention module fuses illumination and content features. Then, an amplitude adaptive iterative adjustment module adaptively regulates brightness in the frequency domain. Finally, a Mamba-based structure refinement module restores spatial textures. Despite having only 0.48M parameters, LIIA-Net achieves performance comparable to or even surpassing state-of-the-art methods on both real and synthetic datasets. Moreover, when applied to downstream object detection, our enhanced images significantly boost detection accuracy. The code is available at https://github.com/ycwsilent/LIIANet.
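Regulating brightness through the Fourier amplitude, as the abstract describes, can be sketched as a decompose-scale-recompose pipeline. The single global gain below is an illustrative assumption; the paper's module adapts the adjustment iteratively per image and per band.

```python
import numpy as np

def amplitude_adjust(img, gain=1.8):
    # Split the image into Fourier amplitude and phase, scale the amplitude,
    # and recompose: brightness changes while structure (the phase) is
    # untouched. A uniform gain is a stand-in for a learned, adaptive one.
    F = np.fft.fft2(img)
    amp, phase = np.abs(F), np.angle(F)
    out = np.fft.ifft2(gain * amp * np.exp(1j * phase)).real
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(0)
dark = rng.uniform(0.0, 0.3, size=(32, 32))  # synthetic low-light patch
bright = amplitude_adjust(dark, gain=1.8)
```

A uniform gain reduces to global scaling; the benefit of working in the frequency domain comes from applying different gains to different bands, which the paper learns.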
Title: LIIA-Net: A lightweight illumination iterative adjustment network for low-light image enhancement
Neural Networks, vol. 197, Article 108510
Pub Date: 2025-12-23; DOI: 10.1016/j.neunet.2025.108516
Zhi-Song Liu, Roland Maier, Andreas Rupp
Finite element methods typically require a high resolution to satisfactorily approximate micro and even macro patterns of an underlying physical model. This issue can be circumvented by appropriate multiscale strategies that are able to obtain reasonable approximations on under-resolved scales. In this paper, we study implicit neural representations and propose a continuous super-resolution network as a correction strategy for multiscale effects. It can take coarse finite element data and learn both in-distribution and out-of-distribution high-resolution finite element predictions. Our highlight is the design of a local implicit transformer, which is able to learn multiscale features. We also propose Gabor wavelet-based coordinate encodings, which can overcome the bias of neural networks toward learning low-frequency features. Finally, perception is often preferred over distortion, so that scientists can recognize visual patterns for further investigation. However, implicit neural representations are known for their lack of local pattern supervision. We propose to use stochastic cosine similarities to compare local feature differences between prediction and ground truth, which yields better structural alignment. Our experiments show that the proposed strategy achieves superior performance as an in-distribution and out-of-distribution super-resolution strategy.
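The stochastic cosine-similarity idea can be sketched directly: sample random local patches and compare prediction against ground truth by cosine similarity, so supervision targets local structure rather than raw pixel error. Patch count and size below are illustrative assumptions.

```python
import numpy as np

def stochastic_cosine_loss(pred, gt, n_patches=16, size=5, seed=0):
    # Sample random local patches and average their cosine dissimilarity.
    # This rewards structural alignment instead of pointwise closeness.
    rng = np.random.default_rng(seed)
    h, w = pred.shape
    sims = []
    for _ in range(n_patches):
        y = int(rng.integers(0, h - size))
        x = int(rng.integers(0, w - size))
        a = pred[y:y + size, x:x + size].ravel()
        b = gt[y:y + size, x:x + size].ravel()
        sims.append(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))
    return 1.0 - float(np.mean(sims))  # ~0 when local structures align

rng = np.random.default_rng(1)
gt = rng.uniform(0.1, 1.0, size=(32, 32))
pred = rng.uniform(0.1, 1.0, size=(32, 32))
loss_same = stochastic_cosine_loss(gt, gt)   # near zero
loss_diff = stochastic_cosine_loss(pred, gt)
```

Because patch locations are resampled each call, the supervision is stochastic, which is what the abstract's name refers to.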
Title: Multiscale corrections by continuous super-resolution
Neural Networks, vol. 197, Article 108516
Pub Date: 2025-12-23; DOI: 10.1016/j.neunet.2025.108491
Ruonan Liu, Guangdeng Zong, Xudong Zhao, Wencheng Wang
This paper investigates the neural network-based practical prescribed time adaptive tracking control problem for strict feedback nonlinear networked control systems with deception attacks in both sensors and actuators. To reduce the detrimental effects of deception attacks, an attack compensator is constructed based on compromised states and neural network technique. Then, a practical prescribed time function is introduced such that the tracking error does not violate the constraint boundary within the prescribed time, which ensures the transient and steady-state performances of the closed-loop system. Besides, the first-order sliding mode differentiator is designed to estimate the derivation of the virtual control laws, which eliminates the “complexity explosion”. Mathematically, it is demonstrated that all the signals in the closed-loop system are bounded, and the tracking error converges to a predetermined boundary within a prescribed time. Eventually, a numerical example and an application example of the single-link robotic arm system are adopted to exhibit the effectiveness of the acquired control algorithm.
{"title":"Neural network-based practical prescribed time adaptive tracking control for nonlinear networked control systems under deception attacks","authors":"Ruonan Liu , Guangdeng Zong , Xudong Zhao , Wencheng Wang","doi":"10.1016/j.neunet.2025.108491","DOIUrl":"10.1016/j.neunet.2025.108491","url":null,"abstract":"<div><div>This paper investigates the neural network-based practical prescribed time adaptive tracking control problem for strict-feedback nonlinear networked control systems with deception attacks in both sensors and actuators. To reduce the detrimental effects of deception attacks, an attack compensator is constructed based on the compromised states and neural network techniques. Then, a practical prescribed time function is introduced such that the tracking error does not violate the constraint boundary within the prescribed time, which ensures the transient and steady-state performance of the closed-loop system. Besides, a first-order sliding mode differentiator is designed to estimate the derivatives of the virtual control laws, which eliminates the “complexity explosion”. Mathematically, it is demonstrated that all the signals in the closed-loop system are bounded, and the tracking error converges to a predetermined boundary within a prescribed time. Finally, a numerical example and an application example of a single-link robotic arm system are adopted to demonstrate the effectiveness of the proposed control algorithm.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"197 ","pages":"Article 108491"},"PeriodicalIF":6.3,"publicationDate":"2025-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145885118","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
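The first-order sliding mode differentiator mentioned in the abstract is, in its standard form (Levant's robust exact differentiator), a simple two-state system; the paper's exact gains and discretization are not given here, so the following Euler-discretized sketch uses illustrative gains rather than the authors' values:

```python
import math

def sgn(x):
    """Sign function returning -1, 0, or 1."""
    return (x > 0) - (x < 0)

def levant_diff(f, t_end, dt=1e-3, lam1=2.0, lam2=2.0):
    """First-order sliding-mode differentiator (Euler-discretized):
        e   = z0 - f(t)
        z0' = z1 - lam1 * |e|^(1/2) * sgn(e)
        z1' = -lam2 * sgn(e)
    z1 converges to f'(t) in finite time provided lam2 exceeds a bound
    on |f''|; returns a list of (t, z1) samples."""
    z0, z1 = 0.0, 0.0
    out = []
    for k in range(int(t_end / dt)):
        t = k * dt
        e = z0 - f(t)
        z0 += dt * (z1 - lam1 * math.sqrt(abs(e)) * sgn(e))
        z1 += -dt * lam2 * sgn(e)
        out.append((t, z1))
    return out
```

For instance, differentiating sin(t) should yield an estimate that tracks cos(t) after the finite-time transient; in the backstepping setting of the paper, f would be a virtual control law rather than a known signal.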
Pub Date: 2025-12-22, DOI: 10.1016/j.neunet.2025.108502
Yuheng Zong , Huaiping Jin , Hao Fang , Chai Hu , Jiashuo Shi
Dynamic spatial distortions in coherent light beams present a major challenge for stable and high-fidelity optical field shaping, particularly when the target output is a color-resolved pattern. Existing beam shaping techniques, including recent diffractive optical neural networks, are typically limited to monochromatic or grayscale targets and struggle to generalize across temporally varying, multi-distorted inputs. In this work, we propose a diffractive-electronic hybrid neural network tailored for real-time, color light field shaping. To improve spectral generalization, we introduce a wavelength-aware virtual branching (WAVB) mechanism during training, enabling the network to adaptively learn wavelength-specific shaping strategies without modifying the physical design. On the electronic side, we integrate a spectrally conditioned U-shape network, which is structurally adapted to preserve inter-channel dependencies. We implement frequency-selective skip connections (FSSC), allowing the network to emphasize mid- and high-frequency feature restoration while avoiding overcompensation in low-frequency regions. Additionally, we introduce an all-optical-driven optical flow prediction module, enabling frame-to-frame tracking and reverse inference of the beam’s evolution, thus enhancing temporal coherence. Our system achieves real-time operation at 50 Hz, delivering robust, frame-stable color light field shaping across a range of spatial and temporal distortion scenarios. This work provides a task-specific, scalable framework for intelligent, adaptive imaging systems, with promising applications in dynamic holography, laser-based displays, and computational optical imaging.
{"title":"Color-resolved light field shaping via diffractive-electronic U-shape network with wavelength-aware virtual branching","authors":"Yuheng Zong , Huaiping Jin , Hao Fang , Chai Hu , Jiashuo Shi","doi":"10.1016/j.neunet.2025.108502","DOIUrl":"10.1016/j.neunet.2025.108502","url":null,"abstract":"<div><div>Dynamic spatial distortions in coherent light beams present a major challenge for stable and high-fidelity optical field shaping, particularly when the target output is a color-resolved pattern. Existing beam shaping techniques, including recent diffractive optical neural networks, are typically limited to monochromatic or grayscale targets and struggle to generalize across temporally varying, multi-distorted inputs. In this work, we propose a diffractive-electronic hybrid neural network tailored for real-time, color light field shaping. To improve spectral generalization, we introduce a wavelength-aware virtual branching (WAVB) mechanism during training, enabling the network to adaptively learn wavelength-specific shaping strategies without modifying the physical design. On the electronic side, we integrate a spectrally conditioned U-shape network, which is structurally adapted to preserve inter-channel dependencies. We implement frequency-selective skip connections (FSSC), allowing the network to emphasize mid- and high-frequency feature restoration while avoiding overcompensation in low-frequency regions. Additionally, we introduce an all-optical-driven optical flow prediction module, enabling frame-to-frame tracking and reverse inference of the beam’s evolution, thus enhancing temporal coherence. Our system achieves real-time operation at 50 Hz, delivering robust, frame-stable color light field shaping across a range of spatial and temporal distortion scenarios. This work provides a task-specific, scalable framework for intelligent, adaptive imaging systems, with promising applications in dynamic holography, laser-based displays, and computational optical imaging.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"197 ","pages":"Article 108502"},"PeriodicalIF":6.3,"publicationDate":"2025-12-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145841925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
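The frequency-selective skip connections (FSSC) above are described only at a high level; one plausible sketch — damping the low-frequency content of a skip feature map in the Fourier domain so that mid- and high-frequency detail dominates restoration — could look like the following. The function name, cutoff, and gain values are illustrative assumptions, not the authors' design:

```python
import numpy as np

def frequency_selective_skip(feat, cutoff=0.1, high_gain=1.0, low_gain=0.3):
    """Reweight a 2-D feature map in the Fourier domain: attenuate
    frequencies whose normalized radius falls below cutoff * Nyquist,
    and pass mid/high frequencies at full gain."""
    h, w = feat.shape
    fy = np.fft.fftfreq(h)[:, None]          # per-axis normalized frequencies
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fy**2 + fx**2)          # radial frequency, 0 at DC
    mask = np.where(radius < cutoff * 0.5, low_gain, high_gain)
    spec = np.fft.fft2(feat)
    return np.real(np.fft.ifft2(spec * mask))
```

A hard mask keeps the sketch short; a smooth (e.g. raised-cosine) transition between the two gains would avoid ringing in a real skip path.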