Data-driven robust state estimation based on EK-SVSF
Pub Date: 2026-02-04 | DOI: 10.1016/j.neucom.2026.132869 | Neurocomputing, Volume 675, Article 132869
Meng Liu, Xiao He
This paper introduces a novel extension to the Extended Kalman-based Smooth Variable Structure Filter (EK-SVSF), a hybrid state estimation framework that integrates the Extended Kalman Filter (EKF) with the Smooth Variable Structure Filter (SVSF). Tailored for nonlinear systems subject to model uncertainties and external disturbances, EK-SVSF enhances estimation accuracy by leveraging the complementary strengths of its constituent filters. Nonetheless, the efficacy of EK-SVSF hinges critically on the selection of an appropriate width for the smoothing boundary layer (SBL); suboptimal values—either excessively large or small—can substantially impair filtering performance. Compounding this issue, inherent model uncertainties render the determination of an optimal SBL a formidable and enduring challenge. To mitigate this, we propose a data-driven methodology that autonomously extracts salient features from the smoothing boundary function, thereby resolving the parameter tuning dilemma under model uncertainty. Furthermore, to refine the associated multi-loss weighted aggregation, we incorporate an adaptive weighting scheme based on the coefficient of variation, enabling dynamic optimization. Empirical evaluations demonstrate that the proposed approach yields robust and resilient state estimation outcomes, even in the presence of significant model discrepancies.
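The abstract does not spell out the coefficient-of-variation weighting, but a common instantiation weights each loss term by the relative variability (standard deviation over mean) of its recent values, so that terms which are still changing receive more emphasis. The following is a minimal sketch under that assumption; the function name and windowed-history interface are illustrative, not the paper's API.

```python
import numpy as np

def cov_loss_weights(loss_history):
    """Coefficient-of-variation weighting for multi-loss aggregation (illustrative).

    loss_history: array of shape (T, K) holding the last T values of K loss terms.
    Returns K weights summing to 1; losses with high relative variability
    (std / mean) over the window receive larger weights.
    """
    loss_history = np.asarray(loss_history, dtype=float)
    mean = loss_history.mean(axis=0)
    std = loss_history.std(axis=0)
    cov = std / (mean + 1e-12)            # coefficient of variation per loss term
    return cov / (cov.sum() + 1e-12)      # normalize into aggregation weights

# toy usage: a nearly converged loss vs. a still-decreasing loss over 10 steps
history = np.stack([np.linspace(1.0, 0.9, 10),
                    np.linspace(2.0, 0.5, 10)], axis=1)
weights = cov_loss_weights(history)       # the second loss gets the larger weight
total = float(weights @ history[-1])      # weighted aggregate for the current step
```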
{"title":"Data-driven robust state estimation based on EK-SVSF","authors":"Meng Liu , Xiao He","doi":"10.1016/j.neucom.2026.132869","DOIUrl":"10.1016/j.neucom.2026.132869","url":null,"abstract":"<div><div>This paper introduces a novel extension to the Extended Kalman-based Smooth Variable Structure Filter (EK-SVSF), a hybrid state estimation framework that integrates the Extended Kalman Filter (EKF) with the Smooth Variable Structure Filter (SVSF). Tailored for nonlinear systems subject to model uncertainties and external disturbances, EK-SVSF enhances estimation accuracy by leveraging the complementary strengths of its constituent filters. Nonetheless, the efficacy of EK-SVSF hinges critically on the selection of an appropriate width for the smoothing boundary layer (SBL); suboptimal values—either excessively large or small—can substantially impair filtering performance. Compounding this issue, inherent model uncertainties render the determination of an optimal SBL a formidable and enduring challenge. To mitigate this, we propose a data-driven methodology that autonomously extracts salient features from the smoothing boundary function, thereby resolving the parameter tuning dilemma under model uncertainty. Furthermore, to refine the associated multi-loss weighted aggregation, we incorporate an adaptive weighting scheme based on the coefficient of variation, enabling dynamic optimization. Empirical evaluations demonstrate that the proposed approach yields robust and resilient state estimation outcomes, even in the presence of significant model discrepancies.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"675 ","pages":"Article 132869"},"PeriodicalIF":6.5,"publicationDate":"2026-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146147232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MAD-TCN: Time series anomaly detection via multi-scale adaptive dependency temporal convolutional network
Pub Date: 2026-02-04 | DOI: 10.1016/j.neucom.2026.132954 | Neurocomputing, Volume 675, Article 132954
Yongping Dan, Zhaoyuan Wang, MengZhao Zhang, Zhuo Li
With the increasing complexity of industrial Internet of Things systems and other intelligent technologies, anomaly detection in multivariate time series has become pivotal for applications in equipment health monitoring and industrial process control. Existing methodologies often struggle to address the challenges of multivariate dependencies, temporal dynamics, and computational efficiency. Therefore, this paper introduces the Multi-scale Adaptive Dependency Temporal Convolutional Network (MAD-TCN), a lightweight and efficient model designed to overcome these limitations. MAD-TCN leverages a dual-branch architecture, utilizing both local (short-term) and global (long-term) temporal feature extraction through depthwise separable dilated convolutions, which are fused to achieve multi-scale integration. The model incorporates a cross-variable convolutional feedforward network and an adaptive gated unit to dynamically adjust dependency relationships between variables, enhancing the model’s ability to handle complex interdependencies across multiple dimensions. Comprehensive experiments on four public benchmark datasets (SMAP, SWaT, SMD, MBA) against 13 state-of-the-art methods (including LSTM-NDT, DAGMM, TimesNet, TranAD, and DTAAD) demonstrate that MAD-TCN outperforms competing methods in anomaly detection accuracy, achieving the highest or second-highest AUC and F1-scores while maintaining a parameter count of only approximately 0.026 M. In addition, compared to the best alternative, MAD-TCN achieves a 34% improvement in training and inference speed. In summary, these results demonstrate the superior performance of MAD-TCN on time series anomaly detection, combining high accuracy with computational efficiency. Source code: https://github.com/qianmo2001/MAD-TCN
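To make the dual-branch extraction concrete, here is a minimal PyTorch sketch of a depthwise separable dilated 1-D convolution with a local (small-dilation) branch and a global (large-dilation) branch fused by a 1x1 convolution. Module names, dilation rates, and the fusion choice are assumptions for illustration, not MAD-TCN's actual layers.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableDilatedConv(nn.Module):
    """Depthwise separable 1-D convolution with dilation (illustrative)."""
    def __init__(self, channels, kernel_size=3, dilation=1):
        super().__init__()
        pad = (kernel_size - 1) // 2 * dilation            # keep the sequence length
        self.depthwise = nn.Conv1d(channels, channels, kernel_size,
                                   padding=pad, dilation=dilation,
                                   groups=channels)         # one filter per channel
        self.pointwise = nn.Conv1d(channels, channels, 1)   # mix channels cheaply
        self.act = nn.GELU()

    def forward(self, x):                                   # x: (batch, channels, time)
        return self.act(self.pointwise(self.depthwise(x)))

class DualBranchBlock(nn.Module):
    """Local (small dilation) and global (large dilation) branches fused by a 1x1 conv."""
    def __init__(self, channels):
        super().__init__()
        self.local_branch = DepthwiseSeparableDilatedConv(channels, dilation=1)
        self.global_branch = DepthwiseSeparableDilatedConv(channels, dilation=8)
        self.fuse = nn.Conv1d(2 * channels, channels, 1)

    def forward(self, x):
        return self.fuse(torch.cat([self.local_branch(x),
                                    self.global_branch(x)], dim=1))

x = torch.randn(4, 32, 100)            # (batch, variables as channels, time steps)
y = DualBranchBlock(32)(x)             # output keeps the input shape
```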
{"title":"MAD-TCN: Time series anomaly detection via multi-scale adaptive dependency temporal convolutional network","authors":"Yongping Dan , Zhaoyuan Wang , MengZhao Zhang , Zhuo Li","doi":"10.1016/j.neucom.2026.132954","DOIUrl":"10.1016/j.neucom.2026.132954","url":null,"abstract":"<div><div>With the increasing complexity of industrial Internet of Things systems and other intelligent technologies, anomaly detection in multivariate time series has become pivotal for applications in equipment health monitoring and industrial process control. Existing methodologies often struggle with addressing the challenges of multivariate dependencies, temporal dynamics, and computational efficiency. Therefore, this paper introduces the Multi-scale Adaptive Dependency Temporal Convolutional Network (MAD-TCN), a lightweight and efficient model designed to overcome these limitations. MAD-TCN leverages a dual-branch architecture, utilizing both local (short-term) and global (long-term) temporal feature extraction through depthwise separable dilated convolutions, which are fused to achieve multiscale integration. The model incorporates a cross-variable convolutional feedforward network and an adaptive gated unit to dynamically adjust dependency relationships between variables, enhancing the model’s ability to handle complex interdependencies across multiple dimensions. Comprehensive experiments on four public benchmark datasets (SMAP, SWaT, SMD, MBA) alongside 13 state-of-the-art methods (including LSTM-NDT, DAGMM, TimesNet, TranAD and DTAAD) demonstrate that MAD-TCN outperforms the competition in terms of anomaly detection accuracy, achieving the highest or second-highest AUC and F1-scores, while maintaining a parameter count of only approximately 0.026 M. In addition, compared to the best alternative, MAD-TCN achieves a 34% improvement in training and inference speed. In summary, these experimental results fully demonstrate the superior performance of MAD-TCN in the time series anomaly detection task with both high accuracy and computational efficiency.Source code: <span><span>https://github.com/qianmo2001/MAD-TCN</span><svg><path></path></svg></span></div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"675 ","pages":"Article 132954"},"PeriodicalIF":6.5,"publicationDate":"2026-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146147509","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
LumiGAN: Memory-guided dual-branch learning for real-world low-light image enhancement
Pub Date: 2026-02-03 | DOI: 10.1016/j.neucom.2026.132946 | Neurocomputing, Volume 674, Article 132946
Aoping Hong, Xiangyu Chen, Hongying Tang, Jiuhang Wang, Baoqing Li
Under low-light conditions, real-world images often exhibit significant illumination variations and uneven image quality. However, existing algorithms typically employ uniform enhancement strategies that disregard semantic consistency when processing such images, leading to issues such as overexposure, underexposure, or amplified noise and artifacts in shadow regions. To address this issue, we propose LumiGAN, a memory-based dual-branch network for low-light image enhancement. Specifically, LumiGAN utilizes a Quality Assessment Module (QAM) to segment images into regions requiring different enhancement levels. These regions are then processed by the encoder, which comprises the Spatial Dual-Branch Encoder Module (SDEM) and the Frequency Dual-Branch Encoder Module (FDEM). The SDEM extracts local and non-local features in the network’s shallow layers through convolutions with varying receptive fields, while the FDEM captures global illumination and structural information in the deeper layers. Furthermore, these encoders optimize feature segmentation and extraction through dual-branch feature interaction. Finally, the decoder fuses and reconstructs the dual-branch features. Additionally, a memory bank module is introduced in the network’s intermediate layers. Drawing inspiration from human visual memory principles, this module enhances the semantic information of intermediate-layer features, thereby improving the consistency between the original and enhanced images. Comprehensive qualitative and quantitative evaluations on benchmark datasets demonstrate that our algorithm not only improves image brightness uniformity but also effectively suppresses noise and artifacts, while substantially boosting semantic consistency and image aesthetic quality. Codes and models are available at https://github.com/lLIVHT/LumiGAN.
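The abstract does not describe the memory bank's read mechanism; one common formulation lets intermediate features attend over a learned set of prototype vectors and fuses the retrieved memory back into the features. The sketch below follows that assumption, with slot count, dimensions, and the concatenation-plus-projection fusion chosen purely for illustration.

```python
import torch
import torch.nn as nn

class MemoryBank(nn.Module):
    """Attention-style read over a learned bank of prototype vectors (illustrative)."""
    def __init__(self, num_slots=64, dim=128):
        super().__init__()
        self.slots = nn.Parameter(torch.randn(num_slots, dim))  # learnable prototypes
        self.out = nn.Linear(2 * dim, dim)                      # fuse feature + memory

    def forward(self, feats):                    # feats: (batch, tokens, dim)
        scores = feats @ self.slots.t() / feats.shape[-1] ** 0.5
        attn = torch.softmax(scores, dim=-1)     # how strongly each token reads each slot
        read = attn @ self.slots                 # retrieved memory, same shape as feats
        return self.out(torch.cat([feats, read], dim=-1))

feats = torch.randn(2, 256, 128)                 # e.g. flattened intermediate features
augmented = MemoryBank()(feats)                  # memory-augmented features, (2, 256, 128)
```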
{"title":"LumiGAN: Memory-guided dual-branch learning for real-world low-light image enhancement","authors":"Aoping Hong , Xiangyu Chen , Hongying Tang , Jiuhang Wang , Baoqing Li","doi":"10.1016/j.neucom.2026.132946","DOIUrl":"10.1016/j.neucom.2026.132946","url":null,"abstract":"<div><div>Under low-light conditions, real-world images often exhibit significant illumination variations and uneven image quality. However, existing algorithms typically employ uniform enhancement strategies that disregard semantic consistency when processing such images, leading to issues such as overexposure, underexposure, or amplified noise and artifacts in shadow regions. To address this issue, we propose LumiGAN, a memory-based dual-branch network for low-light image enhancement. Specifically, LumiGAN utilizes a Quality Assessment Module (QAM) to segment images into regions requiring different enhancement levels. These regions are then processed by the encoder, which comprises the Spatial Dual-Branch Encoder Module (SDEM) and the Frequency Dual-Branch Encoder Module (FDEM). The SDEM extracts local and non-local features in network’s shallow layers through convolutions with varying receptive fields, while the FDEM captures global illumination and structural information in the deeper layers. Furthermore, these encoders optimize feature segmentation and extraction through dual-branch feature interaction. Finally, the decoder fuses and reconstructs the dual-branch features. Additionally, a memory bank module is introduced in the network’s intermediate layers. Drawing inspiration from human visual memory principles, this module enhances the semantic information of intermediate-layer features, thereby improving the consistency between the original image and the enhanced image. Comprehensive qualitative and quantitative evaluations on benchmark datasets demonstrate that our algorithm not only improves image brightness uniformity but also effectively suppresses noise and artifacts, while substantially boosting semantic consistency and image aesthetic quality. Codes and models are available at <span><span>https://github.com/lLIVHT/LumiGAN</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"674 ","pages":"Article 132946"},"PeriodicalIF":6.5,"publicationDate":"2026-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146191842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A practical guide to streaming continual learning
Pub Date: 2026-02-03 | DOI: 10.1016/j.neucom.2026.132951 | Neurocomputing, Volume 674, Article 132951
Andrea Cossu, Federico Giannini, Giacomo Ziffer, Alessio Bernardo, Alexander Gepperth, Emanuele Della Valle, Barbara Hammer, Davide Bacciu
Continual Learning (CL) and Streaming Machine Learning (SML) study the ability of agents to learn from a stream of non-stationary data. Despite sharing some similarities, they address different and complementary challenges. While SML focuses on rapid adaptation after changes (concept drifts), CL aims to retain past knowledge when learning new tasks. After a brief introduction to CL and SML, we discuss Streaming Continual Learning (SCL), an emerging paradigm providing a unifying solution to real-world problems, which may require both SML and CL abilities. We claim that SCL can i) connect the CL and SML communities, motivating their work towards the same goal, and ii) foster the design of hybrid approaches that can quickly adapt to new information (as in SML) without forgetting previous knowledge (as in CL). We conclude the paper with a motivating example and a set of experiments, highlighting the need for SCL by showing how CL and SML alone struggle to achieve rapid adaptation and knowledge retention.
{"title":"A practical guide to streaming continual learning","authors":"Andrea Cossu , Federico Giannini , Giacomo Ziffer , Alessio Bernardo , Alexander Gepperth , Emanuele Della Valle , Barbara Hammer , Davide Bacciu","doi":"10.1016/j.neucom.2026.132951","DOIUrl":"10.1016/j.neucom.2026.132951","url":null,"abstract":"<div><div>Continual Learning (CL) and Streaming Machine Learning (SML) study the ability of agents to learn from a stream of non-stationary data. Despite sharing some similarities, they address different and complementary challenges. While SML focuses on rapid adaptation after changes (concept drifts), CL aims to retain past knowledge when learning new tasks. After a brief introduction to CL and SML, we discuss Streaming Continual Learning (SCL), an emerging paradigm providing a unifying solution to real-world problems, which may require both SML and CL abilities. We claim that SCL can i) connect the CL and SML communities, motivating their work towards the same goal, and ii) foster the design of hybrid approaches that can quickly adapt to new information (as in SML) without forgetting previous knowledge (as in CL). We conclude the paper with a motivating example and a set of experiments, highlighting the need for SCL by showing how CL and SML alone struggle to achieve rapid adaptation and knowledge retention.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"674 ","pages":"Article 132951"},"PeriodicalIF":6.5,"publicationDate":"2026-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146192099","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SynHOI: Multi-granularity GAN synthesizer for generative zero-shot HOI detection
Pub Date: 2026-02-03 | DOI: 10.1016/j.neucom.2026.132947 | Neurocomputing, Volume 674, Article 132947
Caixia Yan, Yan Kou, Chuan Liu
Zero-shot Human-Object Interaction (HOI) detection has emerged as a new challenge that aims to precisely identify human-object interactions without relying on specific prior training data. Existing visual-semantic mapping-based approaches tackle this challenge by transferring knowledge from external sources or exploring compositional techniques. However, due to the lack of training samples for unseen classes, these methods often overfit to the available seen data and fail to generalize well to novel and diverse HOI categories in the long-tailed distribution. Thus, in this work, we propose to synthesize visual features for unseen HOI categories conditioned on the semantic embedding of the corresponding category, enabling the model to learn both seen and unseen HOI instances in the visual domain. In pursuit of this objective, we develop an innovative unseen HOI synthesizer by unleashing the power of Generative Adversarial Networks (GANs). Considering the flexibility and complexity of zero-shot HOI task settings, we design a multi-granularity GAN synthesizer to generate both composite HOI features and the basic elements of subjects, verbs, and objects, which are then fused to provide enriched training data for the unseen HOI classifier. To further enhance the quality of HOI feature synthesis, we customize both inter-cluster and intra-cluster contrastive learning, as well as composition-augmented generation strategies, to facilitate the learning process of the GANs. Extensive experiments demonstrate that the proposed method can synthesize appropriate visual features for various unobserved HOI categories and thus performs favorably in multiple zero-shot HOI detection settings.
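Feature-generating GANs for zero-shot recognition typically condition the generator on noise plus the class semantic embedding and use the synthesized features to train a classifier for unseen classes. The sketch below shows only that basic conditional generator, not SynHOI's multi-granularity design; layer sizes and embedding dimensions are placeholders.

```python
import torch
import torch.nn as nn

class ConditionalFeatureGenerator(nn.Module):
    """G(z, s): noise plus class semantic embedding -> synthetic visual feature."""
    def __init__(self, noise_dim=128, sem_dim=300, feat_dim=1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim + sem_dim, 2048), nn.LeakyReLU(0.2),
            nn.Linear(2048, feat_dim), nn.ReLU())   # non-negative, backbone-like features

    def forward(self, noise, semantics):
        return self.net(torch.cat([noise, semantics], dim=-1))

# synthesize training features for an unseen HOI category from its semantic embedding
semantics = torch.randn(16, 300)                 # placeholder embedding of, e.g., "ride horse"
noise = torch.randn(16, 128)
fake_feats = ConditionalFeatureGenerator()(noise, semantics)
# fake_feats would then be used, together with real seen-class features,
# to train the classifier covering unseen HOI categories
```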
{"title":"SynHOI: Multi-granularity GAN synthesizer for generative zero-shot HOI detection","authors":"Caixia Yan , Yan Kou , Chuan Liu","doi":"10.1016/j.neucom.2026.132947","DOIUrl":"10.1016/j.neucom.2026.132947","url":null,"abstract":"<div><div>Zero-shot Human-Object Interaction (HOI) detection has emerged as a new challenge that aims to precisely identify human-object interactions without relying on specific prior training data. The existing visual-semantic mapping-based approaches tackle this challenge by transferring knowledge from external sources or exploring compositional techniques. However, due to the lack of training samples for unseen classes, these methods often suffer from the issue of overfitting to the available seen data and fail to generalize well to novel and diverse HOI categories of long-tail distribution. Thus, in this work, we propose to synthesize visual features for unseen HOI categories conditioned on the semantic embedding of the corresponding category, enabling the model to learn both seen and unseen HOI instances in the visual domain. In pursuit of this objective, we develop an innovative unseen HOI synthesizer by unleashing the power of Generative Adversarial Networks (GANs). Considering the flexibility and complexity of the zero-shot HOI task settings, we design a multi-granularity GAN synthesizer to generate both composite HOI features and basic elements of subjects, verbs and objects, which are then fused to provide enriched training data for the unseen HOI classifier. To further enhance the quality of HOI feature synthesis, we have customized both inter-cluster, and intra-cluster contrastive learning and composition augmented generation strategies to facilitate the learning process of GANs. Extensive experiments demonstrate that the proposed method can synthesize appropriate visual features for various unobserved HOI categories, and thus performs favorably in multiple zero-shot HOI detection settings.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"674 ","pages":"Article 132947"},"PeriodicalIF":6.5,"publicationDate":"2026-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146192248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SSDMamba: A spectral–spatial dual-branch Mamba for hyperspectral image classification
Pub Date: 2026-02-03 | DOI: 10.1016/j.neucom.2026.132944 | Neurocomputing, Volume 674, Article 132944
Zhaopeng Deng, Zheng Zhou, Haoran Zhao, Gengshen Wu, Xin Sun
Hyperspectral image (HSI) classification represents a critical research focus in remote sensing. However, effectively and efficiently modeling complex spectral–spatial relationships remains a fundamental challenge. While convolutional neural networks (CNNs) and Transformers have gained widespread adoption for HSI classification, CNNs struggle to capture long-range dependencies, and Transformers suffer from quadratic computational complexity. Recently, a selective state-space model (SSM) named Mamba has demonstrated significant potential. Nevertheless, directly applying Mamba to HSI classification poses substantial challenges due to intricate spectral–spatial interactions. To address this, we propose a novel Mamba-based architecture for HSI classification, the Spectral–Spatial Dual-Branch Mamba (SSDMamba), which models both spectral and spatial data and then efficiently fuses the two types of information. Specifically, we design a DS Spatial Mamba (DSSM) block, utilizing unidirectional scanning, to process spatial long-range information in a lightweight manner. Subsequently, we propose an FFT Spectral Mamba (FSM) block, which efficiently processes spectral data to establish global connections across the spectrum. Finally, a Dynamic Interactive Fusion Module (DIFM) dynamically and efficiently fuses spectral and spatial features. Extensive experiments on four benchmark HSI datasets demonstrate that SSDMamba achieves significantly higher accuracy with fewer parameters compared to other methods.
{"title":"SSDMamba: A spectral–spatial dual-branch mamba for hyperspectral image classification","authors":"Zhaopeng Deng , Zheng Zhou , Haoran Zhao , Gengshen Wu , Xin Sun","doi":"10.1016/j.neucom.2026.132944","DOIUrl":"10.1016/j.neucom.2026.132944","url":null,"abstract":"<div><div>Hyperspectral image (HSI) classification represents a critical research focus in remote sensing. However, effectively and efficiently modeling complex spectral-spatial relationships remains a fundamental challenge. While convolutional neural networks (CNNs) and Transformers have gained widespread adoption for HSI classification, CNNs struggle to capture long-range dependencies, and Transformers suffer from quadratic computational complexity. Recently, a selective state-space model (SSM) named Mamba has demonstrated significant potential. Nevertheless, directly applying Mamba to HSI classification poses substantial challenges due to intricate spectral-spatial interactions. To address this, we propose a novel Mamba-based architecture for HSI classification named Spectral–Spatial Dual-Branch Mamba for Hyperspectral Image Classification (SSDMamba), which models both spectral and spatial data and then efficiently fuses the two types of information. Specifically, we design a DS Spatial Mamba (DSSM) block, utilizing unidirectional scanning, to process spatial long-range information in a lightweight manner. Subsequently, we propose an FFT Spectral Mamba (FSM) block, which efficiently processes spectral data to establish global connections within the spectral data. Finally, a Dynamic Interactive Fusion Module (DIFM) dynamically and efficiently fuses spectral and spatial features. Extensive experiments on four benchmark HSI datasets demonstrate that SSDMamba achieves significantly higher accuracy with fewer parameters compared to other methods.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"674 ","pages":"Article 132944"},"PeriodicalIF":6.5,"publicationDate":"2026-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146192216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
STIMI: A masked image modeling framework for spatiotemporal wind speed reconstruction
Pub Date: 2026-02-03 | DOI: 10.1016/j.neucom.2026.132945 | Neurocomputing, Volume 674, Article 132945
Kai Qu, Shuangsi Xue, Xiaodong Zheng, Hui Cao
While deep learning has revolutionized time-series imputation, prevailing sequence-based paradigms remain vulnerable when handling contiguous block missing patterns, where the loss of local context leads to significant performance degradation. To dismantle this barrier, this paper introduces the SpatioTemporal Image Mask Imputer (STIMI), a vision-inspired framework that recasts high-dimensional wind speed reconstruction as a grayscale image reconstruction task. STIMI introduces a Mutual Information (MI)-based re-indexing strategy that reshapes irregular time series into a structured 2D grid, helping the model better recognize and recover missing patterns. It further adopts an asymmetric encoder–decoder architecture with a Multi-Scale Window Self-Attention (MSWSA) mechanism to efficiently capture multi-granularity spatiotemporal dependencies at reduced computational cost. Furthermore, the framework optimizes a dual-objective hybrid loss function, synergizing Mean Squared Error (MSE) with Kullback-Leibler (KL) divergence to ensure both point-wise fidelity and global distributional consistency. Extensive experiments confirm that STIMI consistently outperforms state-of-the-art baselines, demonstrating superior resilience particularly in extreme block missing scenarios. Finally, SHAP-based interpretability analysis reveals the model’s ability to prioritize local contextual information through a distance-dependent hierarchy, establishing STIMI as a trustworthy and explainable solution for data-intensive renewable energy applications.
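As a rough illustration of the dual-objective loss, the sketch below pairs a masked MSE term with a KL divergence between soft value histograms of the reconstruction and the ground truth. The abstract names only the MSE and KL components, so the histogram construction, kernel width, and weighting factor here are assumptions.

```python
import torch
import torch.nn.functional as F

def hybrid_imputation_loss(pred, target, mask, bins=50, alpha=0.1):
    """Masked MSE plus KL divergence between soft value histograms (illustrative)."""
    # point-wise fidelity, evaluated on the imputed (missing) positions only
    mse = F.mse_loss(pred[mask], target[mask])

    # global distributional consistency via soft histograms on a fixed value grid
    lo, hi = float(target.min()), float(target.max())
    centers = torch.linspace(lo, hi, bins, device=pred.device)

    def soft_hist(x):
        w = torch.exp(-((x.reshape(-1, 1) - centers) ** 2) / 0.01)
        return (w.sum(dim=0) / w.sum()).clamp_min(1e-8)

    p, q = soft_hist(target), soft_hist(pred)
    kl = F.kl_div(q.log(), p, reduction="sum")    # KL(p || q); first argument is log q
    return mse + alpha * kl

pred = torch.randn(8, 32, 10)                     # (batch, time steps, stations)
target = torch.randn(8, 32, 10)
mask = torch.rand_like(target) < 0.2              # simulated missing positions
loss = hybrid_imputation_loss(pred, target, mask)
```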
{"title":"STIMI: A masked image modeling framework for spatiotemporal wind speed reconstruction","authors":"Kai Qu, Shuangsi Xue, Xiaodong Zheng, Hui Cao","doi":"10.1016/j.neucom.2026.132945","DOIUrl":"10.1016/j.neucom.2026.132945","url":null,"abstract":"<div><div>While deep learning has revolutionized time-series imputation, prevailing sequence-based paradigms remain vulnerable when handling contiguous block missing patterns, where the loss of local context leads to significant performance degradation. To dismantle this barrier, this paper introduces the <u>S</u>patio<u>T</u>emporal <u>I</u>mage <u>M</u>ask <u>I</u>mputer (STIMI), a vision-inspired framework that recasts high-dimensional wind speed reconstruction as a grayscale image reconstruction task. STIMI introduces a Mutual Information (MI)-based re-indexing strategy that reshapes irregular time series into a structured 2D grid, helping the model better recognize and recover missing patterns. It further adopts an asymmetric encoder–decoder architecture with a Multi-Scale Window Self-Attention (MSWSA) mechanism to efficiently capture multi-granularity spatiotemporal dependencies at reduced computational cost. Furthermore, the framework optimizes a dual-objective hybrid loss function, synergizing Mean Squared Error (MSE) with Kullback-Leibler (KL) divergence to ensure both point-wise fidelity and global distributional consistency. Extensive experiments confirm that STIMI consistently outperforms state-of-the-art baselines, demonstrating superior resilience particularly in extreme block missing scenarios. Finally, SHAP-based interpretability analysis reveals the model’s ability to prioritize local contextual information through a distance-dependent hierarchy, establishing STIMI as a trustworthy and explainable solution for data-intensive renewable energy applications.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"674 ","pages":"Article 132945"},"PeriodicalIF":6.5,"publicationDate":"2026-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146192251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep learning-based remote sensing image super-resolution: Recent advances and challenges
Pub Date: 2026-02-03 | DOI: 10.1016/j.neucom.2026.132939 | Neurocomputing, Volume 674, Article 132939
Jiawei Yang, Hongliang Ren, Zhichao He, Mengjie Zeng
As a crucial data source for Earth science research and spatial information applications, remote sensing images often face limitations in spatial resolution due to factors such as sensor performance, imaging conditions, and costs, making it challenging to meet the growing demand for fine-grained analysis. In recent years, deep learning-based remote sensing image super-resolution (RSISR) technology has demonstrated significant potential by reconstructing high-resolution (HR) details from low-resolution (LR) remote sensing images, quickly becoming a research hotspot. However, systematic reviews of RSISR methodologies, network architecture evolution, domain development characteristics, and future directions remain relatively scarce. To address this, this study comprehensively reviews the major advancements in the field since 2020 based on the development of deep learning-based RSISR frameworks. First, it defines the RSISR problem, provides a comprehensive statistical analysis of RSISR algorithms published from 2020 onward, and selects over 100 deep learning-related publications for in-depth study. Subsequently, existing research is systematically categorized according to methodological principles: supervised learning methods are divided into six categories based on convolutional neural networks, attention mechanisms, generative adversarial networks, Transformers, diffusion models, and Mamba, while unsupervised learning methods are grouped into four frameworks including self-supervised learning, contrastive learning, zero-shot learning, and generative methods. Additionally, commonly used datasets, loss functions, and evaluation metrics in RSISR tasks are reviewed, and existing performance assessment methods are discussed in detail. Finally, the study summarizes the current development trends, future directions, and key challenges in the field, aiming to provide theoretical reference and practical guidance for related research.
{"title":"Deep learning-based remote sensing image super-resolution: Recent advances and challenges","authors":"Jiawei Yang, Hongliang Ren, Zhichao He, Mengjie Zeng","doi":"10.1016/j.neucom.2026.132939","DOIUrl":"10.1016/j.neucom.2026.132939","url":null,"abstract":"<div><div>As a crucial data source for Earth science research and spatial information applications, remote sensing images often face limitations in spatial resolution due to factors such as sensor performance, imaging conditions, and costs, making it challenging to meet the growing demand for fine-grained analysis. In recent years, deep learning-based remote sensing image super-resolution (RSISR) technology has demonstrated significant potential by reconstructing high-resolution (HR) details from low-resolution (LR) remote sensing images, quickly becoming a research hotspot. However, systematic reviews of RSISR methodologies, network architecture evolution, domain development characteristics, and future directions remain relatively scarce. To address this, this study comprehensively reviews the major advancements in the field since 2020 based on the development of deep learning-based RSISR frameworks. First, it defines the RSISR problem, provides a comprehensive statistical analysis of RSISR algorithms published from 2020 onward, and selects over 100 deep learning-related publications for in-depth study. Subsequently, existing research is systematically categorized according to methodological principles: supervised learning methods are divided into six categories based on convolutional neural networks, attention mechanisms, generative adversarial networks, Transformers, diffusion models, and Mamba, while unsupervised learning methods are grouped into four frameworks including self-supervised learning, contrastive learning, zero-shot learning, and generative methods. Additionally, commonly used datasets, loss functions, and evaluation metrics in RSISR tasks are reviewed, and existing performance assessment methods are discussed in detail. Finally, the study summarizes the current development trends, future directions, and key challenges in the field, aiming to provide theoretical reference and practical guidance for related research.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"674 ","pages":"Article 132939"},"PeriodicalIF":6.5,"publicationDate":"2026-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146192042","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Anti-DETR: End-to-end anti-drone visual detection network based on wavelet convolution
Pub Date: 2026-02-03 | DOI: 10.1016/j.neucom.2026.132935 | Neurocomputing, Volume 674, Article 132935
Jiarui Zhang, Zhihua Chen, Chun Zheng, Wenjun Yi, Guoxu Yan, Yi Wang
With the advancement of unmanned aerial vehicle technology, visual detection for anti-drone tasks in air-to-air scenarios has become increasingly critical. However, detecting fast-moving small UAVs in complex backgrounds remains challenging due to interference from background noise and blurred target edges, resulting in poor detection accuracy. To address these issues, we propose Anti-DETR, an end-to-end detection network leveraging wavelet convolution specifically for small-target UAV detection. Anti-DETR consists of three key components: first, the Global Multi-channel Wavelet Residual Network, which expands the receptive field through wavelet convolution and efficiently localizes targets with a global multi-channel attention mechanism; second, the Multi-scale Refined Feature Pyramid Network, employing an Adaptive Global Calibration Attention Unit to integrate fine-grained shallow features and deep semantic features, enhancing multi-scale feature representation; finally, the Histogram Self-Attention mechanism, which classifies pixel-level features to improve feature perception in complex backgrounds. Evaluations on the Det-Fly, DUT-Anti-UAV, and HazyDet datasets demonstrate that Anti-DETR surpasses several state-of-the-art methods and classical detectors, confirming its effectiveness and generalizability for accurate anti-UAV detection tasks under challenging environmental conditions. The code is available at https://github.com/Image-Zhang/anti-detr.
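For readers unfamiliar with wavelet convolution, the sketch below shows the basic mechanism: a fixed single-level Haar decomposition implemented as a strided convolution, followed by a small convolution on the half-resolution sub-bands, which enlarges the effective receptive field at low cost. This is a toy version of the idea, not Anti-DETR's Global Multi-channel Wavelet Residual Network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HaarWaveletConv(nn.Module):
    """Fixed single-level Haar decomposition followed by a conv on the sub-bands."""
    def __init__(self, channels):
        super().__init__()
        ll = torch.tensor([[0.5, 0.5], [0.5, 0.5]])      # low-pass filter
        lh = torch.tensor([[0.5, -0.5], [0.5, -0.5]])    # detail: differences along width
        hl = torch.tensor([[0.5, 0.5], [-0.5, -0.5]])    # detail: differences along height
        hh = torch.tensor([[0.5, -0.5], [-0.5, 0.5]])    # detail: diagonal differences
        k = torch.stack([ll, lh, hl, hh]).unsqueeze(1)   # (4, 1, 2, 2) fixed filters
        self.register_buffer("kernel", k.repeat(channels, 1, 1, 1))
        self.channels = channels
        self.conv = nn.Conv2d(4 * channels, 4 * channels, 3, padding=1)

    def forward(self, x):                                # x: (B, C, H, W), H and W even
        subbands = F.conv2d(x, self.kernel, stride=2, groups=self.channels)
        # (B, 4C, H/2, W/2): the 3x3 conv now spans a 6x6 region of the original image
        return self.conv(subbands)

out = HaarWaveletConv(16)(torch.randn(1, 16, 64, 64))    # -> (1, 64, 32, 32)
```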
{"title":"Anti-DETR: End-to-end anti-drone visual detection network based on wavelet convolution","authors":"Jiarui Zhang , Zhihua Chen , Chun Zheng , Wenjun Yi , Guoxu Yan , Yi Wang","doi":"10.1016/j.neucom.2026.132935","DOIUrl":"10.1016/j.neucom.2026.132935","url":null,"abstract":"<div><div>With the advancement of unmanned aerial vehicle technology, visual detection for anti-drone tasks in air-to-air scenarios has become increasingly critical. However, detecting fast-moving small UAVs in complex backgrounds remains challenging due to interference from background noise and blurred target edges, resulting in poor detection accuracy. To address these issues, we propose Anti-DETR, an end-to-end detection network leveraging wavelet convolution specifically for small-target UAV detection. Anti-DETR consists of three key components: first, the Global Multi-channel Wavelet Residual Network, which expands the receptive field through wavelet convolution and efficiently localizes targets with a global multi-channel attention mechanism; second, the Multi-scale Refined Feature Pyramid Network, employing an Adaptive Global Calibration Attention Unit to integrate fine-grained shallow features and deep semantic features, enhancing multi-scale feature representation; finally, the Histogram Self-Attention mechanism, which classifies pixel-level features to improve feature perception in complex backgrounds. Evaluations on the Det-Fly, DUT-Anti-UAV, and HazyDet datasets demonstrate that Anti-DETR surpasses several state-of-the-art methods and classical detectors, confirming its effectiveness and generalizability for accurate anti-UAV detection tasks under challenging environmental conditions. The code is available at <span><span>https://github.com/Image-Zhang/anti-detr</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"674 ","pages":"Article 132935"},"PeriodicalIF":6.5,"publicationDate":"2026-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146192102","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Intrinsic dimensionality as a model-free measure of class imbalance
Pub Date: 2026-02-03 | DOI: 10.1016/j.neucom.2026.132938 | Neurocomputing, Volume 674, Article 132938
Cagri Eser, Zeynep Sonat Baltaci, Emre Akbas, Sinan Kalkan
Imbalance in classification tasks is commonly quantified by the cardinalities of examples across classes. This, however, disregards the presence of redundant examples and inherent differences in the learning difficulties of classes. Alternatively, one can use complex measures such as training loss and uncertainty, which, however, depend on training a machine learning model. Our paper proposes using data Intrinsic Dimensionality (ID) as an easy-to-compute, model-free measure of imbalance that can be seamlessly incorporated into various imbalance mitigation methods. Our results across five different datasets with a diverse range of imbalance ratios show that ID consistently outperforms cardinality-based re-weighting and re-sampling techniques used in the literature. Moreover, we show that combining ID with cardinality can further improve performance. Our code and models are available at https://github.com/cagries/IDIM.
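Intrinsic dimensionality can be estimated in several ways; the Levina-Bickel maximum-likelihood estimator is one standard choice. The sketch below computes it per class and turns the values into class weights, as a hypothetical stand-in for cardinality-based re-weighting (function names and the normalization are illustrative, not the paper's recipe).

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def mle_intrinsic_dimension(X, k=10):
    """Levina-Bickel maximum-likelihood estimate of intrinsic dimensionality."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dist, _ = nn.kneighbors(X)                # column 0 is each point itself (distance 0)
    dist = dist[:, 1:]                        # distances to the k nearest neighbours
    log_ratio = np.log(dist[:, -1:] / dist[:, :-1])
    id_per_point = (k - 1) / log_ratio.sum(axis=1)
    return float(id_per_point.mean())

def id_based_class_weights(X, y, k=10):
    """Hypothetical re-weighting: class weight proportional to per-class ID."""
    classes = np.unique(y)
    ids = np.array([mle_intrinsic_dimension(X[y == c], k) for c in classes])
    weights = ids / ids.sum()
    return dict(zip(classes.tolist(), weights.tolist()))

# toy usage: a class lying near a 1-D manifold vs. a full-rank Gaussian class in 10-D
rng = np.random.default_rng(0)
t = rng.normal(size=(200, 1))
X = np.vstack([np.hstack([t, t, np.zeros((200, 8))]),   # low intrinsic dimensionality
               rng.normal(size=(200, 10))])             # high intrinsic dimensionality
y = np.array([0] * 200 + [1] * 200)
print(id_based_class_weights(X, y))                     # class 1 receives the larger weight
```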
{"title":"Intrinsic dimensionality as a model-free measure of class imbalance","authors":"Cagri Eser , Zeynep Sonat Baltaci , Emre Akbas , Sinan Kalkan","doi":"10.1016/j.neucom.2026.132938","DOIUrl":"10.1016/j.neucom.2026.132938","url":null,"abstract":"<div><div>Imbalance in classification tasks is commonly quantified by the cardinalities of examples across classes. This, however, disregards the presence of redundant examples and inherent differences in the learning difficulties of classes. Alternatively, one can use complex measures such as training loss and uncertainty, which, however, depend on training a machine learning model. Our paper proposes using data Intrinsic Dimensionality (ID) as an easy-to-compute, model-free measure of imbalance that can be seamlessly incorporated into various imbalance mitigation methods. Our results across five different datasets with a diverse range of imbalance ratios show that ID consistently outperforms cardinality-based re-weighting and re-sampling techniques used in the literature. Moreover, we show that combining ID with cardinality can further improve performance. Our code and models are available at <span><span>https://github.com/cagries/IDIM</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"674 ","pages":"Article 132938"},"PeriodicalIF":6.5,"publicationDate":"2026-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146192215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}