AHIR: Deep learning-based autoencoder hashing image retrieval
Pub Date: 2026-01-08 | DOI: 10.1016/j.neucom.2026.132639
Ahmet Yilmaz, Uğur Erkan, Abdurrahim Toktas, Qiang Lai, Suo Gao
Deep learning-based image retrieval (IR) approaches, despite their promise of automatic feature extraction, suffer from several limitations, including insufficient semantic representation, suboptimal retrieval performance, and limited evaluation across different hash code lengths. To address these limitations, a novel deep learning-based Autoencoder Hashing IR (AHIR) algorithm is proposed, combining the strengths of ResNet50 and autoencoder architectures. In this integrated model, ResNet50 extracts the semantic features of images, while the autoencoder compresses these features to the required dimensions and transforms them into hash codes. The study's contributions include capturing both low-level and high-level features, streamlining IR for large-scale databases, and enhancing efficiency in supervised learning scenarios. Furthermore, a comparative analysis of various reported IR algorithms is presented, highlighting the performance of AHIR against its counterparts on the MS-COCO, NUS-WIDE, and MIRFLICKR-25K datasets. AHIR outperforms the existing methods with the highest mAP scores of 0.9103, 0.9007, and 0.9136 for MS-COCO, NUS-WIDE, and MIRFLICKR-25K, respectively. The results demonstrate the superior IR performance of AHIR, owing to its novel integrated autoencoder-based hashing mechanism.
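As a concrete illustration of this pipeline, the following minimal PyTorch sketch wires a ResNet50 backbone to an autoencoder that compresses 2048-dimensional features into a relaxed hash code; the layer widths, tanh relaxation, and sign-based binarization are our assumptions, not details confirmed by the paper.

```python
# Minimal sketch of a ResNet50 + autoencoder hashing pipeline (our reading of
# the abstract; hidden sizes and the binarization step are assumptions).
import torch
import torch.nn as nn
from torchvision import models

class AutoencoderHash(nn.Module):
    def __init__(self, feat_dim=2048, hash_bits=64):
        super().__init__()
        # ResNet50 backbone without its classification head: 2048-d features.
        backbone = models.resnet50(weights=None)
        self.features = nn.Sequential(*list(backbone.children())[:-1])
        # Encoder compresses features to the desired hash length.
        self.encoder = nn.Sequential(
            nn.Linear(feat_dim, 512), nn.ReLU(),
            nn.Linear(512, hash_bits), nn.Tanh())   # tanh keeps codes in [-1, 1]
        # Decoder reconstructs features so the code stays information-rich.
        self.decoder = nn.Sequential(
            nn.Linear(hash_bits, 512), nn.ReLU(),
            nn.Linear(512, feat_dim))

    def forward(self, x):
        f = self.features(x).flatten(1)          # (B, 2048)
        code = self.encoder(f)                   # relaxed hash code
        recon = self.decoder(code)
        return code, recon, f

model = AutoencoderHash(hash_bits=64)
images = torch.randn(4, 3, 224, 224)
code, recon, feats = model(images)
binary = torch.sign(code)                        # {-1, +1} hash bits at test time
loss = nn.functional.mse_loss(recon, feats)      # reconstruction term only; the
                                                 # paper's full objective may differ
```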
{"title":"AHIR: Deep learning-based autoencoder hashing image retrieval","authors":"Ahmet Yilmaz , Uğur Erkan , Abdurrahim Toktas , Qiang Lai , Suo Gao","doi":"10.1016/j.neucom.2026.132639","DOIUrl":"10.1016/j.neucom.2026.132639","url":null,"abstract":"<div><div>Deep learning-based image retrieval (IR) approaches promising automatic feature extraction suffer from several limitations, including insufficient semantic representation, suboptimal retrieval performance, and limited evaluation across different hash code lengths. To address these limitations, a novel deep learning-based Autoencoder Hashing IR (AHIR) algorithm is proposed, employing the strengths of ResNet50 and autoencoder architectures. In this integrated model, ResNet50 is responsible for extracting the semantic features of images, while the autoencoder compresses these features to the required dimensions and transforms them into hash codes. The study's contributions include the ability to capture both low-level and high-level features, streamline IR for large-scale databases, and enhance efficiency in supervised learning scenarios. Furthermore, a comparative analysis of various reported IR algorithms is presented, highlighting the performance of AHIR against its counterparts for MS-COCO, NUS-WIDE, and MIRFLICKR-25K datasets. AHIR outperforms the existing methods with the highest mAP scores of 0.9103, 0.9007, and 0.9136 for MS-COCO, NUS-WIDE, and MIRFLICKR-25K, respectively. The results manifest the superior IR performance of AHIR thanks to the novel integrated autoencoder-based hashing mechanism.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"671 ","pages":"Article 132639"},"PeriodicalIF":6.5,"publicationDate":"2026-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145980681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Affine non-negative discriminative representation for biomedical image classification
Pub Date: 2026-01-08 | DOI: 10.1016/j.neucom.2026.132663
Dan Hu, Junwei Jin, Songbo Zhou, Xiang Du, Yanting Li
Biomedical imaging plays a vital role in modern healthcare by supporting accurate diagnosis and informed clinical decision-making. Although recent advances in artificial intelligence have significantly improved biomedical image analysis, persistent challenges remain, including high data variability, limited availability of annotated samples, and the need for transparent and interpretable classification models. To address these issues, this paper proposes a novel classification framework termed affine non-negative discriminative representation-based classification (ANDRC) for robust biomedical image recognition. In the proposed framework, a non-negativity constraint is imposed on the representation coefficients, which constrains each test sample to be reconstructed as a purely additive combination of training samples. By preventing subtractive contributions induced by negative coefficients, this constraint encourages stronger contributions from homogeneous samples while naturally suppressing misleading representations from heterogeneous ones, thereby enhancing both interpretability and discriminative stability. In addition, an affine constraint is incorporated to effectively model the intrinsic structure of data distributed over a union of affine subspaces, which commonly arises in real-world biomedical imaging scenarios. The resulting optimization problem is efficiently solved using the Alternating Direction Method of Multipliers, ensuring stable convergence and computational efficiency. Extensive experiments conducted on multiple biomedical image classification benchmarks demonstrate that the proposed ANDRC method consistently outperforms several state-of-the-art approaches. These results highlight the effectiveness of ANDRC in handling data variability and label scarcity, as well as its potential for practical deployment in biomedical image analysis.
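The core constraint set can be prototyped compactly. The sketch below solves the non-negative, sum-to-one (affine) representation problem with projected gradient descent onto the probability simplex rather than the paper's ADMM solver, then classifies by class-wise reconstruction residual; the data, step size, and iteration count are toy assumptions.

```python
# Sketch of representation-based classification with non-negative, affine
# (sum-to-one) coefficients; projected gradient stands in for ADMM.
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {c : c >= 0, sum(c) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def andrc_predict(X, labels, y, n_iter=500):
    """X: (d, n) training samples as columns; y: (d,) test sample."""
    n = X.shape[1]
    c = np.full(n, 1.0 / n)                          # feasible starting point
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 + 1e-12) # 1 / Lipschitz constant
    for _ in range(n_iter):
        grad = X.T @ (X @ c - y)
        c = project_simplex(c - step * grad)
    # Assign y to the class with the smallest class-wise reconstruction residual.
    residuals = {k: np.linalg.norm(y - X[:, labels == k] @ c[labels == k])
                 for k in np.unique(labels)}
    return min(residuals, key=residuals.get), c

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 30))
labels = np.repeat([0, 1, 2], 10)
y = X[:, 3] + 0.01 * rng.normal(size=20)             # near a class-0 sample
pred, coef = andrc_predict(X, labels, y)
print(pred)                                          # expected: 0
```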
{"title":"Affine non-negative discriminative representation for biomedical image classification","authors":"Dan Hu , Junwei Jin , Songbo Zhou , Xiang Du , Yanting Li","doi":"10.1016/j.neucom.2026.132663","DOIUrl":"10.1016/j.neucom.2026.132663","url":null,"abstract":"<div><div>Biomedical imaging plays a vital role in modern healthcare by supporting accurate diagnosis and informed clinical decision-making. Although recent advances in artificial intelligence have significantly improved biomedical image analysis, persistent challenges remain, including high data variability, limited availability of annotated samples, and the need for transparent and interpretable classification models. To address these issues, this paper proposes a novel classification framework termed affine non-negative discriminative representation-based classification (ANDRC) for robust biomedical image recognition. In the proposed framework, a non-negativity constraint is imposed on the representation coefficients, which enforces the test sample to be reconstructed as a purely additive combination of training samples. By preventing subtractive contributions induced by negative coefficients, this constraint encourages stronger contributions from homogeneous samples while naturally suppressing misleading representations from heterogeneous ones, thereby enhancing both interpretability and discriminative stability. In addition, an affine constraint is incorporated to effectively model the intrinsic structure of data distributed over a union of affine subspaces, which commonly arises in real-world biomedical imaging scenarios. The resulting optimization problem is efficiently solved using the Alternating Direction Method of Multipliers, ensuring stable convergence and computational efficiency. Extensive experiments conducted on multiple biomedical image classification benchmarks demonstrate that the proposed ANDRC method consistently outperforms several state-of-the-art approaches. These results highlight the effectiveness of ANDRC in handling data variability and label scarcity, as well as its potential for practical deployment in biomedical image analysis.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"671 ","pages":"Article 132663"},"PeriodicalIF":6.5,"publicationDate":"2026-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145980740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
HEIGHTS: Hierarchical graph structure learning for time series anomaly detection
Pub Date: 2026-01-08 | DOI: 10.1016/j.neucom.2026.132638
Vinitha M Rajan, Sahely Bhadra
The time series anomaly detection task aims to identify instances of abnormal system behavior by analyzing time series data. For multivariate time series data, it is crucial to learn both temporal and inter-variable relationships. Graph-based methods explicitly learn inter-variable relationships alongside temporal patterns extracted from the data. However, two major limitations exist in these models. First, they are typically trained on data assumed to be free of anomalies, which is challenging to obtain in real-world scenarios. Second, they learn relationships only at the variable level, overlooking valuable information about inherent groups of variables. We adopt a more realistic approach by training on data that include anomalies. Our model learns inter-variable and temporal relationships by simultaneously performing forecasting and reconstruction, which prevents overfitting to anomalies present in the training data. Additionally, we enhance the model by leveraging information about groups of variables through a Hierarchical Graph Neural Network, enabling more effective learning of inter-variable relationships. Our method demonstrates a significant improvement in time series anomaly detection performance, as evaluated on benchmark datasets, outperforming state-of-the-art baselines by a margin of up to 39%. We also present an in-depth analysis of the model’s behavioral efficiency through extensive experiments using synthetic and real datasets.
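A minimal sketch of the joint forecast-and-reconstruct scoring idea follows; the paper's hierarchical GNN is replaced here by a plain GRU encoder, and the equal weighting of the two errors is our assumption.

```python
# Joint forecasting + reconstruction anomaly scoring (simplified encoder).
import torch
import torch.nn as nn

class ForecastReconstruct(nn.Module):
    def __init__(self, n_vars, hidden=64):
        super().__init__()
        self.encoder = nn.GRU(n_vars, hidden, batch_first=True)
        self.forecast = nn.Linear(hidden, n_vars)      # predict the next step
        self.reconstruct = nn.Linear(hidden, n_vars)   # rebuild each input step

    def forward(self, x):                              # x: (B, window, n_vars)
        h, _ = self.encoder(x)
        return self.forecast(h[:, -1]), self.reconstruct(h)

    def anomaly_score(self, x, x_next, alpha=0.5):
        pred, recon = self(x)
        fore_err = (pred - x_next).abs().mean(dim=-1)          # (B,)
        recon_err = (recon - x).abs().mean(dim=(-2, -1))       # (B,)
        return alpha * fore_err + (1 - alpha) * recon_err

model = ForecastReconstruct(n_vars=5)
x = torch.randn(8, 16, 5)                    # batch of 16-step sliding windows
x_next = torch.randn(8, 5)                   # the observation after each window
print(model.anomaly_score(x, x_next).shape)  # torch.Size([8])
```

Training both heads on the same encoder is what discourages overfitting to anomalies: a representation that merely memorizes anomalous points reconstructs them but forecasts poorly, and vice versa.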
{"title":"HEIGHTS: Hierarchical graph structure learning for time series anomaly detection","authors":"Vinitha M Rajan, Sahely Bhadra","doi":"10.1016/j.neucom.2026.132638","DOIUrl":"10.1016/j.neucom.2026.132638","url":null,"abstract":"<div><div>The time series anomaly detection task aims to identify instances of abnormal system behavior by analyzing time series data. For multivariate time series data, it is crucial to learn both temporal and inter-variable relationships. Graph-based methods explicitly learn inter-variable relationships alongside temporal patterns extracted from the data. However, two major limitations exist in these models. First, they are typically trained on data that is assumed to be free of anomalies, which are challenging to obtain in real-world scenarios. Second, they learn relationships only at the variable level, overlooking valuable information about inherent groups of variables. We adopt a more realistic approach by training on data that include anomalies. Our model learns inter-variable and temporal relationships by simultaneously performing forecasting and reconstruction, which prevents overfitting to anomalies present in the training data. Additionally, we enhance the model by leveraging information about groups of variables through a Hierarchical Graph Neural Network, enabling more effective learning of inter-variable relationships. Our method demonstrates a significant improvement in time series anomaly detection performance, as evaluated on benchmark datasets, outperforming state-of-the-art baselines by a margin of up to <span><math><mn>39</mn></math></span>%. We also present an in-depth analysis of the model’s behavioral efficiency through extensive experiments using synthetic and real datasets.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"671 ","pages":"Article 132638"},"PeriodicalIF":6.5,"publicationDate":"2026-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145980552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Meta-evaluation of robustness metrics: An in-depth analysis
Pub Date: 2026-01-08 | DOI: 10.1016/j.neucom.2026.132651
Miquel Miró-Nicolau, Antoni Jaume-i-Capó, Gabriel Moyà-Alcover
Robustness is a pivotal attribute of trustworthy explainable artificial intelligence, ensuring that explanations remain consistent when inputs are perturbed slightly. Although numerous metrics have been proposed to quantify this robustness, their reliability remains largely unverified. This study introduces a novel meta-evaluation framework for robustness metrics, grounded in controlled, verifiable experimental setups. We propose three sanity tests: perfect explanation, normal explanation, and random output. These tests facilitate the systematic assessment of the validity of robustness metrics using transparent models. By evaluating seven state-of-the-art robustness metrics across four benchmark datasets, our results reveal significant shortcomings: no single metric consistently achieves the expected outcomes across all tests. These findings underscore fundamental flaws in robustness metrics and emphasise the necessity for improved evaluation frameworks. Our methodology provides a reproducible, assumption-light benchmark for identifying unreliable metrics before deployment in critical applications.
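The flavor of these sanity tests can be reproduced on a transparent model: a linear model's exact gradient explanation is constant, so a sound robustness metric should report near-zero instability for it and high instability for random explanations. The max-sensitivity style metric and perturbation radius below are our own illustrative choices, not the paper's protocol.

```python
# Toy sanity check: perfect explanations of a linear model vs. random output.
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=10)                      # transparent model: f(x) = w @ x

def perfect_explanation(x):
    return w                                 # exact gradient, input-independent

def random_explanation(x):
    return rng.normal(size=x.shape)          # the "random output" sanity test

def max_sensitivity(explain, x, radius=0.1, n_samples=50):
    """Worst-case explanation change under small input perturbations."""
    base = explain(x)
    worst = 0.0
    for _ in range(n_samples):
        x_pert = x + rng.uniform(-radius, radius, size=x.shape)
        worst = max(worst, np.linalg.norm(explain(x_pert) - base))
    return worst

x = rng.normal(size=10)
print(max_sensitivity(perfect_explanation, x))   # 0.0: the metric should pass
print(max_sensitivity(random_explanation, x))    # large: the metric should flag
```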
{"title":"Meta-evaluation of robustness metrics: An in-depth analysis","authors":"Miquel Miró-Nicolau , Antoni Jaume-i-Capó , Gabriel Moyà-Alcover","doi":"10.1016/j.neucom.2026.132651","DOIUrl":"10.1016/j.neucom.2026.132651","url":null,"abstract":"<div><div>Robustness is a pivotal attribute of trustworthy explainable artificial intelligence, ensuring that explanations remain consistent when inputs are perturbed slightly. Although numerous metrics have been proposed to quantify this robustness, their reliability remains largely unverified. This study introduces a novel meta-evaluation framework for robustness metrics, grounded in controlled, verifiable experimental setups. We propose three sanity tests: perfect explanation, normal explanation, and random output. These tests facilitate the systematic assessment of the validity of robustness metrics using transparent models. By evaluating seven state-of-the-art robustness metrics across four benchmark datasets, our results reveal significant shortcomings: no single metric consistently achieves the expected outcomes across all tests. These findings underscore fundamental flaws in robustness metrics and emphasise the necessity for improved evaluation frameworks. Our methodology provides a reproducible, assumption-light benchmark for identifying unreliable metrics before deployment in critical applications.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"671 ","pages":"Article 132651"},"PeriodicalIF":6.5,"publicationDate":"2026-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145980680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Towards alleviating hallucination in text-to-image retrieval for CLIP in zero-shot learning
Pub Date: 2026-01-08 | DOI: 10.1016/j.neucom.2026.132640
Hanyao Wang, Yibing Zhan, Liu Liu, Liang Ding, Yan Yang, Jun Yu
Pretrained cross-modal models, such as the representative CLIP model, have recently driven a boom in cross-modal zero-shot tasks due to their strong generalization abilities. However, we experimentally discovered that CLIP suffers from text-to-image retrieval hallucination, which limits its capabilities under zero-shot learning. Specifically, in retrieval tasks, CLIP often assigns the highest score to an incorrect image, even when it correctly understands the image’s semantic content in classification tasks. Accordingly, we propose the Balanced Score with Auxiliary Prompts (BSAP) method to address this problem. BSAP introduces auxiliary prompts that provide multiple reference outcomes for each image retrieval task. These outcomes, derived from the image and the target text, are normalized to compute a final similarity score, thereby reducing hallucinations. We further combine the original results with BSAP to generate a more robust hybrid outcome, termed BSAP-H. Extensive experiments on Referring Expression Comprehension (REC) and Referring Image Segmentation (RIS) tasks demonstrate that BSAP significantly improves the performance of CLIP and state-of-the-art vision-language models (VLMs). Code is available at https://github.com/WangHanyao/BSAP.
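A toy rendering of the balanced-score idea as we read it from the abstract: each image's raw similarity to the target text is renormalized against its similarities to a set of auxiliary prompts, so an image that scores high on everything stops dominating retrieval. Random vectors stand in for CLIP embeddings, and the prompt set is hypothetical.

```python
# Balanced scoring with auxiliary prompts (embeddings are random stand-ins).
import numpy as np

def l2norm(a):
    return a / np.linalg.norm(a, axis=-1, keepdims=True)

rng = np.random.default_rng(0)
images = l2norm(rng.normal(size=(5, 512)))       # candidate image embeddings
target = l2norm(rng.normal(size=512))            # query text embedding
aux = l2norm(rng.normal(size=(3, 512)))          # auxiliary prompt embeddings

texts = np.vstack([target, aux])                 # target first, then auxiliaries
logits = images @ texts.T                        # (5, 4) raw similarities
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

balanced = probs[:, 0]                           # P(target | image, prompt set)
raw = logits[:, 0]
print(raw.argmax(), balanced.argmax())           # the retrieval pick may differ
```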
{"title":"Towards alleviating hallucination in text-to-image retrieval for CLIP in zero-shot learning","authors":"Hanyao Wang , Yibing Zhan , Liu Liu , Liang Ding , Yan Yang , Jun Yu","doi":"10.1016/j.neucom.2026.132640","DOIUrl":"10.1016/j.neucom.2026.132640","url":null,"abstract":"<div><div>Pretrained cross-modal models, such as the representative CLIP model, have recently led to a boom in the use of pretrained models for cross-modal zero-shot tasks due to their strong generalization abilities. However, we experimentally discovered that CLIP suffers from text-to-image retrieval hallucination, which adversely limits its capabilities under zero-shot learning. Specifically, in retrieval tasks, CLIP often assigns the highest score to an incorrect image, even when it correctly understands the image’s semantic content in classification tasks. Accordingly, we propose the <strong>B</strong>alanced <strong>S</strong>core with <strong>A</strong>uxiliary <strong>P</strong>rompts (<strong>BSAP</strong>) method to address this problem. BSAP introduces auxiliary prompts that provide multiple reference outcomes for each image retrieval task. These outcomes, derived from the image and the target text, are normalized to compute a final similarity score, thereby reducing hallucinations. We further combine the original results with BSAP to generate a more robust hybrid outcome, termed BSAP-H. Extensive experiments on Referring Expression Comprehension (REC) and Referring Image Segmentation (RIS) tasks demonstrate that BSAP significantly improves the performance of CLIP and state-of-the-art vision-language models (VLMs). Code available at <span><span>https://github.com/WangHanyao/BSAP</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"671 ","pages":"Article 132640"},"PeriodicalIF":6.5,"publicationDate":"2026-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145981257","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
EEG-TFX: An interactive MATLAB toolbox for EEG feature engineering via multi-scale temporal windowing and filter banks
Pub Date: 2026-01-08 | DOI: 10.1016/j.neucom.2026.132650
Qingyue Xin, Rui Zhang, Yaoqi Hu, Weidong Zhou, Lan Tian, Guoyang Liu
This paper presents an intelligent Electroencephalogram (EEG) analysis platform (EEG-TFX) that provides a comprehensive solution for EEG signal processing. The platform adopts a modular design, integrating data preprocessing, time-frequency segmentation, feature extraction, model training, and visualization functions. It supports flexible configuration of time windows, frequency band segmentation, and multiple filter order selections. The platform offers various feature extraction methods, including entropy-based features and other popular EEG features, together with multiple classification algorithms, such as artificial neural networks, support vector machines, and decision trees. In addition, the system supports user-defined function extensions. It enables joint feature processing, dynamic visualization, and result export, making it suitable for multiple neuro-computing research fields such as brain-computer interfaces, neural rehabilitation, and EEG-based emotion recognition. EEG-TFX significantly enhances the efficiency and reliability of EEG signal analysis by providing standardized workflows, offering a convenient and integrated tool for related research. The toolbox is open-source and available at: https://github.com/Xin-qy/EEG-TFX.
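EEG-TFX itself is a MATLAB toolbox; the short Python sketch below only illustrates the windowing + filter-bank + feature workflow the abstract describes. The sampling rate, band edges, window length, and log-variance feature are our assumptions.

```python
# Filter-bank + sliding-window feature extraction for a single EEG channel.
import numpy as np
from scipy.signal import butter, filtfilt

fs = 250                                          # sampling rate (Hz), assumed
bands = [(4, 8), (8, 13), (13, 30)]               # theta, alpha, beta
win = fs * 2                                      # 2-second windows

rng = np.random.default_rng(0)
eeg = rng.normal(size=fs * 60)                    # one channel, 60 s of toy data

features = []
for lo, hi in bands:
    b, a = butter(4, [lo, hi], btype="bandpass", fs=fs)
    filtered = filtfilt(b, a, eeg)
    # One feature per (band, window): log band power via the window variance.
    windows = filtered[: len(filtered) // win * win].reshape(-1, win)
    features.append(np.log(windows.var(axis=1)))

X = np.stack(features, axis=1)                    # (n_windows, n_bands)
print(X.shape)                                    # (30, 3): ready for a classifier
```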
{"title":"EEG-TFX: An interactive MATLAB toolbox for EEG feature engineering via multi-scale temporal windowing and filter banks","authors":"Qingyue Xin , Rui Zhang , Yaoqi Hu , Weidong Zhou , Lan Tian , Guoyang Liu","doi":"10.1016/j.neucom.2026.132650","DOIUrl":"10.1016/j.neucom.2026.132650","url":null,"abstract":"<div><div>This paper presents an intelligent Electroencephalogram (EEG) analysis platform (EEG-TFX) that provides a comprehensive solution for EEG signal processing. The platform adopts a modular design, integrating data preprocessing, time-frequency segmentation, feature extraction, model training, and visualization functions. It supports flexible configuration of time windows, frequency band segmentation, and multiple filter order selections. The platform offers various feature extraction methods, including entropy-based features and other popular EEG features, together with multiple classification algorithms, such as artificial neural networks, support vector machines, and decision trees. In addition, the system supports user-defined function extensions. It enables joint feature processing, dynamic visualization, and result export, making it suitable for multiple neuro-computing research fields such as brain-computer interfaces, neural rehabilitation, and EEG-based emotion recognition. EEG-TFX significantly enhances the efficiency and reliability of EEG signal analysis by providing standardized workflows, offering a convenient and integrated tool for related research. The toolbox is open-source and available at: <span><span>https://github.com/Xin-qy/EEG-TFX</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"671 ","pages":"Article 132650"},"PeriodicalIF":6.5,"publicationDate":"2026-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145981276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-scale BiTemporal fusion for dynamic facial expression recognition in the wild
Pub Date: 2026-01-08 | DOI: 10.1016/j.neucom.2026.132658
Zixiang Fei, Hao Liu, Wenju Zhou, Minrui Fei
Dynamic Facial Expression Recognition (DFER) in real-world scenarios remains a difficult problem for emotion analysis, as it requires capturing subtle temporal variations while resisting the interference of spatial noise. To address this issue, we design a new framework, Multi-Scale BiTemporal Fusion Network (MSBTFN), which progressively refines spatiotemporal representations through dual-path processing combined with adaptive attention. The proposed architecture integrates multi-scale convolutions to capture both local and global motion characteristics from paired temporal segments. In addition, a dual-pooling mechanism fuses channel statistics with spatial operations to highlight discriminative and transient emotional cues. Coordinated 1D and 2D attention layers are then applied to construct adaptive channel weights and spatial response maps, effectively suppressing noise and enhancing feature quality. Furthermore, the Temporal Transformer Module (TTM) models temporal dependencies on the refined spatial features through an encoder built upon dual window and shifted-window attention blocks. Comprehensive experiments and ablation studies on three large-scale in-the-wild datasets—AFEW, FERV39K, and DFEW—demonstrate the robustness and accuracy of the proposed method.
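As an illustration of the dual-pooling idea, the sketch below implements a channel attention block that fuses global-average and global-max statistics through a shared MLP; this shared-MLP form follows common practice and is our assumption, not the paper's exact design.

```python
# Dual-pooling channel attention: average stats capture sustained cues,
# max stats capture transient ones; their fusion produces channel weights.
import torch
import torch.nn as nn

class DualPoolChannelAttention(nn.Module):
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.mlp = nn.Sequential(                 # shared MLP over pooled stats
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))

    def forward(self, x):                         # x: (B, C, H, W)
        avg = self.mlp(x.mean(dim=(2, 3)))        # global average statistics
        mx = self.mlp(x.amax(dim=(2, 3)))         # transient / peak statistics
        weights = torch.sigmoid(avg + mx)         # adaptive channel weights
        return x * weights[:, :, None, None]

block = DualPoolChannelAttention(channels=64)
feat = torch.randn(2, 64, 28, 28)
print(block(feat).shape)                          # torch.Size([2, 64, 28, 28])
```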
{"title":"Multi-scale BiTemporal fusion for dynamic facial expression recognition in the wild","authors":"Zixiang Fei , Hao Liu , Wenju Zhou , Minrui Fei","doi":"10.1016/j.neucom.2026.132658","DOIUrl":"10.1016/j.neucom.2026.132658","url":null,"abstract":"<div><div>Dynamic Facial Expression Recognition (DFER) in real-world scenarios remains a difficult problem for emotion analysis, as it requires capturing subtle temporal variations while resisting the interference of spatial noise. To address this issue, we design a new framework, Multi-Scale BiTemporal Fusion Network (MSBTFN), which progressively refines spatiotemporal representations through dual-path processing combined with adaptive attention. The proposed architecture integrates multi-scale convolutions to capture both local and global motion characteristics from paired temporal segments. In addition, a dual-pooling mechanism fuses channel statistics with spatial operations to highlight discriminative and transient emotional cues. Coordinated 1D and 2D attention layers are then applied to construct adaptive channel weights and spatial response maps, effectively suppressing noise and enhancing feature quality. Furthermore, the Temporal Transformer Module (TTM) models temporal dependencies on the refined spatial features through an encoder built upon dual window and shifted-window attention blocks. Comprehensive experiments and ablation studies on three large-scale in-the-wild datasets—AFEW, FERV39K, and DFEW—demonstrate the robustness and accuracy of the proposed method.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"671 ","pages":"Article 132658"},"PeriodicalIF":6.5,"publicationDate":"2026-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145980553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Differential evolutionary architecture search with dynamic similarity-aware weight sharing for optimization of GANs
Pub Date: 2026-01-08 | DOI: 10.1016/j.neucom.2026.132619
Atifa Rafique, Xue Yu, Musaed Alhussein, Kashif Iqbal, Mohammad Kamrul Hasan, Khursheed Aurangzeb
Generative adversarial networks (GANs) have demonstrated remarkable success in image synthesis but often suffer from mode collapse, training instability, and computational inefficiency. To overcome these challenges, we propose a dynamic evolutionary architecture search approach for GANs, which employs differential evolution (DE) to optimize the generator architecture through genetic operations such as crossover and mutation. Furthermore, it incorporates dynamic layer-wise weight sharing (DLWS) with an adaptive similarity threshold (AST) to enhance parameter efficiency and training stability. Unlike traditional weight-sharing techniques, our dynamic mechanism adjusts based on structural similarities between layers, improving both specialization and stability. Additionally, we integrate fair single-path sampling and an operation discard strategy to ensure smoother training and faster convergence. In extensive experiments on the CIFAR-10, STL-10, CIFAR-100, and CelebA datasets, our proposed method achieves an inception score (IS) of 8.99 ± 0.06 on CIFAR-10, a Fréchet inception distance (FID) of 21.75 on STL-10, and an IS of 8.89 on CIFAR-100. These results demonstrate the superior performance of our method while significantly reducing GPU hours compared to existing approaches. This paper provides a scalable, stable, and efficient solution for optimizing GAN architectures, opening new possibilities for advanced generative modeling tasks.
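A toy illustration of differential evolution over generator encodings: real-valued genomes undergo DE mutation and binomial crossover and are decoded into discrete layer operations. The fitness function is a placeholder for the costly IS/FID evaluation, and the operation vocabulary is hypothetical.

```python
# DE/rand/1/bin over continuous architecture encodings (toy fitness).
import numpy as np

OPS = ["conv3x3", "conv5x5", "sep_conv", "skip"]
rng = np.random.default_rng(0)

def decode(genome):
    """Map each gene in [0, 1) to one of the candidate layer operations."""
    idx = (np.clip(genome, 0, 0.999) * len(OPS)).astype(int)
    return [OPS[i] for i in idx]

def fitness(genome):                       # placeholder for IS/FID evaluation
    return -np.sum((genome - 0.6) ** 2)

pop = rng.uniform(size=(20, 6))            # 20 candidates, 6 layers each
F, CR = 0.5, 0.9                           # DE mutation and crossover rates
for _ in range(100):
    for i in range(len(pop)):
        a, b, c = pop[rng.choice(len(pop), 3, replace=False)]
        mutant = np.clip(a + F * (b - c), 0, 1)          # mutation
        cross = rng.uniform(size=6) < CR                 # binomial crossover
        trial = np.where(cross, mutant, pop[i])
        if fitness(trial) >= fitness(pop[i]):            # greedy selection
            pop[i] = trial

best = pop[np.argmax([fitness(g) for g in pop])]
print(decode(best))                        # decoded generator layer choices
```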
{"title":"Differential evolutionary architecture search with dynamic similarity-aware weight sharing for optimization of GANs","authors":"Atifa Rafique , Xue Yu , Musaed Alhussein , Kashif Iqbal , Mohammad Kamrul Hasan , Khursheed Aurangzeb","doi":"10.1016/j.neucom.2026.132619","DOIUrl":"10.1016/j.neucom.2026.132619","url":null,"abstract":"<div><div>Generative adversarial networks (GANs) have demonstrated remarkable success in image synthesis but often suffer from mode collapse, training instability, and computational inefficiency. To overcome these challenges, we propose a dynamic evolutionary architecture search approach for GANs, which employs differential evolution (DE) to optimize the generator architecture through genetic operations such as crossover and mutation. Furthermore, it incorporates dynamic layer-wise weight sharing (DLWS) with an adaptive similarity threshold (AST) to enhance parameter efficiency and training stability. Unlike traditional weight-sharing techniques, our dynamic mechanism adjusts based on structural similarities between layers, improving both specialization and stability. Additionally, we integrate fair single-path sampling and an operation discard strategy to ensure smoother training and faster convergence. Based on extensive experiments on CIFAR-10, STL-10, CIFAR-100, and CelebA datasets, our proposed method achieves an inception score (IS) of 8.99 <span><math><mo>±</mo></math></span> 0.06 on CIFAR-10, fréchet inception distance (FID) of 21.75 on STL-10 and an IS of 8.89 on the CIFAR-100 dataset. These results demonstrate the superior performance of our method, while significantly reducing GPU hours compared to existing approaches. This paper provides a scalable, stable, and efficient solution for optimizing GAN architectures, opening new possibilities for advanced generative modeling tasks.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"671 ","pages":"Article 132619"},"PeriodicalIF":6.5,"publicationDate":"2026-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145980683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A cost-effective dynamic physical watermarking strategy for replay attack detection in cyber-physical systems
Pub Date: 2026-01-08 | DOI: 10.1016/j.neucom.2026.132655
Qilin Zhu, Yulong Ding, Shuang-Hua Yang
Cyber-Physical Systems (CPS) are intelligent control systems that integrate computation, physical processes, and network communication. They are widely applied in critical infrastructures such as industrial manufacturing, power grids, and water treatment plants. In recent years, CPS have increasingly become targets of malicious attacks, posing severe threats to both society and the environment. To safeguard CPS security, anomaly detection has emerged as a key approach to ensuring normal system operation. However, existing anomaly detection methods for CPS often suffer from limitations. Many approaches focus solely on cyber-level anomalies, without accounting for the unique integration of cyber and physical components in CPS, and they lack authentication mechanisms for physical components. Physical watermarking, which leverages the inherent control logic of the system, can detect replay attacks that evade traditional detection means. Nevertheless, conventional persistent watermarking strategies lead to substantial increases in control cost. This study proposes a dual-threshold watermarking method that, without assuming a specific replay attack model, aims to add watermarks exclusively during attacks, thereby effectively reducing control costs. In addition, a novel anomaly detection statistic is introduced to address a limitation of existing intermittent watermarking schemes, where insufficient watermark addition during the attacker’s data recording phase weakens residual anomalies. Furthermore, this study employs reinforcement learning techniques to dynamically regulate both the timing and strength of watermark addition. Simulation experiments on the widely used linearized quadruple-tank system demonstrate the effectiveness of the proposed methods in accurately identifying replay attacks while minimizing control costs.
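The mechanics of watermark-based replay detection can be shown on a single-state toy plant: under replay, recorded sensor data no longer responds to the currently injected watermark, so their correlation collapses. This sketch uses a persistent watermark and a simple correlation detector, not the paper's dual-threshold, reinforcement-learning scheme; all parameters are illustrative.

```python
# Single-state replay-detection demo with a persistent physical watermark.
import numpy as np

rng = np.random.default_rng(1)
a, q_w, q_v = 0.9, 0.5, 0.05        # plant pole, watermark and noise variances
T, replay_from = 400, 200

x, y_log, wm_log = 0.0, [], []
for k in range(T):
    wm = rng.normal(0, np.sqrt(q_w))            # physical watermark on the input
    u = -a * x + wm                             # stabilizing control + watermark
    x = a * x + u + rng.normal(0, 0.1)          # plant update with process noise
    y_log.append(x + rng.normal(0, np.sqrt(q_v)))
    wm_log.append(wm)

y = np.array(y_log)
y[replay_from:] = y[:T - replay_from]           # attacker replays old sensor data
wm = np.array(wm_log)

# Detector: plant output should correlate with the injected watermark. Under
# replay the recorded outputs ignore the current watermark, so the windowed
# correlation collapses toward zero.
win = 50
for start in (100, 300):                        # one healthy, one replayed window
    seg = slice(start, start + win)
    corr = np.corrcoef(wm[seg], y[seg])[0, 1]
    print(start, "replay" if abs(corr) < 0.3 else "normal", round(corr, 2))
```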
{"title":"A cost-effective dynamic physical watermarking strategy for replay attack detection in cyber-physical systems","authors":"Qilin Zhu , Yulong Ding , Shuang-Hua Yang","doi":"10.1016/j.neucom.2026.132655","DOIUrl":"10.1016/j.neucom.2026.132655","url":null,"abstract":"<div><div>Cyber-Physical Systems (CPS) are intelligent control systems that integrate computation, physical processes, and network communication. They are widely applied in various critical infrastructures such as industrial manufacturing, power grids, and water treatment plants. In recent years, CPS have increasingly become targets of malicious attacks, posing severe threats to both society and the environment. To safeguard the security of CPS, anomaly detection has emerged as a key approach to ensuring normal system operation. However, existing anomaly detection methods for CPS often suffer from limitations. Many approaches focus solely on cyber-level anomalies, without accounting for the unique integration of cyber and physical components in CPS, and they lack authentication mechanisms for physical components. Physical watermarking, which leverages the inherent control logic of the system, is capable of detecting replay attacks that are difficult to detect through traditional means. Nevertheless, conventional persistent watermarking strategies lead to substantial increases in control costs. This study proposes a dual-threshold watermarking method that, without assuming specific replay attack models, aims to add watermarks exclusively during attacks, thereby effectively reducing control costs. In addition, a novel anomaly detection statistic is introduced to address the limitations of existing intermittent watermarking schemes, where insufficient watermark addition during the attacker’s data recording phase leads to weakened residual anomalies. Furthermore, this study employs reinforcement learning techniques to dynamically regulate both the timing and strength of watermark addition. Simulation experiments on widely used linearized quadruple-tank system in CPS demonstrate the effectiveness of the proposed methods in accurately identifying replay attacks while minimizing control costs.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"671 ","pages":"Article 132655"},"PeriodicalIF":6.5,"publicationDate":"2026-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145980486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
scMFE: A multi-view fusion enhanced graph contrastive learning method for scRNA-seq data clustering
Pub Date: 2026-01-08 | DOI: 10.1016/j.neucom.2026.132662
Yun Bai, Zhenqiu Shu, Kaiwen Tan, Yongbing Zhang, Zhengtao Yu
Graph contrastive learning has become an advanced paradigm for clustering scRNA-seq data by effectively mitigating the issue of false negative sample pairs in traditional contrastive learning. However, applying graph contrastive learning to scRNA-seq data clustering still faces two issues: first, scRNA-seq data is high-dimensional, noisy, and sparse, making it difficult for current graph construction methods to capture the inherent complex relationships between cells; second, existing methods rely on data perturbation to generate contrastive views, inevitably introducing additional noise and limiting model performance. To address these challenges, we propose scMFE, a multi-view fusion enhanced graph contrastive learning method for scRNA-seq data clustering. scMFE constructs cell graphs by fusing multi-view features, thereby capturing linear, nonlinear, and biological relationships between cells. scMFE generates contrastive views by applying topological structure constraints to the fused graph, thereby avoiding the introduction of additional noise. Building upon this, scMFE utilizes graph attention autoencoders to learn cell representations, which are then optimized through reconstruction loss and dual contrastive learning at both the cell and cluster levels. Comparative experiments on 14 real datasets against 14 baseline methods demonstrate scMFE’s superior performance in scRNA-seq data clustering. Further experiments, including analysis of cell graph quality, imbalanced cluster identification, ablation study, hyperparameter analysis, runtime analysis, cell type annotation, batch effect removal, analysis of multi-omics data, sensitivity analysis of the number of clusters, sensitivity analysis of the number of neighbors K in the KNN graph, and generalization to unseen cell types, validate the method’s effectiveness and biological rationality. The source code for scMFE is available at https://github.com/Thirty-Six-Stratagems/scMFE.
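A sketch of the multi-view graph-fusion step as we read it: build one KNN graph per feature view of the cells, average the adjacencies, and derive a perturbation-free contrastive view by topological thresholding (keeping only edges on which both views agree). The views, the neighbor count K, and the threshold are illustrative assumptions.

```python
# Multi-view KNN graph fusion with a threshold-based contrastive view.
import numpy as np

def knn_adjacency(features, k):
    """Symmetric binary KNN adjacency from a cells-by-features matrix."""
    d = np.linalg.norm(features[:, None] - features[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    adj = np.zeros_like(d)
    rows = np.arange(len(d))[:, None]
    adj[rows, np.argsort(d, axis=1)[:, :k]] = 1.0
    return np.maximum(adj, adj.T)                 # symmetrize

rng = np.random.default_rng(0)
expr = rng.poisson(1.0, size=(100, 50)).astype(float)   # toy cells x genes
view_linear = expr @ rng.normal(size=(50, 10))          # linear projection view
view_nonlin = np.tanh(view_linear)                      # nonlinear view

fused = 0.5 * (knn_adjacency(view_linear, k=10) +
               knn_adjacency(view_nonlin, k=10))
contrast_view = (fused >= 1.0).astype(float)      # keep edges both views agree on
print(fused.sum(), contrast_view.sum())
```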
{"title":"scMFE: A multi-view fusion enhanced graph contrastive learning method for scRNA-seq data clustering","authors":"Yun Bai, Zhenqiu Shu, Kaiwen Tan, Yongbing Zhang, Zhengtao Yu","doi":"10.1016/j.neucom.2026.132662","DOIUrl":"10.1016/j.neucom.2026.132662","url":null,"abstract":"<div><div>Graph contrastive learning has become an advanced paradigm for clustering scRNA-seq data by effectively mitigating the issue of false negative sample pairs in traditional contrastive learning. However, applying graph contrastive learning to scRNA-seq data clustering still faces two issues: first, scRNA-seq data is high-dimensional, noisy, and sparse, making it difficult for current graph construction methods to capture the inherent complex relationships between cells; second, existing methods rely on data perturbation to generate contrastive views, inevitably introducing additional noise and limiting model performance. To address these challenges, we propose scMFE, a multi-view fusion enhanced graph contrastive learning method for scRNA-seq data clustering. scMFE constructs cell graphs by fusing multi-view features, thereby capturing linear, nonlinear, and biological relationships between cells. scMFE generates contrastive views by applying topological structure constraints to the fused graph, thereby avoiding the introduction of additional noise. Building upon this, scMFE utilizes graph attention autoencoders to learn cell representations, which are then optimized through reconstruction loss and dual contrastive learning at both the cell and cluster levels. Comparative experiments on 14 real datasets against 14 baseline methods demonstrate scMFE’s superior performance in scRNA-seq data clustering. Further experiments, including analysis of cell graph quality, imbalanced cluster identification, ablation study, hyperparameter analysis, runtime analysis, cell type annotation, batch effect removal, analysis of multi-omics data, sensitivity analysis of the number of clusters, sensitivity analysis of the number of neighbors <span><math><mi>K</mi></math></span> in the KNN graph, and generalization to unseen cell types, validate the method’s effectiveness and biological rationality. The source code for scMFE is available at <span><span>https://github.com/Thirty-Six-Stratagems/scMFE</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"671 ","pages":"Article 132662"},"PeriodicalIF":6.5,"publicationDate":"2026-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145980738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}