Pub Date: 2026-02-03 | DOI: 10.1016/j.neucom.2026.132941
Su-Nan Yang, Liang-Dong Guo, Sheng-Juan Huang
This paper addresses the observer-based proportional-integral-derivative (PID) quasi-consensus control problem for stochastic multi-agent systems (MASs) subject to multimodal false data injection (MFDI) attacks. Agent communication is constrained by uniform quantization and a random access (RA) protocol, reflecting typical network limitations. A distributed observer-based PID control scheme is proposed to achieve bounded-error consensus. The concept of quasi-consensus in probability is adopted to assess system performance under stochastic disturbances, quantization effects, and adversarial intrusions. By constructing an augmented system model and employing switched Lyapunov techniques together with stochastic analysis, sufficient conditions for quasi-consensus are established in the form of linear matrix inequalities (LMIs), which yield explicit solutions for observer and PID gain matrices. Numerical simulations verify the effectiveness of the proposed control strategy in preserving quasi-consensus under both RA-induced scheduling uncertainties and coordinated MFDI attacks.
Title: Observer-based PID quasi-consensus control of stochastic multi-agent systems under multimodal false data injection attacks (Neurocomputing, vol. 675, Article 132941)
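The communication model above constrains exchanged signals by uniform quantization. As a minimal sketch of that ingredient alone (the mid-tread rounding rule and the step size `delta` are illustrative assumptions, not the paper's exact scheme), a uniform quantizer bounds the per-component quantization error by half the step size:

```python
import numpy as np

def uniform_quantize(x, delta=0.1):
    """Mid-tread uniform quantizer: rounds each component of x to the
    nearest multiple of the step size delta, so the per-component
    quantization error is at most delta / 2."""
    return delta * np.round(np.asarray(x, dtype=float) / delta)

state = np.array([0.97, -0.33, 0.5])   # a hypothetical agent state
q = uniform_quantize(state, delta=0.1)
err = np.abs(state - q)                 # bounded by delta / 2 = 0.05
```

A bounded quantization error is what makes quasi-consensus (consensus up to a bounded residual error) the natural performance notion here, since the error injected by the quantizer never vanishes but never exceeds a known constant.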
Pub Date: 2026-02-03 | DOI: 10.1016/j.neucom.2026.132940
Yu Peng, Haoyu Zou, Kehu Yang, Qingqun Kong
Semi-supervised learning (SSL) has achieved remarkable success in medical image segmentation, with co-training paradigms standing out among existing approaches. However, most current methods use homogeneous network architectures, where identical inductive biases can cause confirmation bias, limiting performance. Additionally, these methods treat all pixels equally, failing to fully exploit the hidden information in complex regions. To overcome these issues, we propose an Asymmetric Deformation-guided Mutual Learning (ADML) framework. ADML builds an asymmetric dual-branch system consisting of a standard convolutional network (V-Net) and a deformable convolutional network (VNet-DCN), adding diverse inductive biases to provide heterogeneous supervisory signals. The core of our framework is asymmetric deformation-guided consistency learning (ADCL), which leverages the norm of the DCN offset field to measure local deformation complexity. This allows for the creation of spatial weight maps that adaptively modify pseudo-label weights, helping reduce confirmation bias and improve the reliability of pseudo-labels. Additionally, to enable knowledge transfer between the asymmetric models, we introduce a Cross-model Dynamic Feature Bank that stores high-confidence features and enforces alignment through a maximum mean discrepancy (MMD) loss, achieving deep semantic coherence between the two branches. Extensive experiments on three benchmarks show that ADML surpasses state-of-the-art methods, confirming its effectiveness in lowering annotation needs and enhancing segmentation accuracy.
Title: ADML: Asymmetric deformation-guided mutual learning for semi-supervised medical image segmentation (Neurocomputing, vol. 675, Article 132940)
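ADML aligns its two branches with a maximum mean discrepancy (MMD) loss. A hedged, self-contained sketch of a biased RBF-kernel MMD estimator (the kernel bandwidth and batch shapes are illustrative; the paper's feature-bank machinery is not reproduced):

```python
import numpy as np

def rbf_kernel(a, b, sigma=1.0):
    # Pairwise RBF kernel values between the rows of a and b.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd2(x, y, sigma=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples
    x and y; it is the squared RKHS distance between the empirical
    mean embeddings, so it is always non-negative."""
    return (rbf_kernel(x, x, sigma).mean()
            + rbf_kernel(y, y, sigma).mean()
            - 2.0 * rbf_kernel(x, y, sigma).mean())

rng = np.random.default_rng(0)
same = mmd2(rng.normal(size=(64, 8)), rng.normal(size=(64, 8)))
shifted = mmd2(rng.normal(size=(64, 8)), rng.normal(3.0, 1.0, size=(64, 8)))
```

Minimizing such a term between the features of the two branches is one standard way to enforce the "deep semantic coherence" the abstract describes: matched feature distributions drive the estimate toward zero, mismatched ones inflate it.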
Pub Date: 2026-02-02 | DOI: 10.1016/j.neucom.2026.132917
Yugen Yi, Yu Duan, Weixia Xu, Wei Deng, Hong Li, Longjun Huang, Siwei Luo, Yali Peng, Jiangyan Dai
Accurate tooth segmentation from oral panoramic images is crucial for addressing clinical oral health challenges. However, the complexity of oral structures and substantial variations in clinical data present significant difficulties, limiting the ability of existing deep learning models to capture fine-grained structural details. To address these limitations, we propose a Twin-Branch Multi-Scale Channel Gated Network (TBMCGNet), which introduces several targeted improvements over conventional architectures. Specifically, we design a Twin-Branch Complementary (TBC) module that integrates multi-scale convolutional layers with a Transformer structure. This hybrid architecture enables the model to effectively capture global contextual information while preserving the precise localization of dental boundaries in local regions. Furthermore, an adaptive feature fusion strategy is employed to optimally exploit cross-scale feature dependencies. Next, we introduce a Multi-Scale Channel Gated (MSCG) module to aggregate multi-level features from different encoder stages across multiple scales. This multi-scale fusion mechanism not only enhances the model’s capability to accurately delineate tooth boundaries but also effectively reduces semantic discrepancies between encoder stages. Consequently, the model achieves improved discrimination of subtle distinctions between edges and inter-tooth boundaries within complex oral configurations. Finally, we construct a novel oral panoramic image dataset to evaluate the effectiveness of tooth segmentation for both binary and multi-class tasks. Comprehensive experiments on public and proprietary datasets demonstrate that TBMCGNet outperforms current state-of-the-art approaches, achieving superior segmentation accuracy and robustness. The source code and datasets will be publicly available at: https://github.com/t-sukii/TBMCGNet.
Title: TBMCGNet and TSDataset: A twin-branch multi-scale channel-gated network with a new benchmark for tooth segmentation (Neurocomputing, vol. 674, Article 132917)
Pub Date: 2026-02-02 | DOI: 10.1016/j.neucom.2026.132925
Gonzalo Nápoles, Isel Grau, Yamisleydi Salgueiro
Sparseness Optimized Feature Importance (SOFI) is a post-hoc method that produces explanations using minimal feature sets, reducing the cognitive burden on human experts by highlighting only the most critical factors. In practice, explanations take the form of a ranking of features whose cumulative marginalization leads to rapid degradation in model performance. However, SOFI employs hill climbing for ranking optimization, which increases the risk of convergence to local optima as the number of features grows. In addition, like other mainstream explainers, SOFI lacks a mechanism for exploiting prior knowledge during optimization. In this paper, we propose Sparseness Optimized Feature Importance with Prior Knowledge (SOFI-P), an extension of SOFI that integrates prior knowledge into a reinforcement learning framework to optimize explanation sparsity. In this explainer, exploration is guided by a probabilistic swapping strategy that maximizes model performance degradation under cumulative feature marginalization. Prior knowledge is incorporated as a learnable parameter vector, initially defined by domain experts and later updated during optimization. In addition, we derive upper bounds on the change in explanation sparsity induced by adjacent and arbitrary swaps in a feature ranking. The proposed theorems provide practical value by establishing concrete limits on expected explanation sparsity after swapping, thereby characterizing the problem's search-space complexity. Empirical evaluation on 40 structured classification datasets shows that SOFI-P produces sparser explanations than state-of-the-art explainers. Furthermore, ablation studies confirm the benefits of incorporating prior knowledge to guide reinforcement learning, even when such knowledge is imprecise. Finally, a case study on chest X-ray images illustrates the practical applicability of the method.
Title: Sparseness-optimized feature importance with prior knowledge and reinforcement learning-powered optimization (Neurocomputing, vol. 674, Article 132925)
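SOFI-style rankings are scored by how quickly cumulative feature marginalization degrades performance. The toy sketch below (synthetic data, a fixed linear scorer standing in for a trained model, and mean imputation as the marginalization are all illustrative assumptions) shows only that scoring idea, not the paper's reinforcement-learning optimizer or prior-knowledge mechanism:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 500, 4
X = rng.normal(size=(n, d))
w = np.array([3.0, 1.0, 0.5, 0.0])      # made-up feature relevances
y = (X @ w > 0).astype(int)             # labels from the linear scorer

def accuracy(Xm):
    return ((Xm @ w > 0).astype(int) == y).mean()

def degradation(ranking):
    """Accuracy after cumulatively marginalizing (mean-imputing) the
    features in the given order. A sparse explanation front-loads the
    features whose removal hurts the most."""
    Xm = X.copy()
    drops = []
    for f in ranking:
        Xm[:, f] = X[:, f].mean()
        drops.append(accuracy(Xm))
    return drops

good = degradation([0, 1, 2, 3])        # most relevant feature first
bad = degradation([3, 2, 1, 0])         # least relevant feature first
```

Marginalizing the heaviest feature first (`good`) collapses accuracy immediately, while starting from an irrelevant feature (`bad`) leaves the model untouched; a ranking optimizer rewards the former profile.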
Pub Date: 2026-02-02 | DOI: 10.1016/j.neucom.2026.132913
Tianchi Yu, Ivan Oseledets
Physics-informed neural networks have emerged as a powerful tool for solving partial differential equations (PDEs) by integrating physical constraints into neural network training, but their performance is sensitive to the sampling of points. Motivated by the impressive performance of quasi-Monte Carlo methods in high-dimensional problems, this paper proposes Quasi-Random Physics-Informed Neural Networks (QRPINNs), which sample training points from low-discrepancy sequences instead of the domain of PDEs. Theoretically, QRPINNs are shown to exhibit a faster convergence rate than PINNs. Empirically, experiments demonstrate that QRPINNs outperform PINNs and some representative adaptive sampling methods in high-dimensional PDEs. Furthermore, combining QRPINNs with adaptive sampling further enhances both accuracy and efficiency.
Title: Quasi-random physics-informed neural networks (Neurocomputing, vol. 674, Article 132913)
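QRPINNs replace i.i.d. uniform collocation points with draws from a low-discrepancy sequence. As one concrete instance (a Halton sequence here; the paper may use a different sequence), the points cover the domain far more evenly than random sampling would:

```python
import numpy as np

def van_der_corput(n, base):
    """First n terms of the van der Corput low-discrepancy sequence:
    index i+1 written in the given base with its digits mirrored
    around the radix point."""
    seq = np.zeros(n)
    for i in range(n):
        f, x, k = 1.0, 0.0, i + 1
        while k > 0:
            f /= base
            x += f * (k % base)
            k //= base
        seq[i] = x
    return seq

def halton(n, dim, bases=(2, 3, 5, 7, 11)):
    """First n Halton points in [0, 1)^dim (one prime base per axis)."""
    return np.stack([van_der_corput(n, bases[j]) for j in range(dim)], axis=1)

# 256 collocation points on the unit square for a PINN residual loss.
pts = halton(256, 2)

# Even coverage: every one of 16 equal slabs along the base-2 axis
# receives exactly 16 of the 256 points; i.i.d. uniform sampling
# gives no such guarantee.
counts, _ = np.histogram(pts[:, 0], bins=16, range=(0.0, 1.0))
```

This evenness (low discrepancy) is the property behind the faster convergence rate the paper claims: the PDE residual is probed without the clusters and gaps that plain Monte Carlo sampling produces.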
Pub Date: 2026-02-02 | DOI: 10.1016/j.neucom.2026.132921
Kai Peng, Qing Li, Zhijian He, Bowen Zhang, Xianghua Fu, Bin Li, Xiaohui Wang, Zhi-Qi Cheng, Yan Yan, Xiaojiang Peng
Robotic manipulation enables robots to interact with and adapt to their environments, making it crucial for real-world intelligent applications. Recent advancements in Large Language Models (LLMs) have positioned them as transformative tools in robotic manipulation. By integrating vision, language, and action, LLM-based frameworks enhance reasoning, planning, and multimodal understanding, equipping robots to handle increasingly complex tasks. The Vision-Language-Action (VLA) paradigm, in particular, unifies perception, cognition, and execution, paving the way for generalized and versatile robotic systems. This paper provides a comprehensive survey of robotic manipulation, covering traditional bottom-up approaches and modern LLM-based methods. We emphasize recent LLM-based modular and end-to-end architectures, analyze benchmarks, datasets, and robotic hardware platforms. Additionally, we explore potential research directions to advance the field further. By synthesizing these developments, we aim to provide researchers and practitioners with a valuable resource to navigate this rapidly evolving domain and unlock the full potential of LLMs in robotic manipulation.
Title: A survey of robotic manipulation: From bottom-up approaches to end-to-end paradigms with LLMs (Neurocomputing, vol. 674, Article 132921)
Pub Date: 2026-02-02 | DOI: 10.1016/j.neucom.2026.132923
Xuanmian Liu, Xianping Qin, Shu Li, Fuchang Zhang, Rachid Hedjam, Guoqiang Zhong
Recently, differentiable architecture search has become one of the hotspots in the field of neural architecture search (NAS). However, this paradigm suffers from a critical inconsistency problem: the architecture optimized in the continuous space often collapses significantly after discretization. This discrepancy not only renders the computational resources spent on searching futile but also leads to derived architectures that fail to generalize to complex tasks, severely limiting the practical deployability of differentiable NAS. To address this problem and alleviate the well-known performance collapse in existing differentiable search approaches, we propose a new attention-guided differentiable NAS method, called progressively attentional architecture search (PAAS). In the implementation of PAAS, simultaneously considering performance, parameter quantity, and operation independence, we design a novel search space to improve the upper limit of the structural performance from the source of the NAS process. Moreover, we propose a new attention-guided architecture search paradigm, embedding attention modules to help distinguish the significant parts of the learned architectures, which effectively mitigates the optimization collapse at a granular level and the uncertainty of the architecture selection process caused by using only architecture parameters. In addition, we propose a progressive discretization strategy to bridge the structural gap between the search and evaluation stages, which mitigates the performance gap between the super-network and discrete architectures. Extensive experiments demonstrate that PAAS achieves a 2.47% error rate on CIFAR-10 with only 0.4 GPU days, outperforming state-of-the-art methods such as DARTS (2.76%) and DrNAS (2.54%) in both accuracy and efficiency. When transferred to ImageNet, it attains a 24.2% top-1 error, surpassing robust baselines such as PC-DARTS (25.1%) and ProxylessNAS (24.9%), thereby validating its strong cross-dataset generalization.
Title: Progressively attentional architecture search (Neurocomputing, vol. 674, Article 132923)
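The inconsistency PAAS targets can be seen in miniature: DARTS-style search evaluates a softmax-weighted mixture of candidate operations, but keeps only the argmax operation afterward. A toy numerical sketch (three made-up operations on a scalar, not the paper's search space):

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

# One edge of a cell with three candidate operations.
ops = [lambda x: 0.0 * x, lambda x: x, lambda x: 2.0 * x]  # zero, identity, double
alpha = np.array([0.1, 0.2, 0.3])                          # architecture parameters

x = 1.0
weights = softmax(alpha)
mixed = sum(w * op(x) for w, op in zip(weights, ops))  # continuous relaxation
discrete = ops[int(np.argmax(alpha))](x)               # post-search discretization
```

Because the softmax weights are close to uniform, the relaxed output (about 1.07) sits far from the discretized output (2.0); a progressive discretization strategy, as in PAAS, narrows this gap gradually during search instead of jumping to the argmax at the end.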
Pub Date: 2026-02-02 | DOI: 10.1016/j.neucom.2026.132922
Yexin Zhang, Zhongtian Ma, Qiaosheng Zhang, Zhen Wang, Xuelong LI
This paper studies community detection in correlated multi-view graphs from an information-theoretic perspective. We consider multi-view graphs observed from D views on a common node set, where edge variables across views may be statistically dependent. To capture inter-graph correlations, we propose a random graph model called the multi-view stochastic block model (MVSBM), which generates D graphs over n nodes partitioned into two equal-sized communities. For each pair of nodes (i, j), the presence or absence of edges across the D graphs depends on whether i and j belong to the same community. Our goal is to exactly recover the hidden communities from the observed graphs. Our contributions are three-fold. First, we establish an information-theoretic achievability result (Theorem 1), showing that exact recovery is possible when the MVSBM parameters exceed a critical threshold. Second, we derive a matching converse (Theorem 2), proving that below this threshold any estimator has an expected number of misclassified nodes greater than one. Together, these results yield a sharp threshold for exact recovery. Third, we develop a computationally efficient spectral clustering algorithm with a local refinement step. Experiments on MVSBM-generated graphs demonstrate a phase transition that closely matches the theoretical threshold and show that the proposed method outperforms several baselines. Overall, our results delineate the fundamental limits of community detection in correlated multi-view graphs.
Title: Community detection in the multi-view stochastic block model (Neurocomputing, vol. 675, Article 132922)
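The algorithmic side of the paper is a spectral clustering step (plus a local refinement not shown here). A hedged sketch on a simplified setting, with D independent SBM views aggregated by summation (the MVSBM itself allows cross-view dependence), recovers the two communities from the sign pattern of the second eigenvector:

```python
import numpy as np

rng = np.random.default_rng(2)
n, D = 60, 3                      # n nodes, D views (illustrative sizes)
labels = np.array([0] * (n // 2) + [1] * (n // 2))
p, q = 0.8, 0.2                   # intra- / inter-community edge probabilities

# Sample D SBM views and aggregate their adjacency matrices.
A = np.zeros((n, n))
for _ in range(D):
    probs = np.where(labels[:, None] == labels[None, :], p, q)
    view = (rng.random((n, n)) < probs).astype(float)
    view = np.triu(view, 1)
    A += view + view.T            # symmetric, zero diagonal

# Spectral step: the leading eigenvector of A tracks node degrees,
# while the second carries the community split in its signs.
vals, vecs = np.linalg.eigh(A)    # eigenvalues in ascending order
guess = (vecs[:, -2] > 0).astype(int)
acc = max((guess == labels).mean(), ((1 - guess) == labels).mean())
```

With parameters this far above the recovery threshold, the sign split alone classifies essentially every node correctly, which is why the paper only needs a local refinement step to reach exact recovery at the threshold itself.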
Pub Date : 2026-02-02DOI: 10.1016/j.neucom.2026.132870
Rui Yang , Qindong Sun , Han Cao , Kai Lin , Chao Shen
Current state-of-the-art post-deployment countermeasures for adversarial example mitigation (known as adversarial purification and detection) exhibit significant limitations: (1) insufficient generalization performance on various adversarial examples, (2) serious negative effects on benign samples (referred to as the decreased accuracy), and (3) extensive inference-time consumption. These limitations considerably hinder their application in safety-critical real-world scenarios. To narrow these gaps, this paper proposes a novel post-deployment countermeasure named Random, Ensemble, and Simultaneous Purification-Detection Framework (RES-PDF). Specifically, inspired by the adversarial region migration phenomenon observed in adversarial purification, RES-PDF first extends this concept to a continuous adversarial region migration phenomenon and exploits it to establish a novel adversarial purification named Random Ensemble Adversarial Purification (REAP). Then, RES-PDF innovatively introduces a detection feature on REAP to further enhance its purification performance while simultaneously using a purification feature to further improve its detection performance. In RES-PDF, purification and detection complement each other, achieving the effect of 1 + 1 > 2. Extensive experiments across different scenarios demonstrate that RES-PDF surpasses previous countermeasures in several key areas: (1) remarkably enhanced generalization performance on various adversarial examples, with an average improvement of over 10.0%; (2) minimal negative effects on benign samples, with a reduction of under 1.0%; and (3) significantly reduced inference-time consumption, down to the millisecond level. In general, RES-PDF provides a novel and efficient post-deployment countermeasure for adversarial example mitigation in safety-critical real-world scenarios.
{"title":"RES-PDF: A random, ensemble, and simultaneous purification-detection framework for adversarial example mitigation","authors":"Rui Yang , Qindong Sun , Han Cao , Kai Lin , Chao Shen","doi":"10.1016/j.neucom.2026.132870","DOIUrl":"10.1016/j.neucom.2026.132870","url":null,"abstract":"<div><div>Current state-of-the-art post-deployment countermeasures for adversarial example mitigation (known as adversarial purification and detection) exhibit significant limitations: (1) insufficient generalization performance on various adversarial examples, (2) serious negative effects on benign samples (referred to as the decreased accuracy), and (3) extensive inference-time consumption, <em>etc</em>. These limitations considerably hinder their application in safety-critical real-world scenarios. To narrow these gaps, this paper proposes a novel post-deployment countermeasure named Random, Ensemble, and Simultaneous Purification-Detection Framework (RES-PDF). Specifically, inspired by the adversarial region migration phenomenon observed in adversarial purification, RES-PDF first extends this concept to a continuous adversarial region migration phenomenon and exploits it to establish a novel adversarial purification named Random Ensemble Adversarial Purification (REAP). Then, RES-PDF innovatively introduces a detection feature on REAP to enhance its purification performance further while simultaneously using a purification feature to improve its detection performance further. In RES-PDF, purification and detection can complement each other, achieving the effect of <span><math><mn>1</mn><mo>+</mo><mn>1</mn><mo>></mo><mn>2</mn></math></span>. 
Extensive experiments across different scenarios demonstrate that RES-PDF surpasses previous countermeasures in several key areas: (1) remarkably enhanced generalization performance on various adversarial examples, with an average improvement of <span><math><mo>></mo></math></span>10.0%; (2) minimal negative effects on benign samples, with a reduction of <span><math><mo><</mo></math></span>1.0%; and (3) significantly reduced inference-time consumption, reduced to the millisecond level, <em>etc</em>. In general, RES-PDF provides a novel and efficient post-deployment countermeasure for adversarial example mitigation in safety-critical real-world scenarios.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"674 ","pages":"Article 132870"},"PeriodicalIF":6.5,"publicationDate":"2026-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146192103","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
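The general flavor of a random-ensemble purification-detection loop can be illustrated with a toy sketch. This is *not* REAP — the paper's adversarial-region-migration mechanism is not reproduced here. The sketch only shows the generic idea of classifying several randomly perturbed copies of an input, taking the majority vote as the purified prediction, and flagging low ensemble consensus as a detection signal; `classify`, `purify_detect`, and all thresholds are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

def classify(x):
    """Stand-in classifier: a linear decision boundary at x[0] = 0."""
    return int(x[0] > 0.0)

def purify_detect(x, n_views=50, sigma=0.3):
    """Illustrative random-ensemble purification-detection:
    vote over randomly perturbed copies of x; low consensus
    (input near the decision boundary) is treated as suspicious."""
    votes = np.array([classify(x + rng.normal(0.0, sigma, size=x.shape))
                      for _ in range(n_views)])
    agreement = max(votes.mean(), 1 - votes.mean())
    purified_label = int(votes.mean() > 0.5)   # majority vote
    is_adversarial = agreement < 0.9           # weak consensus flag
    return purified_label, is_adversarial

benign = np.array([1.5, 0.0])       # far from the boundary
suspect = np.array([-0.05, 0.0])    # nudged just across the boundary

b_label, b_flag = purify_detect(benign)
a_label, a_flag = purify_detect(suspect)
print(b_label, b_flag, a_label, a_flag)
```

The point of the toy is the coupling the abstract describes: the same ensemble of randomized views feeds both the purified output (the vote) and the detector (the vote's dispersion), so neither mechanism costs an extra forward pass.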
Pub Date : 2026-02-02DOI: 10.1016/j.neucom.2026.132920
Ziyang Yu, Xiaodong Gu
The task of change captioning focuses on generating detailed descriptions of fine-grained differences between a pair of similar images. Unlike single-image captioning, this task demands that the model not only thoroughly analyze the visual content but also accurately identify the regions where changes occur within the image pair. A significant challenge in this process is detecting changes amidst noise and viewpoint variations. To tackle this challenge, we propose a Dual-Token Contrastive Change Localizer, which decouples the changed and unchanged features of the image pair. Specifically, we utilize two distinct tokens to learn common features and difference features, guided by our common constraints and difference constraints, respectively. These tokens are then used to generate representations of the changed and unchanged regions, which are subsequently transformed into descriptive sentences via a transformer decoder. Additionally, we introduce a sigmoid loss to replace the traditional InfoNCE loss, enhancing the alignment between visual and textual features. Extensive experiments demonstrate that our model achieves state-of-the-art performance across various change scenarios.
{"title":"Vision-language alignment with sigmoid loss and dual-token contrastive change localizer for precise change captioning","authors":"Ziyang Yu, Xiaodong Gu","doi":"10.1016/j.neucom.2026.132920","DOIUrl":"10.1016/j.neucom.2026.132920","url":null,"abstract":"<div><div>The task of change captioning focuses on generating detailed descriptions of fine-grained differences between a pair of similar images. Unlike single-image captioning, this task demands that the model not only thoroughly analyze the visual content but also accurately identify the regions where changes occur within the image pair. A significant challenge in this process is detecting changes amidst noise and viewpoint variations. To tackle this challenge, we propose a Dual-Token Contrastive Change Localizer, which decouples the changed and unchanged features of the image pair. Specifically, we utilize two distinct tokens to learn common features and difference features, guided by our common constraints and difference constraints, respectively. These tokens are then used to generate representations of the changed and unchanged regions, which are subsequently transformed into descriptive sentences via a transformer decoder. Additionally, we introduce a sigmoid loss to replace the traditional InfoNCE loss, enhancing the alignment between visual and textual features. 
Extensive experiments demonstrate that our model achieves state-of-the-art performance across various change scenarios.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"674 ","pages":"Article 132920"},"PeriodicalIF":6.5,"publicationDate":"2026-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146192245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
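The InfoNCE-to-sigmoid swap mentioned in the abstract can be made concrete with a numpy sketch. This follows the generic SigLIP-style formulation, not necessarily the paper's exact loss: InfoNCE normalizes each row of the batch similarity matrix with a softmax, while the sigmoid loss treats every (i, j) pair as an independent binary problem (+1 on the diagonal, -1 elsewhere), removing the batch-wide normalization. The feature construction below is synthetic.

```python
import numpy as np

def infonce_loss(sim):
    """Softmax-based InfoNCE: each row's positive is its diagonal
    entry, scored against all entries in the same row."""
    logits = sim - sim.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

def sigmoid_loss(sim, bias=0.0):
    """SigLIP-style pairwise sigmoid loss: independent binary
    classification per pair, label +1 on the diagonal, -1 off it."""
    n = sim.shape[0]
    z = np.where(np.eye(n, dtype=bool), 1.0, -1.0)  # pair labels
    return np.mean(np.log1p(np.exp(-z * (sim + bias))))

# Synthetic roughly-aligned visual/textual features for illustration.
rng = np.random.default_rng(0)
n, d = 4, 8
vis = rng.normal(size=(n, d))
txt = vis + 0.1 * rng.normal(size=(n, d))
sim = vis @ txt.T  # batch similarity matrix
print(f"InfoNCE: {infonce_loss(sim):.3f}  sigmoid: {sigmoid_loss(sim):.3f}")
```

Because the sigmoid form needs no row-wise softmax, each pair's gradient is independent of the rest of the batch, which is the usual motivation for preferring it when batch composition is noisy.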