Neurocomputing: Latest Articles

COMMANDing anomalies: Continual video anomaly detection via dual-memory and temporal mamba modeling
IF 6.5 | Zone 2, Computer Science | Q1 Computer Science, Artificial Intelligence | Pub Date: 2026-02-03 | DOI: 10.1016/j.neucom.2026.132943
Yan Liu, Kaiju Li, Md Sabuj Khan, Jian Lang, Rongpei Hong, Kunpeng Zhang, Fan Zhou
Weakly supervised video anomaly detection (WSVAD) aims to localize frame-level anomalies using only video-level labels, offering scalability for large-scale surveillance systems. However, existing methods often struggle to adapt to previously unseen and continuously evolving anomaly patterns, limiting their practical applicability. This challenge necessitates the development of continual learning (CL) frameworks that support incremental adaptation while preserving previously acquired knowledge. To this end, we propose a novel CL-based framework, dubbed COMMAND, for WSVAD that enables robust and adaptive anomaly detection in dynamic environments. COMMAND incorporates TempMamba, a temporal modeling unit based on Mamba blocks, which effectively captures both short-range and long-range temporal dependencies essential for distinguishing normal and abnormal behavior. In addition, MemDualNet introduces a dual-memory mechanism that retains both short-term variations and long-term contextual information, facilitating more expressive temporal representations. The framework also incorporates Notation++, a continual learning strategy that integrates memory replay with a composite loss function comprising contrastive, focal, and multiple-instance objectives to alleviate catastrophic forgetting. Experimental results on benchmark datasets such as UCF-Crime and ShanghaiTech validate the effectiveness of the proposed approach, demonstrating superior performance in adaptability, generalization, and anomaly localization compared to existing state-of-the-art methods.
Neurocomputing, Volume 674, Article 132943
Citations: 0
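The COMMAND abstract above mentions a composite loss with contrastive, focal, and multiple-instance terms but does not give its form. The following is only a minimal sketch of how such a sum could look, under several assumptions that are not from the paper: top-k mean pooling as the multiple-instance step, binary cross-entropy on the pooled score as the MIL term, the classic pairwise contrastive loss, and equal weights. All function names and default values are hypothetical.

```python
import math

def topk_mil_score(snippet_scores, k=3):
    """Pool per-snippet anomaly scores into one video-level score (mean of the top-k).
    Assumes a non-empty list of scores in [0, 1]."""
    top = sorted(snippet_scores, reverse=True)[:k]
    return sum(top) / len(top)

def bce(p, label, eps=1e-7):
    """Binary cross-entropy for one probability; stands in for the MIL term here."""
    p = min(max(p, eps), 1.0 - eps)
    return -math.log(p) if label == 1 else -math.log(1.0 - p)

def focal(p, label, gamma=2.0, eps=1e-7):
    """Binary focal loss: cross-entropy down-weighted for well-classified examples."""
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if label == 1 else 1.0 - p
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

def contrastive(dist, same_class, margin=1.0):
    """Pairwise contrastive loss on a feature distance (pull same, push different)."""
    return dist ** 2 if same_class else max(0.0, margin - dist) ** 2

def composite_loss(snippet_scores, label, dist, same_class,
                   weights=(1.0, 1.0, 1.0), k=3):
    """Weighted sum of the three terms; the weights are illustrative assumptions."""
    p = topk_mil_score(snippet_scores, k)
    w_mil, w_focal, w_con = weights
    return (w_mil * bce(p, label)
            + w_focal * focal(p, label)
            + w_con * contrastive(dist, same_class))

# One anomalous video (label 1) with four snippet scores and one feature pair.
loss = composite_loss([0.1, 0.9, 0.8, 0.2], label=1, dist=0.5, same_class=True, k=2)
```

In a continual-learning setting, the same loss would presumably be evaluated on a mix of new samples and samples replayed from memory; the replay buffer itself is not sketched here.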
ADML: Asymmetric deformation-guided mutual learning for semi-supervised medical image segmentation
IF 6.5 | Zone 2, Computer Science | Q1 Computer Science, Artificial Intelligence | Pub Date: 2026-02-03 | DOI: 10.1016/j.neucom.2026.132940
Yu Peng, Haoyu Zou, Kehu Yang, Qingqun Kong
Semi-supervised learning (SSL) has achieved remarkable success in medical image segmentation, with co-training paradigms standing out among existing approaches. However, most current methods use homogeneous network architectures, where identical inductive biases can cause confirmation bias, limiting performance. Additionally, these methods treat all pixels equally, failing to fully exploit the hidden information in complex regions. To overcome these issues, we propose an Asymmetric Deformation-guided Mutual Learning (ADML) framework. ADML builds an asymmetric dual-branch system consisting of a standard convolutional network (V-Net) and a deformable convolutional network (VNet-DCN), adding diverse inductive biases to provide heterogeneous supervisory signals. The core of our framework is asymmetric deformation-guided consistency learning (ADCL), which leverages the norm of the DCN offset field to measure local deformation complexity. This allows for the creation of spatial weight maps that adaptively modify pseudo-label weights, helping reduce confirmation bias and improve the reliability of pseudo-labels. Additionally, to enable knowledge transfer between the asymmetric models, we introduce a Cross-model Dynamic Feature Bank that stores high-confidence features and enforces alignment through a maximum mean discrepancy (MMD) loss, achieving deep semantic coherence between the two branches. Extensive experiments on three benchmarks show that ADML surpasses state-of-the-art methods, confirming its effectiveness in lowering annotation needs and enhancing segmentation accuracy.
Neurocomputing, Volume 675, Article 132940
Citations: 0
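The ADML abstract above aligns features of the two branches with a maximum mean discrepancy (MMD) loss. As a rough illustration of what that quantity measures, here is a sketch of the standard biased estimate of squared MMD with a Gaussian kernel. The kernel choice, the bandwidth `sigma`, and the function names are assumptions for illustration; the abstract does not say which kernel or estimator the authors use.

```python
import math

def rbf(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors of equal length."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def mmd2(xs, ys, sigma=1.0):
    """Biased estimate of squared MMD between two sets of feature vectors:
    mean k(x, x') + mean k(y, y') - 2 * mean k(x, y)."""
    def mean_kernel(A, B):
        return sum(rbf(a, b, sigma) for a in A for b in B) / (len(A) * len(B))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)

# Hypothetical high-confidence features from the two branches.
branch_a = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.1]]
branch_b = [[0.1, 0.0], [0.0, 0.2], [0.1, 0.2]]
alignment_loss = mmd2(branch_a, branch_b)
```

The estimate is close to zero when both sets follow the same distribution and grows as they drift apart, which is why minimizing it pulls the two feature banks together.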
TBMCGNet and TSDataset: A twin-branch multi-scale channel-gated network with a new benchmark for tooth segmentation
IF 6.5 | Zone 2, Computer Science | Q1 Computer Science, Artificial Intelligence | Pub Date: 2026-02-02 | DOI: 10.1016/j.neucom.2026.132917
Yugen Yi, Yu Duan, Weixia Xu, Wei Deng, Hong Li, Longjun Huang, Siwei Luo, Yali Peng, Jiangyan Dai
Accurate tooth segmentation from oral panoramic images is crucial for addressing clinical oral health challenges. However, the complexity of oral structures and substantial variations in clinical data present significant difficulties, limiting the ability of existing deep learning models to capture fine-grained structural details. To address these limitations, we propose a Twin-Branch Multi-Scale Channel Gated Network (TBMCGNet), which introduces several targeted improvements over conventional architectures. Specifically, we design a Twin-Branch Complementary (TBC) module that integrates multi-scale convolutional layers with a Transformer structure. This hybrid architecture enables the model to effectively capture global contextual information while preserving the precise localization of dental boundaries in local regions. Furthermore, an adaptive feature fusion strategy is employed to optimally exploit cross-scale feature dependencies. Next, we introduce a Multi-Scale Channel Gated (MSCG) module to aggregate multi-level features from different encoder stages across multiple scales. This multi-scale fusion mechanism not only enhances the model’s capability to accurately delineate tooth boundaries but also effectively reduces semantic discrepancies between encoder stages. Consequently, the model achieves improved discrimination of subtle distinctions between edges and inter-tooth boundaries within complex oral configurations. Finally, we construct a novel oral panoramic image dataset to evaluate the effectiveness of tooth segmentation for both binary and multi-class tasks. Comprehensive experiments on public and proprietary datasets demonstrate that TBMCGNet outperforms current state-of-the-art approaches, achieving superior segmentation accuracy and robustness. The source code and datasets will be publicly available at: https://github.com/t-sukii/TBMCGNet.
Neurocomputing, Volume 674, Article 132917
Citations: 0
Sparseness-optimized feature importance with prior knowledge and reinforcement learning-powered optimization
IF 6.5 | Zone 2, Computer Science | Q1 Computer Science, Artificial Intelligence | Pub Date: 2026-02-02 | DOI: 10.1016/j.neucom.2026.132925
Gonzalo Nápoles, Isel Grau, Yamisleydi Salgueiro
Sparseness Optimized Feature Importance (SOFI) is a post-hoc method that produces explanations using minimal feature sets, reducing the cognitive burden on human experts by highlighting only the most critical factors. In practice, explanations take the form of a ranking of features whose cumulative marginalization leads to rapid degradation in model performance. However, SOFI employs hill climbing for ranking optimization, which increases the risk of convergence to local optima when the number of features grows. In addition, like other mainstream explainers, SOFI lacks a mechanism for exploiting prior knowledge during optimization. In this paper, we propose Sparseness Optimized Feature Importance with Prior Knowledge (SOFI-P), an extension of SOFI that integrates prior knowledge into a reinforcement learning framework to optimize explanation sparsity. In this explainer, the exploration is guided by a probabilistic swapping strategy that maximizes model performance degradation under cumulative feature marginalization. Prior knowledge is incorporated as a learnable parameter vector, initially defined by domain experts and later updated during optimization. In addition, we derive upper bounds on the change in explanation sparsity induced by adjacent and arbitrary swaps in a feature ranking. The proposed theorems provide practical value by establishing concrete limits for expected explanation sparsity post-swapping, thereby characterizing the problem’s search space complexity. Empirical evaluation on 40 structured classification datasets shows that SOFI-P produces more sparse explanations than state-of-the-art explainers. Furthermore, ablation studies confirm the benefits of incorporating prior knowledge to guide reinforcement learning, even when such knowledge is imprecise. Toward the end, a case study on chest X-ray images illustrates the practical applicability of the method.
Neurocomputing, Volume 674, Article 132925
Citations: 0
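The SOFI abstract above describes explanations as feature rankings whose cumulative marginalization degrades the model quickly, optimized in the original SOFI by hill climbing. The sketch below illustrates that baseline idea only, not the reinforcement-learning search of SOFI-P. It rests on assumptions that are not from the paper: marginalization is approximated by replacing a feature with its column mean, sparsity is scored as the mean accuracy along the degradation curve (lower is better), and the search uses adjacent swaps. All names are hypothetical.

```python
def accuracy_with_masked(predict, X, y, masked, baseline):
    """Accuracy after replacing the masked features by their baseline value."""
    correct = 0
    for row, target in zip(X, y):
        row_m = [baseline[j] if j in masked else v for j, v in enumerate(row)]
        correct += int(predict(row_m) == target)
    return correct / len(X)

def degradation_curve(predict, X, y, ranking, baseline):
    """Accuracy after cumulatively masking features in ranking order."""
    masked, curve = set(), []
    for j in ranking:
        masked.add(j)
        curve.append(accuracy_with_masked(predict, X, y, masked, baseline))
    return curve

def curve_score(curve):
    """Mean accuracy along the curve; a lower value means faster degradation."""
    return sum(curve) / len(curve)

def hill_climb(predict, X, y, ranking, baseline):
    """Accept adjacent swaps while they strictly lower the curve score."""
    best = list(ranking)
    best_score = curve_score(degradation_curve(predict, X, y, best, baseline))
    improved = True
    while improved:
        improved = False
        for i in range(len(best) - 1):
            cand = best[:]
            cand[i], cand[i + 1] = cand[i + 1], cand[i]
            score = curve_score(degradation_curve(predict, X, y, cand, baseline))
            if score < best_score:
                best, best_score, improved = cand, score, True
    return best, best_score

# Toy model that only looks at feature 0; features 1 and 2 are noise.
X = [[0, 0.3, 0.7], [1, 0.3, 0.2], [0, 0.9, 0.1], [1, 0.8, 0.6]]
y = [0, 1, 0, 1]
baseline = [sum(col) / len(col) for col in zip(*X)]
predict = lambda row: int(row[0] > 0.5)
ranking, score = hill_climb(predict, X, y, [2, 1, 0], baseline)
```

On this toy problem the search moves the only informative feature to the front of the ranking. With many features, strict-improvement swaps of this kind can stall in local optima, which is the weakness the abstract attributes to hill climbing.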
Quasi-random physics-informed neural networks
IF 6.5 | Zone 2, Computer Science | Q1 Computer Science, Artificial Intelligence | Pub Date: 2026-02-02 | DOI: 10.1016/j.neucom.2026.132913
Tianchi Yu, Ivan Oseledets
Physics-informed neural networks have emerged as a powerful tool for solving partial differential equations (PDEs) by integrating physical constraints into neural network training, but their performance is sensitive to the sampling of points. Motivated by the impressive performance of quasi-Monte Carlo methods in high-dimensional problems, this paper proposes Quasi-Random Physics-Informed Neural Networks (QRPINNs), which draw training points from low-discrepancy sequences instead of sampling them at random from the PDE domain. Theoretically, QRPINNs are shown to exhibit a faster convergence rate than PINNs. Empirically, experiments demonstrate that QRPINNs outperform PINNs and some representative adaptive sampling methods in high-dimensional PDEs. Furthermore, combining QRPINNs with adaptive sampling further enhances both accuracy and efficiency.
Neurocomputing, Volume 674, Article 132913
Citations: 0
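The QRPINN abstract above replaces random collocation points with points from a low-discrepancy sequence. As a small illustration, the sketch below generates a Halton sequence, one well-known low-discrepancy sequence, and rescales it to a rectangular domain. The abstract does not say which sequence the authors use, so Halton is an assumption here, and the domain bounds and function names are hypothetical.

```python
def radical_inverse(i, base):
    """Reflect the base-`base` digits of integer i about the radix point."""
    inv, frac = 0.0, 1.0 / base
    while i > 0:
        inv += (i % base) * frac
        i //= base
        frac /= base
    return inv

def halton(n, bases=(2, 3)):
    """First n Halton points in the unit cube, one coprime base per dimension."""
    return [[radical_inverse(i, b) for b in bases] for i in range(1, n + 1)]

def scale(points, lower, upper):
    """Map unit-cube points to the box [lower, upper] of the PDE domain."""
    return [[lo + u * (hi - lo) for u, lo, hi in zip(p, lower, upper)]
            for p in points]

# Eight collocation points on a hypothetical domain t in [0, 1], x in [-1, 1].
collocation = scale(halton(8), lower=[0.0, -1.0], upper=[1.0, 1.0])
```

Unlike independent uniform draws, successive Halton points fill gaps left by earlier ones, so the domain is covered more evenly for the same number of points. In practice a library routine such as SciPy's `scipy.stats.qmc` module can be used instead of this hand-written version.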
A survey of robotic manipulation: From bottom-up approaches to end-to-end paradigms with LLMs
IF 6.5 | Zone 2, Computer Science | Q1 Computer Science, Artificial Intelligence | Pub Date: 2026-02-02 | DOI: 10.1016/j.neucom.2026.132921
Kai Peng, Qing Li, Zhijian He, Bowen Zhang, Xianghua Fu, Bin Li, Xiaohui Wang, Zhi-Qi Cheng, Yan Yan, Xiaojiang Peng
Robotic manipulation enables robots to interact with and adapt to their environments, making it crucial for real-world intelligent applications. Recent advancements in Large Language Models (LLMs) have positioned them as transformative tools in robotic manipulation. By integrating vision, language, and action, LLM-based frameworks enhance reasoning, planning, and multimodal understanding, equipping robots to handle increasingly complex tasks. The Vision-Language-Action (VLA) paradigm, in particular, unifies perception, cognition, and execution, paving the way for generalized and versatile robotic systems. This paper provides a comprehensive survey of robotic manipulation, covering traditional bottom-up approaches and modern LLM-based methods. We emphasize recent LLM-based modular and end-to-end architectures, analyze benchmarks, datasets, and robotic hardware platforms. Additionally, we explore potential research directions to advance the field further. By synthesizing these developments, we aim to provide researchers and practitioners with a valuable resource to navigate this rapidly evolving domain and unlock the full potential of LLMs in robotic manipulation.
Neurocomputing, Volume 674, Article 132921
Citations: 0
Progressively attentional architecture search
IF 6.5 | Zone 2, Computer Science | Q1 Computer Science, Artificial Intelligence | Pub Date: 2026-02-02 | DOI: 10.1016/j.neucom.2026.132923
Xuanmian Liu, Xianping Qin, Shu Li, Fuchang Zhang, Rachid Hedjam, Guoqiang Zhong
Recently, differentiable architecture search has become one of the hotspots in the field of neural architecture search (NAS). However, this paradigm suffers from a critical inconsistency problem: the architecture optimized in the continuous space often collapses significantly after discretization. This discrepancy not only renders the computational resources spent on searching futile but also leads to derived architectures that fail to generalize to complex tasks, severely limiting the practical deployability of differentiable NAS. To address this problem and alleviate the well-known performance collapse in existing differentiable search approaches, we propose a new attention-guided differentiable NAS method, called progressively attentional architecture search (PAAS). In the implementation of PAAS, simultaneously considering performance, parameter quantity, and operation independence, we design a novel search space to improve the upper limit of the structural performance from the source of the NAS process. Moreover, we propose a new attention-guided architecture search paradigm, embedding attention modules to help distinguish the significant parts of the learned architectures, which effectively mitigates the optimization collapse at a granular level and the uncertainty of the architecture selection process caused by using only architecture parameters. In addition, we propose a progressive discretization strategy to bridge the structural gap between the search and evaluation stages, which mitigates the performance gap between the super-network and discrete architectures. Extensive experiments demonstrate that PAAS achieves a 2.47% error rate on CIFAR-10 with only 0.4 GPU days, outperforming state-of-the-art methods such as DARTS (2.76%) and DrNAS (2.54%) in both accuracy and efficiency. When transferred to ImageNet, it attains a 24.2% top-1 error, surpassing robust baselines such as PC-DARTS (25.1%) and ProxylessNAS (24.9%), thereby validating its strong cross-dataset generalization.
Citations: 0
RES-PDF: A random, ensemble, and simultaneous purification-detection framework for adversarial example mitigation
IF 6.5 Zone 2 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-02-02 DOI: 10.1016/j.neucom.2026.132870
Rui Yang, Qindong Sun, Han Cao, Kai Lin, Chao Shen
Current state-of-the-art post-deployment countermeasures for adversarial example mitigation (known as adversarial purification and detection) exhibit significant limitations, including (1) insufficient generalization across diverse adversarial examples, (2) serious negative effects on benign samples (i.e., decreased accuracy), and (3) high inference-time cost. These limitations considerably hinder their application in safety-critical real-world scenarios. To narrow these gaps, this paper proposes a novel post-deployment countermeasure, the Random, Ensemble, and Simultaneous Purification-Detection Framework (RES-PDF). Specifically, inspired by the adversarial region migration phenomenon observed in adversarial purification, RES-PDF first extends this concept to a continuous adversarial region migration phenomenon and exploits it to establish a novel purification method named Random Ensemble Adversarial Purification (REAP). RES-PDF then introduces a detection feature into REAP to further enhance its purification performance, while simultaneously using a purification feature to further improve its detection performance. In RES-PDF, purification and detection thus complement each other, so that the combination is worth more than the sum of its parts (1+1>2). Extensive experiments across different scenarios demonstrate that RES-PDF surpasses previous countermeasures in several key areas: (1) markedly better generalization across diverse adversarial examples, with an average improvement of >10.0%; (2) minimal negative effects on benign samples, with an accuracy reduction of <1.0%; and (3) significantly lower inference-time cost, reduced to the millisecond level. Overall, RES-PDF provides a novel and efficient post-deployment countermeasure for adversarial example mitigation in safety-critical real-world scenarios.
Citations: 0
Vision-language alignment with sigmoid loss and dual-token contrastive change localizer for precise change captioning
IF 6.5 Zone 2 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-02-02 DOI: 10.1016/j.neucom.2026.132920
Ziyang Yu, Xiaodong Gu
The task of change captioning focuses on generating detailed descriptions of fine-grained differences between a pair of similar images. Unlike single-image captioning, this task demands that the model not only thoroughly analyze the visual content but also accurately identify the regions where changes occur within the image pair. A significant challenge in this process is detecting changes amidst noise and viewpoint variations. To tackle this challenge, we propose a Dual-Token Contrastive Change Localizer, which decouples the changed and unchanged features of the image pair. Specifically, we utilize two distinct tokens to learn common features and difference features, guided by our common constraints and difference constraints, respectively. These tokens are then used to generate representations of the changed and unchanged regions, which are subsequently transformed into descriptive sentences via a transformer decoder. Additionally, we introduce a sigmoid loss to replace the traditional InfoNCE loss, enhancing the alignment between visual and textual features. Extensive experiments demonstrate that our model achieves state-of-the-art performance across various change scenarios.
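The abstract does not give the exact form of its sigmoid loss. As a rough illustration only, the sketch below assumes a pairwise sigmoid formulation, in which every visual-text pair in a batch is scored as an independent binary decision instead of being normalized with a softmax as in InfoNCE. The function name, the `temperature` and `bias` defaults, and the use of NumPy are assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def sigmoid_pairwise_loss(visual, text, temperature=10.0, bias=-10.0):
    """Pairwise sigmoid alignment loss for a batch of matched features.

    Row i of `visual` is assumed to match row i of `text`; every other
    combination is treated as a negative pair. `temperature` and `bias`
    are hypothetical fixed values (they could equally be learned).
    """
    # L2-normalize so the dot product is a cosine similarity.
    visual = visual / np.linalg.norm(visual, axis=1, keepdims=True)
    text = text / np.linalg.norm(text, axis=1, keepdims=True)

    logits = temperature * (visual @ text.T) + bias

    # +1 on the diagonal (matched pairs), -1 elsewhere (mismatched pairs).
    labels = 2.0 * np.eye(len(visual)) - 1.0

    # log(sigmoid(x)) = -log(1 + exp(-x)), computed stably via logaddexp.
    log_sigmoid = -np.logaddexp(0.0, -labels * logits)

    # Sum over all pairs, average over the batch size.
    return -log_sigmoid.sum() / len(visual)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    visual_feats = rng.normal(size=(4, 16))
    text_feats = visual_feats + 0.01 * rng.normal(size=(4, 16))
    print("aligned loss:", sigmoid_pairwise_loss(visual_feats, text_feats))
```

Because each pair contributes its own binary term, the loss for one pair does not depend on which other samples share the batch, which is the main practical difference from a softmax-based contrastive loss.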
Citations: 0
Community detection in the multi-view stochastic block model
IF 6.5 Zone 2 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-02-02 DOI: 10.1016/j.neucom.2026.132922
Yexin Zhang, Zhongtian Ma, Qiaosheng Zhang, Zhen Wang, Xuelong LI
This paper studies community detection in correlated multi-view graphs from an information-theoretic perspective. We consider multi-view graphs observed from D views on a common node set, where edge variables across views may be statistically dependent. To capture inter-graph correlations, we propose a random graph model called the multi-view stochastic block model (MVSBM), which generates D graphs over n nodes partitioned into two equal-sized communities. For each pair of nodes (i,j), the presence or absence of edges across the D graphs depends on whether i and j belong to the same community. Our goal is to exactly recover the hidden communities from the observed graphs. Our contributions are three-fold. First, we establish an information-theoretic achievability result (Theorem 1), showing that exact recovery is possible when the MVSBM parameters exceed a critical threshold. Second, we derive a matching converse (Theorem 2), proving that below this threshold any estimator has an expected number of misclassified nodes greater than one. Together, these results yield a sharp threshold for exact recovery. Third, we develop a computationally efficient spectral clustering algorithm with a local refinement step. Experiments on MVSBM-generated graphs demonstrate a phase transition that closely matches the theoretical threshold and show that the proposed method outperforms several baselines. Overall, our results delineate the fundamental limits of community detection in correlated multi-view graphs.
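The abstract does not specify how edges are correlated across views or how the refinement step works. The toy sketch below is therefore built on stated assumptions: each view copies a shared per-pair draw with probability `rho` (a hypothetical correlation mechanism), the spectral step takes the sign of the eigenvector for the second-largest eigenvalue of the summed adjacency matrix, and refinement is a single majority-vote pass. The names `sample_mvsbm`, `spectral_with_refinement`, `p_in`, `p_out`, and `rho` are illustrative and do not come from the paper.

```python
import numpy as np


def sample_mvsbm(n, num_views, p_in, p_out, rho, rng):
    """Sample `num_views` correlated graphs over two equal-sized communities."""
    labels = rng.permutation(np.repeat([1, -1], n // 2))
    same = labels[:, None] == labels[None, :]
    prob = np.where(same, p_in, p_out)

    iu = np.triu_indices(n, k=1)                # each unordered pair once
    shared = rng.random(len(iu[0])) < prob[iu]  # draw shared by all views

    graphs = np.zeros((num_views, n, n), dtype=int)
    for d in range(num_views):
        own = rng.random(len(iu[0])) < prob[iu]
        copy = rng.random(len(iu[0])) < rho     # assumed correlation rule
        upper = np.zeros((n, n), dtype=int)
        upper[iu] = np.where(copy, shared, own)
        graphs[d] = upper + upper.T             # symmetric, no self-loops
    return labels, graphs


def spectral_with_refinement(graphs):
    """Two-community estimate from the summed adjacency of all views."""
    total = graphs.sum(axis=0).astype(float)

    # eigh returns eigenvalues in ascending order, so column -2 belongs
    # to the second-largest eigenvalue.
    _, eigvecs = np.linalg.eigh(total)
    estimate = np.where(eigvecs[:, -2] >= 0, 1, -1)

    # Majority vote: (edges to community +1) minus (edges to community -1).
    votes = total @ estimate
    return np.where(votes >= 0, 1, -1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels, graphs = sample_mvsbm(200, 3, 0.5, 0.1, 0.3, rng)
    estimate = spectral_with_refinement(graphs)
    # Communities are identifiable only up to a global sign flip.
    accuracy = max(np.mean(estimate == labels), np.mean(estimate == -labels))
    print("fraction of nodes recovered:", accuracy)
```

Lowering the gap between `p_in` and `p_out` in this toy setup is one way to watch recovery degrade, loosely mirroring the phase transition the abstract describes.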
Citations: 0