
Neurocomputing: Latest Publications

DS-MVC: A deep multi-view clustering method based on dynamic confidence fusion and differential guidance
IF 6.5, CAS Tier 2 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-05-01; Epub Date: 2026-02-17; DOI: 10.1016/j.neucom.2026.133079
Guanghong Zhou , Yong Wang , Xingyuan Ji , Chao Kong , Guifu Lu
Multi-view clustering aims to uncover discriminative data structures by integrating complementary information from multiple feature views. However, existing approaches often encounter several limitations: they struggle to handle high-dimensional and heterogeneous features, tend to assign suboptimal weights to different views, suffer from severe feature redundancy, and fail to effectively account for variations in view quality. These issues can substantially degrade clustering performance. To address these challenges, we propose a novel deep multi-view clustering framework, named DS-MVC. The proposed approach incorporates an enhanced feature fusion strategy and a multi-scale view contrastive learning scheme. First, we propose Dynamic View-Confidence Fusion, operating at the feature level. Specifically, we estimate the prediction confidence of each sample in each view and assign adaptive, sample-specific weights accordingly. This mechanism effectively emphasizes high-quality views while suppressing the influence of noisy or low-quality views, thereby enabling more accurate and fine-grained feature integration. Second, we propose Multi-Scale View Contrastive Learning, which leverages inter-view discrepancies to guide representation learning. By constructing hierarchical contrastive objectives based on prediction discrepancies between samples, the model is able to capture underlying structural relationships and contextual dependencies across views, leading to richer and more discriminative representations. Extensive experiments on multiple benchmark datasets demonstrate that DS-MVC achieves superior performance in terms of clustering accuracy and robustness. Furthermore, ablation studies validate the effectiveness of each component and confirm the generalization capability of the proposed framework.
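The abstract does not give the exact form of Dynamic View-Confidence Fusion. As an illustration only, a minimal numpy sketch of sample-specific confidence weighting, assuming confidence is taken as the maximum softmax probability of each view's cluster logits (that choice is an assumption, not from the paper):

```python
import numpy as np

def fuse_views(view_feats, view_logits, temperature=1.0):
    """Fuse per-view features with sample-specific confidence weights.

    view_feats: list of (n_samples, d) arrays, one per view.
    view_logits: list of (n_samples, n_clusters) cluster logits per view.
    Returns an (n_samples, d) fused representation.
    """
    confs = []
    for logits in view_logits:
        # Per-sample confidence: max softmax probability in this view.
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        confs.append(p.max(axis=1))                    # (n,)
    confs = np.stack(confs, axis=1)                    # (n, n_views)
    # Normalise confidences into per-sample view weights.
    w = np.exp(confs / temperature)
    w /= w.sum(axis=1, keepdims=True)
    # Confidence-weighted sum of the view features.
    feats = np.stack(view_feats, axis=1)               # (n, n_views, d)
    return (w[:, :, None] * feats).sum(axis=1)
```

Under this sketch, a noisy view with near-uniform predictions receives a lower weight for that sample, mirroring the "emphasize high-quality views, suppress low-quality views" behavior the abstract describes.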
Neurocomputing, Volume 676, Article 133079.
Citations: 0
Label distribution learning via implicit distribution representation
IF 6.5, CAS Tier 2 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-05-01; Epub Date: 2026-02-11; DOI: 10.1016/j.neucom.2026.133029
Zhuoran Zheng , Han Hu , Xin Su , Chen Lyu
In contrast to multi-label learning, label distribution learning characterizes the polysemy of examples by a label distribution to represent richer semantics. In the learning process of label distribution, the training data is collected mainly through manual annotation or label enhancement algorithms to generate label distribution. Unfortunately, the complexity of the manual annotation task or the inaccuracy of the label enhancement algorithm leads to noise and uncertainty in the label distribution training set. To alleviate this problem, we introduce the implicit distribution in the label distribution learning framework to characterize the uncertainty of each label value. Specifically, we use deep implicit representation learning to construct a label distribution matrix with Gaussian prior constraints, where each row component corresponds to the distribution estimate of each label value, and this row component is constrained by a prior Gaussian distribution to moderate the noise and uncertainty interference in the label distribution dataset. Finally, each row component of the label distribution matrix is transformed into a standard label distribution form by using the self-attention algorithm. We evaluate our model using several representative metrics, such as Chebyshev distance (0.0779 ± 0.0021) and KL divergence (0.0404 ± 0.0020), and demonstrate that our method achieves significant improvements in performance, mitigating noise and enhancing label distribution accuracy. The code is publicly available at: https://github.com/WaterHQH/SNNGCN.
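The two reported metrics compare predicted and ground-truth label distributions. A minimal sketch of how these are commonly computed (generic definitions, not the authors' code):

```python
import numpy as np

def chebyshev(p, q):
    """Chebyshev distance between label distributions: the maximum
    absolute per-label gap, averaged over samples. Lower is better."""
    return np.abs(p - q).max(axis=1).mean()

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) averaged over samples, with clipping for numerical
    stability when a label probability is zero. Lower is better."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return (p * np.log(p / q)).sum(axis=1).mean()
```

Both take `(n_samples, n_labels)` arrays whose rows sum to 1; the reported values (0.0779 and 0.0404) are averages of this form over the test set.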
Neurocomputing, Volume 676, Article 133029.
Citations: 0
Finite-time bipartite output synchronization and H∞ bipartite output synchronization for multi-weighted coupled memristive neural networks
IF 6.5, CAS Tier 2 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-05-01; Epub Date: 2026-02-03; DOI: 10.1016/j.neucom.2026.132721
Hong-An Tang , Zi-Yi Xia , Xiaofang Hu , Shukai Duan , Lidan Wang
This article focuses on the finite-time bipartite output synchronization (FTBOS) and finite-time H∞ bipartite output synchronization (FTHBOS) of multi-weighted coupled memristive neural networks (CMNNs). Firstly, as an extension of the concept of finite-time bipartite synchronization, a new definition of FTBOS for CMNNs is proposed. Compared to bipartite synchronization, FTBOS in CMNNs means that only partial states are required to achieve bipartite synchronization within a finite time interval. Secondly, considering that many researchers have studied the state synchronization of single weighted CMNNs but have not yet discussed the FTBOS of multi-weighted CMNNs, this article uses the proposed definition to solve the FTBOS and adaptive FTBOS problems for multi-weighted CMNNs. Thirdly, an output feedback control scheme is utilized to investigate the FTHBOS of multi-weighted CMNNs, and some different adaptive laws are designed according to different coupling weights to obtain the adaptive FTHBOS criterion of multi-weighted CMNNs. Eventually, the feasibility of the theoretical results is illustrated by two numerical examples.
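For readers unfamiliar with the setting, finite-time results of this kind typically rest on the standard finite-time stability lemma of Bhat and Bernstein; the exact conditions used in the paper may differ:

```latex
% If a Lyapunov function V along the synchronization-error dynamics satisfies
\dot{V}(t) \le -\alpha V(t)^{\eta}, \qquad \alpha > 0,\; 0 < \eta < 1,
% then V (and hence the error) reaches zero within the settling time
T \le \frac{V(0)^{1-\eta}}{\alpha\,(1-\eta)}.
```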
Neurocomputing, Volume 676, Article 132721.
Citations: 0
Wrinkles in time: Multi-scale patching and super-resolution for efficient time series forecasting
IF 6.5, CAS Tier 2 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-05-01; Epub Date: 2026-02-14; DOI: 10.1016/j.neucom.2026.133021
Yuwei Chen, Wenjing Jia, Qiang Wu
Time series forecasting has become an indispensable part of many fields, from weather prediction to equipment failure forecasting, where accuracy and efficiency directly impact the quality and speed of decision-making. Recently, models with self-attention mechanisms, such as Transformers, have gained widespread attention due to their strong performance in natural language processing and computer vision. However, self-attention mechanisms face certain limitations when applied to time series forecasting tasks, such as high computational complexity and inefficiency in handling long sequences. In contrast, Multilayer Perceptrons (MLPs) exhibit superior efficiency compared to self-attention mechanisms, allowing for faster processing times, especially when dealing with long sequences. However, MLPs have limitations in capturing long-range dependencies and complex patterns within time series data, which can hinder their predictive performance. To this end, we propose a hybrid Convolutional Neural Network–Multilayer Perceptron (CNN–MLP) framework designed to retain the efficiency of MLP-based forecasting while addressing its limitations. Our approach begins by decomposing the input series into its seasonal and trend components. We then apply two targeted enhancements: (i) a multi-scale patching scheme that reshapes the seasonal component into 2D patches processed by a 2D CNN to extract intra- and inter-period patterns; and (ii) a diffusion-based super-resolution module that enriches the trend component with fine-scale temporal detail. Collectively, these innovations introduce wrinkles in time, improving feature extraction at both global and local scales and yielding a model that achieves competitive accuracy while maintaining low parameter counts, low inference latency, and stable performance across diverse forecasting scenarios. Our source code is available at https://github.com/sonnybjs/Wrinkles-in-time.
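The core of the patching idea, reshaping a 1D seasonal series into 2D period-aligned patches so a 2D CNN can see intra-period patterns along rows and inter-period patterns along columns, can be sketched for a single scale as follows (an illustration; the paper's actual multi-scale scheme may differ):

```python
import numpy as np

def patch_2d(series, period):
    """Reshape a 1D series into (n_periods, period) patches.

    Each row holds one period of the signal, so convolving over rows
    captures intra-period structure and convolving over columns
    captures inter-period structure. The tail that does not fill a
    whole period is zero-padded.
    """
    n = len(series)
    n_periods = -(-n // period)                 # ceil division
    padded = np.zeros(n_periods * period, dtype=float)
    padded[:n] = series
    return padded.reshape(n_periods, period)
```

A multi-scale variant would apply this for several candidate periods (e.g. daily and weekly) and feed each resulting 2D array to the CNN branch.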
Neurocomputing, Volume 676, Article 133021.
Citations: 0
ReCoD: Enhancing image description for cross-modal understanding via retrieval and comparison feedback mechanism
IF 6.5, CAS Tier 2 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-05-01; Epub Date: 2026-02-11; DOI: 10.1016/j.neucom.2026.133025
Geunyoung Jung , Jun Park , Hankyeol Lee , Kyungwoo Song , Jiyoung Jung
To effectively utilize the large language models (LLMs) in the vision domain, it is essential to establish a strong connection between the visual and textual modalities. While deep embeddings can facilitate this connection, representing images as detailed textual descriptions offers significant advantages in terms of the usability and interpretability inherent in natural language. In this paper, we introduce a method of image description enhancement designed to generate highly detailed descriptions that include discriminative attributes of the given image, without requiring additional training. Our method, ReCoD, consists of two main components: 1) “image retrieval”, which retrieves the image most similar to the descriptions of the target image, and 2) “comparison”, which identifies and describes the differences between the target image and the retrieved image. These two components are complementary and form an iterative feedback mechanism. As this process iterates, the retrieved image becomes visually closer to the target image, and the descriptions become progressively more informative. Extensive experiments demonstrate the effectiveness of bridging the gap between the two modalities and the quality of our enhanced descriptions. The code is available at https://github.com/gyjung975/ReCoD.
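The "image retrieval" step amounts to nearest-neighbour search in a joint text-image embedding space. A minimal cosine-similarity sketch, assuming precomputed CLIP-style embeddings (the abstract does not specify the embedding model):

```python
import numpy as np

def retrieve_most_similar(text_emb, image_embs):
    """Return the index of the gallery image whose embedding has the
    highest cosine similarity to the current description embedding.

    text_emb:   (d,) embedding of the target image's description.
    image_embs: (n_images, d) embeddings of the retrieval gallery.
    """
    t = text_emb / np.linalg.norm(text_emb)
    g = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    return int(np.argmax(g @ t))
```

In the full ReCoD loop described in the abstract, this retrieval would alternate with a comparison step that describes the differences between the target and the retrieved image, enriching the description before the next retrieval round.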
Neurocomputing, Volume 676, Article 133025.
Citations: 0
A context-aware temporal knowledge graph completion method based on logical paths
IF 6.5, CAS Tier 2 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-05-01; Epub Date: 2026-02-13; DOI: 10.1016/j.neucom.2026.133049
Yuchao Zhang , Xiangjie Kong , Kailun Ye , Shangfei Zheng , Qihong Pan , Guojiang Shen , Jianxin Li
Temporal Knowledge Graph Completion (TKGC) plays a critical role in dynamic knowledge expansion by predicting entity relationships at specific time points based on existing facts. This enhances the performance of downstream applications. Existing TKGC methods mainly generate dynamic entity representations by capturing the structural characteristics of facts and the sequential properties of data. Some approaches also attempt to extract temporal logical rules from historical query data. However, these methods do not fully account for the contextual relevance of queries, which limits the accuracy of entity predictions. To address this gap, we propose a Context-Aware TKGC method based on Logical Paths (LPCA). This method integrates embedding representations and logical rules to create an interpretable TKGC completion framework, improving both prediction transparency and accuracy. Our approach mines query-specific logical paths, incorporating a time-aware sampling mechanism that prioritizes temporally recent facts to enhance temporal relevance and adaptability. Additionally, we design a self-attention mechanism to capture enriched contextual features of these temporal logical paths, modeling dependencies among the head entity, tail entity, query relation, and associated path elements to strengthen semantic and structural relevance. Experimental results show that the proposed method improves MRR scores by 3.59%, 0.99%, and 2.67% on the ICEWS14, ICEWS18, and ICEWS05-15 datasets, respectively.
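A time-aware sampling mechanism that prioritizes temporally recent facts can be sketched with exponential decay weighting. This is illustrative only; the decay form and the `decay` parameter are assumptions, not details from the paper:

```python
import numpy as np

def time_aware_sample(timestamps, query_time, k, decay=0.1, seed=0):
    """Sample k historical facts without replacement, weighting facts
    closer to query_time more heavily via exponential decay.

    timestamps: timestamps of candidate historical facts.
    query_time: timestamp of the query being completed.
    Returns the indices of the sampled facts.
    """
    ts = np.asarray(timestamps, dtype=float)
    age = query_time - ts                  # recency of each fact
    w = np.exp(-decay * age)               # recent facts get larger weight
    w /= w.sum()
    rng = np.random.default_rng(seed)
    return rng.choice(len(ts), size=k, replace=False, p=w)
```

Logical paths mined from the facts selected this way would then feed the self-attention module the abstract describes.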
Neurocomputing, Volume 676, Article 133049.
Citations: 0
Explainable AI: Context-aware layer-wise integrated gradients for explaining transformer models 可解释的AI:用于解释变压器模型的上下文感知分层集成梯度
IF 6.5, CAS Tier 2 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2026-05-01; Epub Date: 2026-02-12; DOI: 10.1016/j.neucom.2026.133050
Melkamu Abay Mersha, Jugal Kalita
Transformer models achieve state-of-the-art performance across domains and tasks, yet their deeply layered representations make their predictions difficult to interpret. Existing explainability methods rely on final-layer attributions, capture either local token-level attributions or global attention patterns without unification, and lack context-awareness of inter-token dependencies and structural components. They also fail to capture how relevance evolves across layers and how structural components shape decision-making. To address these limitations, we propose the Context-Aware Layer-wise Integrated Gradients (CA-LIG) Framework, a unified hierarchical attribution framework that computes layer-wise Integrated Gradients within each Transformer block and fuses these token-level attributions with class-specific attention gradients. This integration yields signed, context-sensitive attribution maps that capture supportive and opposing evidence while tracing the hierarchical flow of relevance through the Transformer layers. We evaluate the CA-LIG Framework across diverse tasks, domains, and Transformer model families, including sentiment analysis and long and multi-class document classification with BERT, hate speech detection in a low-resource language setting with XLM-R and AfroLM, and image classification with a Masked Autoencoder Vision Transformer model. Across all tasks and architectures, CA-LIG provides more faithful attributions, shows stronger sensitivity to contextual dependencies, and produces clearer, more semantically coherent visualizations than established explainability methods. These results indicate that CA-LIG provides a more comprehensive, context-aware, and reliable explanation of Transformer decision-making, advancing both the practical interpretability and conceptual understanding of deep neural models.
The implementation code will be made publicly available at https://github.com/melkamumersha/Context-Aware-XAI upon acceptance of the paper.
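CA-LIG builds on layer-wise Integrated Gradients; the underlying IG estimator is standard and can be sketched as a Riemann-sum approximation along the straight-line path from a baseline to the input (a generic sketch, not the authors' implementation):

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=100):
    """Approximate Integrated Gradients via a right Riemann sum:

        IG_i ≈ (x_i - b_i) * (1/m) * sum_{k=1..m} grad_i(b + (k/m)(x - b))

    grad_fn: maps an input point to the gradient of the target output.
    x, baseline: input and reference point, same shape.
    """
    delta = x - baseline
    total = np.zeros_like(x)
    for k in range(1, steps + 1):
        # Gradient at an interpolated point on the baseline-to-input path.
        total += grad_fn(baseline + (k / steps) * delta)
    return delta * total / steps
```

A useful sanity check is the completeness property: the attributions should sum (approximately) to the difference in model output between the input and the baseline. CA-LIG computes such attributions per Transformer block and then fuses them with class-specific attention gradients.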
Explainable AI: Context-aware layer-wise integrated gradients for explaining transformer models. Melkamu Abay Mersha, Jugal Kalita. Neurocomputing, vol. 676, Article 133050. DOI: 10.1016/j.neucom.2026.133050
Citations: 0
Robust learning with time series noisy labels via self-supervised learning and soft labels refurbishment
IF 6.5 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-05-01 Epub Date: 2026-02-16 DOI: 10.1016/j.neucom.2026.133059
Jiarong Liu , Kaixiang Yang , Jian He , Chengrong Yang , Shuang-hua Yang , Yujue Zhou
Label noise significantly degrades the generalization performance of Deep Neural Networks (DNNs). While Learning with Noisy Labels (LNL) is well-established in computer vision, its application to time-series data presents unique challenges: (1) Feature extraction in supervised learning is corrupted by noisy labels, as the critical assumption that clean samples yield lower loss than noisy ones is frequently violated. This leads to learned representations with poor class separability and ambiguous boundaries, which degrades downstream task performance. (2) Conventional soft-labeling methods generate soft-labels from a single training epoch, ignoring historical and cross-model information. This often leads to unstable supervision and inconsistent soft-labels. To address these challenges, this paper proposes the Robust Representation Learning Network (RoRLNet) for noisy time-series classification. RoRLNet employs a two-stage robust learning paradigm that decouples feature extraction from classifier training. In the first stage, it learns noise-robust spatio-temporal representations by integrating MixDecomposition, a data augmentation strategy based on trend-seasonality decomposition, with MSSFE, a multi-scale self-supervised feature extractor. In the second stage, it trains the classifier using EnBootstrap, a soft-label correction module that stabilizes supervision by ensembling predictions from multiple models and historical epochs. Extensive experiments on multiple benchmarks under diverse noise conditions demonstrate that RoRLNet consistently outperforms state-of-the-art methods by 7.76%. The source code is available at: https://github.com/JingGu-hub/RoRLNet.
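The EnBootstrap idea described above — stabilizing supervision by ensembling predictions across models and historical epochs, then blending with the observed hard label — can be sketched as follows. The blending weight `beta` and the 4-D prediction-history layout are assumptions of this sketch, not details from the paper.

```python
import numpy as np

def enbootstrap_soft_labels(pred_history, observed_onehot, beta=0.7):
    """Soft-label refurbishment in the spirit of EnBootstrap (sketch):
    average predictions over an ensemble of models and historical epochs,
    then blend with the (possibly noisy) observed hard label.
    pred_history: shape (n_models, n_epochs, n_samples, n_classes).
    beta: trust placed in the observed label -- an assumed hyperparameter."""
    ensemble_mean = pred_history.mean(axis=(0, 1))      # (n_samples, n_classes)
    soft = beta * observed_onehot + (1.0 - beta) * ensemble_mean
    return soft / soft.sum(axis=1, keepdims=True)       # renormalize to distributions

rng = np.random.default_rng(0)
# 2 models x 3 epochs x 4 samples x 3 classes of softmax outputs
logits = rng.normal(size=(2, 3, 4, 3))
preds = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
labels = np.eye(3)[[0, 1, 2, 0]]                        # observed (noisy) labels

soft = enbootstrap_soft_labels(preds, labels)
assert np.allclose(soft.sum(axis=1), 1.0)               # valid distributions
```

Averaging over both axes damps single-epoch fluctuations, which is what makes the resulting soft-labels more consistent than epoch-local corrections.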
Neurocomputing, vol. 676, Article 133059. DOI: 10.1016/j.neucom.2026.133059
Citations: 0
Reinforcement learning control for once-through boiler-turbine units based on mechanistic directed multi-subgraph integration
IF 6.5 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-05-01 Epub Date: 2026-02-13 DOI: 10.1016/j.neucom.2026.133010
Zilong Liu, Chiqiang Liu, Dazi Li
The once-through boiler-turbine (OTBT) unit, as the core component of large-scale coal-fired power units, exhibits strong interdependencies among its state variables, which pose substantial challenges to conventional control strategies such as PID. Existing reinforcement learning (RL) approaches in complex process control often suffer from insufficient utilization of mechanistic knowledge and the presence of steady-state error. To address these issues, this paper proposes a new RL control framework named multi-subgraph integration reinforcement learning (MSIRL) for the OTBT unit. Based on the mechanistic model, a directed graph is constructed to characterize the dependencies of state variables, through which a graph attention network (GAT) extracts both local and global coupling features that are subsequently integrated into the Actor-Critic framework for end-to-end training. An integral compensation module is integrated at the model output to actively mitigate steady-state error induced by model uncertainties or external disturbances. Under setpoint tracking and disturbance rejection scenarios, the proposed method demonstrates significant advantages in both dynamic response speed and steady-state accuracy compared with conventional PID and standard RL approaches. Finally, ablation experiments further validate the critical roles of mechanistic directed graph embedding and integral compensation in enhancing the control performance of the system.
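The role of the integral compensation module at the controller output can be illustrated on a toy first-order plant with a constant disturbance. Everything below is an assumed sketch: the plant parameters, gains, and the proportional term standing in for the RL actor are illustrative, not the MSIRL design.

```python
def simulate(kp=2.0, ki=0.5, a=0.9, b=0.1, d=0.05, r=1.0, steps=400):
    """Toy first-order plant x' = a*x + b*u + d with constant disturbance d.
    The 'policy' is a proportional term standing in for the RL actor, plus
    an integral compensation term added at the controller output."""
    x, integ = 0.0, 0.0
    for _ in range(steps):
        e = r - x
        integ += e                      # discrete error integral
        u = kp * e + ki * integ         # actor output + integral compensation
        x = a * x + b * u + d
    return x

x_final = simulate()
assert abs(1.0 - x_final) < 1e-3        # steady-state error driven to zero

x_no_integral = simulate(ki=0.0)
assert abs(1.0 - x_no_integral) > 1e-2  # proportional-only leaves an offset
```

Without the integral term, the constant disturbance `d` forces a nonzero steady-state offset; the accumulated-error term keeps adjusting the output until the error vanishes, which is exactly the behavior the abstract attributes to the compensation module.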
Neurocomputing, vol. 676, Article 133010. DOI: 10.1016/j.neucom.2026.133010
Citations: 0
3MU-Net: A multi-layer, multi-view and multi-modal segmentation model for PET/CT images of lung tumors
IF 6.5 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-05-01 Epub Date: 2026-02-11 DOI: 10.1016/j.neucom.2026.133027
Yuxia Niu , Tao Zhou , Yang Liu , Qitao Liu
Multimodal medical images provide reliable and accurate information on lesion categories, which is essential for diagnosing lung cancer. In medical images, adequate extraction of cross-modal lesion features and contextual semantic information about lesions is a key issue. This paper proposes 3MU-Net, a Multi-layer, Multi-view and Multi-modal lung tumor segmentation model. The main contributions of 3MU-Net are as follows. First, CMformer uses parallel CNN and Transformer branches to learn cross-modal medical image features, building cross-modal pixel dependencies between global and local features. Second, a Cross-view Context-aware Processor is designed, comprising four Deep-Shallow Feature Enhancement Modules and a Cross-view Attention Module. The Deep-Shallow Feature Enhancement Module uses a bidirectional learning method to guide information interaction between shallow-level and deep-level features, reducing the semantic information gap between adjacent layers; it also aggregates coarse-grained semantic features and fine-grained detail features. The Cross-view Attention Module obtains multi-scale lesion information through multi-view feature extraction, extending the diversity of spatial features in the lesion area. Finally, the effectiveness of 3MU-Net is validated on a clinical multimodal lung medical image dataset. The results for mIoU, Dice, VOE, RVD, Acc, and Recall are 90.97%, 95.23%, 93.87%, 94.13%, 97.97%, and 94.45%, respectively, which is of great significance for lung tumor segmentation. Code is available at: https://github.com/xiaoniu1030/3MU-Net.
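The cross-view attention step — tokens from one view attending over tokens from another to gather multi-scale lesion context — can be sketched with plain scaled dot-product attention. This is an illustrative skeleton only: the view names, feature shapes, and the omission of learned projections are assumptions of the sketch, not the 3MU-Net architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_view_attention(query_view, context_view):
    """Tokens of one view attend over tokens of another view
    (scaled dot-product attention; learned projections omitted)."""
    d = query_view.shape[-1]
    scores = query_view @ context_view.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)       # each query row sums to 1
    return weights @ context_view

rng = np.random.default_rng(1)
view_a = rng.normal(size=(16, 32))   # hypothetical features from one image view
view_b = rng.normal(size=(24, 32))   # hypothetical features from a second view

fused = cross_view_attention(view_a, view_b)
assert fused.shape == (16, 32)       # one context vector per query token
```

Each output row is a convex combination of the second view's features, which is how attention lets one view borrow spatial detail that is only visible in the other.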
Neurocomputing, vol. 676, Article 133027. DOI: 10.1016/j.neucom.2026.133027
Citations: 0