
Latest Publications in Neurocomputing

Rehearsal-free continual few-shot relation extraction via contrastive weighted prompts
IF 5.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-02-25 · DOI: 10.1016/j.neucom.2025.129741
Fengqin Yang, Mengen Ren, Delu Kong, Shuhua Liu, Zhiguo Fu
The primary challenge in continual few-shot relation extraction is mitigating catastrophic forgetting. Prevailing strategies involve saving a set of samples in memory and replaying them; however, these methods raise privacy and data-security concerns. To address this, we propose a novel rehearsal-free approach called Contrastive Weighted Prompt (CWP). This approach categorizes learnable prompts into task-generic and task-specific prompts. Task-generic prompts are shared across all tasks and are injected into the higher layers of the BERT encoder to capture general task knowledge. Task-specific prompts are generated by weighting all the prompts in a task-specific prompt pool based on their relevance to individual samples; these prompts are injected into the lower layers of BERT to extract task-specific knowledge. Task-generic prompts retain knowledge from prior tasks, while task-specific prompts reduce mutual interference among tasks and improve the relevance between prompts and individual samples. To further enhance the discriminability of the prompt embeddings for samples belonging to different relations, we introduce a relation-aware contrastive learning strategy. Experimental results on two standard datasets indicate that the proposed method outperforms baseline methods and demonstrates superiority in mitigating catastrophic forgetting.
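To make the weighting step concrete, here is a minimal sketch of how a task-specific prompt could be assembled as a relevance-weighted combination over a prompt pool. The dimensions, key-matching scheme, and cosine-similarity scoring are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the authors' code): forming a task-specific
# prompt as a relevance-weighted sum over a learnable prompt pool.
import torch
import torch.nn.functional as F

pool_size, prompt_len, dim = 10, 8, 768            # assumed dimensions
keys = torch.randn(pool_size, dim)                 # one learnable key per pooled prompt
prompts = torch.randn(pool_size, prompt_len, dim)  # the prompt pool itself

def weighted_prompt(sample_emb: torch.Tensor) -> torch.Tensor:
    """sample_emb: (dim,) embedding of one input sample."""
    # Relevance of the sample to every pooled prompt, via cosine similarity.
    weights = F.softmax(F.cosine_similarity(keys, sample_emb.unsqueeze(0)), dim=0)
    # Weighted combination over the pool -> (prompt_len, dim) prompt to inject.
    return torch.einsum("p,pld->ld", weights, prompts)

prompt = weighted_prompt(torch.randn(768))
print(prompt.shape)  # torch.Size([8, 768])
```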
{"title":"Rehearsal-free continual few-shot relation extraction via contrastive weighted prompts","authors":"Fengqin Yang,&nbsp;Mengen Ren,&nbsp;Delu Kong,&nbsp;Shuhua Liu,&nbsp;Zhiguo Fu","doi":"10.1016/j.neucom.2025.129741","DOIUrl":"10.1016/j.neucom.2025.129741","url":null,"abstract":"<div><div>The primary challenge in continual few-shot relation extraction is mitigating catastrophic forgetting. Prevailing strategies involve saving a set of samples in memory and replaying them. However, these methods pose privacy and data security concerns. To address this, we propose a novel rehearsal-free approach called Contrastive Weighted Prompt (CWP). This approach categorizes learnable prompts into task-generic and task-specific prompts. Task-generic prompts are shared across all tasks and are injected into the higher layers of the BERT encoder to capture general task knowledge. Task-specific prompts are generated by weighting all the prompts in a task-specific prompt pool based on their relevance to individual samples. These task-specific prompts are injected into the lower layers of BERT to extract task-specific knowledge. Task-generic prompts retain knowledge from prior tasks, while task-specific prompts reduce mutual interference among tasks and improve the relevance between prompts and individual samples. To further enhance the discriminability of the prompt embeddings for samples belonging to different relations, we introduced a relation-aware contrastive learning strategy. Experimental results on two standard datasets indicate that the proposed method outperforms baseline methods and demonstrates superiority in mitigating catastrophic forgetting.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"633 ","pages":"Article 129741"},"PeriodicalIF":5.5,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143520417","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
SHoTGCN: Spatial high-order temporal GCN for skeleton-based action recognition
IF 5.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-02-25 · DOI: 10.1016/j.neucom.2025.129697
Qiyu Liu, Ying Wu, Bicheng Li, Yuxin Ma, Hanling Li, Yong Yu
Action recognition algorithms that leverage human skeleton motion data are highly attractive due to their robustness and high information density. Currently, the majority of algorithms in this domain employ graph convolutional networks (GCNs). However, these algorithms often neglect the extraction of high-order features. To address this limitation, we propose a novel approach called the Spatial High-Order Temporal Graph Convolution Network (SHoTGCN), designed to evaluate the impact of high-order features on human action recognition. Our method begins by deriving high-order features from human skeleton time series through temporal interactions; utilizing these high-order features significantly improves the algorithm's ability to recognize human actions. Moreover, we found that the traditional feature extraction method, which employs Depthwise Convolution (DWConv) with a single 2D convolution, is suboptimal compared to a multibranch structure. To address this, we introduce a structure re-parameterization technique for DWConv, termed Rep-tDWConv, to enhance feature extraction. By integrating the Exponential Moving Average (EMA) model during the model fusion process, our proposed model achieves state-of-the-art (SOTA) performance, with accuracies of 90.4% and 92.0% on the XSub and XSet splits of the NTU RGB+D 120 dataset, respectively.
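As an illustration of deriving higher-order features from skeleton time series through temporal interactions, the following sketch stacks first-order frame differences and multiplicative frame interactions onto the raw coordinates. The tensor layout and the specific interactions are assumptions, not SHoTGCN's exact construction.

```python
# Hedged sketch of "high-order features via temporal interactions" on a
# skeleton tensor; the interaction actually used by SHoTGCN may differ.
import torch

x = torch.randn(4, 3, 64, 25)  # (batch, channels, frames, joints), assumed shape

def temporal_interactions(x: torch.Tensor) -> torch.Tensor:
    x_prev = torch.roll(x, shifts=1, dims=2)   # frame t-1 aligned with frame t
    first_order = x - x_prev                   # velocity-like difference
    second_order = x * x_prev                  # multiplicative interaction
    first_order[:, :, 0] = 0                   # no predecessor for the first frame
    second_order[:, :, 0] = 0
    return torch.cat([x, first_order, second_order], dim=1)  # stack along channels

print(temporal_interactions(x).shape)  # torch.Size([4, 9, 64, 25])
```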
{"title":"SHoTGCN: Spatial high-order temporal GCN for skeleton-based action recognition","authors":"Qiyu Liu ,&nbsp;Ying Wu ,&nbsp;Bicheng Li ,&nbsp;Yuxin Ma ,&nbsp;Hanling Li ,&nbsp;Yong Yu","doi":"10.1016/j.neucom.2025.129697","DOIUrl":"10.1016/j.neucom.2025.129697","url":null,"abstract":"<div><div>Action recognition algorithms that leverage human skeleton motion data are highly attractive due to their robustness and high information density. Currently, the majority of algorithms in this domain employ graph convolutional neural networks (GCNs). However, these algorithms often neglect the extraction of high-order features. To address this limitation, we propose a novel approach called the Spatial High-Order Temporal Graph Convolution Network (SHoTGCN), designed to evaluate the impact of high-order features on human action recognition. Our method begins by deriving high-order features from human skeleton time series data through temporal interactions. Utilizing these high-order features significantly improves the algorithm’s ability to recognize human actions. Moreover, we found that the traditional feature extraction method, which employs Depthwise Convolution (DWConv) with a single 2D convolution, is suboptimal compared to a multibranch structure for feature extraction. To address this, we introduce a structure re-parameterization technique with DWConv, termed Rep-tDWConv, to enhance feature extraction. By integrating the Exponential Moving Average (EMA) model during the model fusion process, our proposed model achieves state-of-the-art (SOTA) performance, with accuracies of 90.4% and 92.0% on the XSub and XSet splits of the NTU RGB+D 120 dataset, respectively.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"632 ","pages":"Article 129697"},"PeriodicalIF":5.5,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143508801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
An O(1/k) algorithm for multi-agent optimization with inequality constraints
IF 5.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-02-25 · DOI: 10.1016/j.neucom.2025.129770
Peng Li, Yiyi Zhao, Jiangping Hu, Jiangtao Ji
This paper presents a discrete-time solution algorithm for a constrained multi-agent optimization problem with inequality constraints. The aim is to find a solution that minimizes the sum of all agents' objective functions while satisfying each agent's local set constraint and nonlinear inequality constraints. The agents' local constraints are assumed to be heterogeneous, and all objective functions are assumed to be convex and continuous, though not necessarily differentiable. Like the distributed alternating direction method of multipliers (ADMM) algorithm, the designed algorithm solves the multi-agent optimization problem in a distributed manner and has a fast O(1/k) convergence rate. Moreover, it can handle nonlinear constraints, which the distributed ADMM algorithm cannot. Finally, the proposed algorithm is applied to a robust linear regression problem, a lasso problem, and a decentralized joint flow and power control problem with inequality constraints, verifying its effectiveness.
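For intuition about this problem class, the sketch below runs a generic distributed primal-dual scheme on a toy one-dimensional instance: consensus averaging, a subgradient step, projection onto a local set, and dual ascent on the inequality constraints. It is not the paper's O(1/k) algorithm; the mixing matrix, step sizes, and toy objectives are all assumed for illustration.

```python
# Generic distributed primal-dual sketch for min sum_i f_i(x) s.t. g_i(x) <= 0
# over a consensus network -- illustrative only, not the paper's method.
import numpy as np

a = np.array([1.0, 3.0, 5.0])        # f_i(x) = (x - a_i)^2
b = np.array([2.5, 4.0, 6.0])        # g_i(x) = x - b_i  (i.e., x <= b_i)
W = np.array([[0.5, 0.25, 0.25],     # doubly stochastic mixing matrix
              [0.25, 0.5, 0.25],
              [0.25, 0.25, 0.5]])
x = np.zeros(3)                       # each agent's local estimate
lam = np.zeros(3)                     # local multipliers for g_i

for k in range(1, 5001):
    step = 1.0 / np.sqrt(k)                      # diminishing step size
    grad = 2 * (x - a) + lam                     # d/dx [f_i(x) + lam_i * g_i(x)]
    x = np.clip(W @ x - step * grad, -10, 10)    # consensus + descent + projection
    lam = np.maximum(0.0, lam + step * (x - b))  # dual ascent on g_i(x)

print(x)  # agents agree near the constrained optimum x* = 2.5
```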
{"title":"An O(1/k) algorithm for multi-agent optimization with inequality constraints","authors":"Peng Li ,&nbsp;Yiyi Zhao ,&nbsp;Jiangping Hu ,&nbsp;Jiangtao Ji","doi":"10.1016/j.neucom.2025.129770","DOIUrl":"10.1016/j.neucom.2025.129770","url":null,"abstract":"<div><div>This paper presents a discrete-time solution algorithm for a constrained multi-agent optimization problem with inequality constraints. Its aim is to seek a solution to minimize the sum of all the agents’ objective functions while satisfy each agent’s local set constraint and nonlinear inequality constraints. Assume that agents’ local constraints are heterogeneous and all the objective functions are convex and continuous, but they may not be differentiable. Similar to the distributed alternating direction method of multipliers (ADMM) algorithm, the designed algorithm can solve the multi-agent optimization problem in a distributed manner and has a fast <span><math><mrow><mi>O</mi><mrow><mo>(</mo><mn>1</mn><mo>/</mo><mi>k</mi><mo>)</mo></mrow></mrow></math></span> convergence rate. Moreover, it can deal with the nonlinear constraints, which cannot be handled by distributed ADMM algorithm. Finally, the proposed algorithm is applied to solve a robust linear regression problem, a lasso problem and a decentralized joint flow and power control problem with inequality constraints, respectively and thus the effectiveness of the proposed algorithm is verified.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"632 ","pages":"Article 129770"},"PeriodicalIF":5.5,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143509211","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
AA-mDLAM: An accelerated ADMM-based framework for training deep neural networks
IF 5.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-02-25 · DOI: 10.1016/j.neucom.2025.129744
Zeinab Ebrahimi, Gustavo Batista, Mohammad Deghat
Stochastic gradient descent (SGD) and its many variants are the most widely used optimization algorithms for training deep neural networks. However, SGD suffers from inevitable drawbacks, including vanishing gradients, a lack of theoretical guarantees, and substantial sensitivity to input. The Alternating Direction Method of Multipliers (ADMM) has been proposed as an effective alternative to gradient-based methods to address these shortcomings, and it has been successfully employed for training deep neural networks. However, ADMM-based optimizers have a slow convergence rate. This paper proposes an accelerated framework for training deep neural networks, termed AA-mDLAM, which integrates Anderson acceleration into an ADMM-inspired alternating minimization approach to tackle this drawback. The key idea of AA-mDLAM is to treat alternating minimization as a fixed-point iteration and apply Anderson acceleration to it, attaining a nearly quadratic convergence rate. We verify the effectiveness and efficiency of the proposed AA-mDLAM algorithm through extensive experiments on seven benchmark datasets against other state-of-the-art optimizers.
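Anderson acceleration itself is standard; the sketch below applies a windowed Type-II variant to a generic contraction mapping to show the fixed-point view the abstract describes. The demo map and parameters are assumptions, not AA-mDLAM's training setup.

```python
# Minimal Anderson acceleration (Type II, window m) on a fixed-point map g.
import numpy as np

def anderson(g, x0, m=5, iters=50):
    x = x0.copy()
    X, G = [], []                              # histories of iterates and g-values
    for _ in range(iters):
        gx = g(x)
        X.append(x); G.append(gx)
        if len(X) > m + 1:
            X.pop(0); G.pop(0)
        F = [gk - xk for gk, xk in zip(G, X)]  # residuals f_k = g(x_k) - x_k
        if len(F) == 1:
            x = gx                             # plain fixed-point step
        else:
            dF = np.stack([F[i + 1] - F[i] for i in range(len(F) - 1)], axis=1)
            dG = np.stack([G[i + 1] - G[i] for i in range(len(G) - 1)], axis=1)
            gamma, *_ = np.linalg.lstsq(dF, F[-1], rcond=None)
            x = gx - dG @ gamma                # extrapolated update
    return x

# Demo on a contraction g(x) = Ax + b with fixed point (I - A)^{-1} b.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 20))
A *= 0.9 / np.linalg.norm(A, 2)                # spectral norm 0.9 -> contraction
b = rng.standard_normal(20)
x_star = np.linalg.solve(np.eye(20) - A, b)
x_aa = anderson(lambda x: A @ x + b, np.zeros(20))
print(np.linalg.norm(x_aa - x_star))           # small residual after 50 iterations
```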
{"title":"AA-mDLAM: An accelerated ADMM-based framework for training deep neural networks","authors":"Zeinab Ebrahimi ,&nbsp;Gustavo Batista ,&nbsp;Mohammad Deghat","doi":"10.1016/j.neucom.2025.129744","DOIUrl":"10.1016/j.neucom.2025.129744","url":null,"abstract":"<div><div>Stochastic gradient descent (SGD) and its many variants are the widespread optimization algorithms for training deep neural networks. However, SGD suffers from inevitable drawbacks, including vanishing gradients, lack of theoretical guarantees, and substantial sensitivity to input. The Alternating Direction Method of Multipliers (ADMM) has been proposed to address these shortcomings as an effective alternative to the gradient-based methods. It has been successfully employed for training deep neural networks. However, ADMM-based optimizers have a slow convergence rate. This paper proposes an accelerated framework for training deep neural networks, termed AA-mDLAM, which integrates Anderson acceleration within an Alternating Minimization approach inspired by ADMM to tackle this drawback. The main intention of the AA-mDLAM algorithm is to employ Anderson acceleration to alternating minimization by considering it as a fixed-point iteration and attaining a nearly quadratic convergence rate. We verify the effectiveness and efficiency of the proposed AA-mDLAM algorithm by conducting extensive experiments on seven benchmark datasets contrary to other state-of-the-art optimizers.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"633 ","pages":"Article 129744"},"PeriodicalIF":5.5,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143529542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A variable-gain fixed-time convergent neurodynamic network for time-variant quadratic programming under unknown noises
IF 5.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-02-25 · DOI: 10.1016/j.neucom.2025.129778
Biao Song, Tinghe Hong, Weibing Li, Gang Chen, Yongping Pan, Kai Huang
This article proposes a variable-gain, fixed-time convergent, and noise-tolerant error-dynamics-based neurodynamic network (VGFxTNT-EDNN) for solving time-varying quadratic programming problems while remaining robust to unknown noise. Unlike existing finite-time convergent EDNNs, the newly designed VGFxTNT-EDNN guarantees fixed-time convergence by dynamically adjusting its variable parameters. Moreover, the VGFxTNT-EDNN effectively handles unknown noise, addressing a limitation of existing fixed-time or predefined-time convergent models, which typically assume that the noise is known. Theoretical analysis based on Lyapunov theory proves that the VGFxTNT-EDNN possesses fixed-time convergence and robustness properties. Numerical validations demonstrate the superior noise tolerance and fixed-time convergence of the VGFxTNT-EDNN compared with existing models. Finally, a path-tracking experiment on a Franka Emika Panda robot verifies the practicality of the VGFxTNT-EDNN.
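For orientation, a canonical fixed-time error dynamic from the neurodynamics literature takes the form below; the paper's variable-gain law may differ, so this is only a sketch of the general idea.

```latex
% Canonical (Polyakov-style) fixed-time error dynamics; illustrative only,
% not necessarily the VGFxTNT-EDNN design.
\dot{e}(t) = -\gamma(t)\,\bigl(\alpha\,|e(t)|^{p}\operatorname{sgn}(e(t))
           + \beta\,|e(t)|^{q}\operatorname{sgn}(e(t))\bigr),
\qquad \alpha,\beta > 0,\quad 0 < p < 1 < q .
```

For a gain satisfying \(\gamma(t)\ge 1\), the settling time of this scalar dynamic is bounded by \(T \le \tfrac{1}{\alpha(1-p)} + \tfrac{1}{\beta(q-1)}\) regardless of the initial error, which is what "fixed-time" refers to.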
{"title":"A variable-gain fixed-time convergent neurodynamic network for time-variant quadratic programming under unknown noises","authors":"Biao Song ,&nbsp;Tinghe Hong ,&nbsp;Weibing Li ,&nbsp;Gang Chen ,&nbsp;Yongping Pan ,&nbsp;Kai Huang","doi":"10.1016/j.neucom.2025.129778","DOIUrl":"10.1016/j.neucom.2025.129778","url":null,"abstract":"<div><div>This article proposes a variable-gain fixed-time convergent and noise-tolerant error-dynamics based neurodynamic network (VGFxTNT-EDNN) to solve time-varying quadratic programming problems, while being robust to unknown noises. Unlike existing finite-time convergent EDNNs, the newly designed VGFxTNT-EDNN guarantees fixed-time convergence by dynamically adjusting its variable parameters. Moreover, the VGFxTNT-EDNN effectively handles unknown noise, addressing a limitation of existing fixed-time or predefined-time convergent models, which typically assume that the noise is known. Theoretical analysis utilizing Lyapunov theory proves that the VGFxTNT-EDNN possesses fixed-time convergence and robustness properties. Numerical validations demonstrate superior noise tolerance and fixed-time convergence of the VGFxTNT-EDNN, as compared with the existing models. Finally, a path-tracking experiment is conducted by utilizing a Franka Emika Panda robot to verify the practicality of the VGFxTNT-EDNN.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"633 ","pages":"Article 129778"},"PeriodicalIF":5.5,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143529540","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Towards multi-fusion graph neural network for single-cell RNA sequence clustering
IF 5.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-02-24 · DOI: 10.1016/j.neucom.2025.129764
Chen-Min Yang, Dong Huang, Yuan-Kun Xu, Xiuting He, Guang-Yu Zhang, Chang-Dong Wang
Clustering analysis plays a crucial role in single-cell RNA sequencing (scRNA-seq) data analysis, where graph neural network (GNN)-based clustering methods have rapidly emerged as a promising technique. Despite considerable progress, previous scRNA-seq clustering methods still suffer from two critical limitations. First, they mostly treat node attributes and cell–cell topological information equally, neglecting their potentially different reliability. Second, they usually consider only the learned representation of the last layer, lacking the ability to fuse the multi-scale discriminative information embedded in different layers. In view of this, this paper presents a new single-cell multi-fusion graph neural network (scMFGNN) for scRNA-seq clustering. In particular, we utilize a multi-fusion graph neural network (MFGNN) to learn discriminative representations while preserving the structural information latent in multi-scale network layers. To cope with the high dispersion, high heterogeneity, and high dimensionality of scRNA-seq data, a zero-inflated negative binomial (ZINB) module is incorporated into the network structure. Furthermore, the consistency between node representations and graph topological information is constrained to guide the joint learning process. Notably, scMFGNN can dynamically fuse multi-scale representations from multiple layers while adaptively combining node representations and topological structural information from the same layer for representation learning and clustering. Experiments on multiple scRNA-seq datasets demonstrate the superiority of scMFGNN over the state-of-the-art. Code available: https://github.com/youngcmm/scMFGNN.
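Since the ZINB module is central here, the following is a hedged sketch of the zero-inflated negative binomial negative log-likelihood as commonly parameterized for scRNA-seq counts (mean, dispersion, dropout probability, as in DCA/scVI); the paper's exact parameterization may differ.

```python
# ZINB negative log-likelihood: P(0) = pi + (1-pi)*NB(0), P(x>0) = (1-pi)*NB(x).
import torch

def zinb_nll(x, mu, theta, pi, eps=1e-8):
    """x: counts; mu: NB mean; theta: NB dispersion; pi: dropout probability."""
    log_theta_mu = torch.log(theta + mu + eps)
    # log NB(x; mu, theta)
    log_nb = (theta * (torch.log(theta + eps) - log_theta_mu)
              + x * (torch.log(mu + eps) - log_theta_mu)
              + torch.lgamma(x + theta) - torch.lgamma(theta) - torch.lgamma(x + 1))
    # NB(0) = (theta / (theta + mu))^theta, entering the zero-inflated mixture.
    log_nb_zero = theta * (torch.log(theta + eps) - log_theta_mu)
    case_zero = torch.log(pi + (1 - pi) * torch.exp(log_nb_zero) + eps)
    case_nonzero = torch.log(1 - pi + eps) + log_nb
    return -torch.where(x < eps, case_zero, case_nonzero).mean()

x = torch.tensor([0., 0., 3., 7.])
print(zinb_nll(x, mu=torch.full_like(x, 2.0),
               theta=torch.full_like(x, 1.5),
               pi=torch.full_like(x, 0.3)))
```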
{"title":"Towards multi-fusion graph neural network for single-cell RNA sequence clustering","authors":"Chen-Min Yang ,&nbsp;Dong Huang ,&nbsp;Yuan-Kun Xu ,&nbsp;Xiuting He ,&nbsp;Guang-Yu Zhang ,&nbsp;Chang-Dong Wang","doi":"10.1016/j.neucom.2025.129764","DOIUrl":"10.1016/j.neucom.2025.129764","url":null,"abstract":"<div><div>Clustering analysis plays a crucial role in single-cell RNA sequencing (scRNA-seq) data analysis, in which the graph neural network (GNN)-based clustering methods have rapidly emerged as a promising technique. Despite considerable progress, the previous scRNA-seq clustering methods still suffer from two critical limitations. First, they mostly treat the node attributes and cell–cell topological information equally, neglecting their (probably) different reliability. Second, they usually only consider the learned representation of the last layer, lacking the ability to fuse multi-scale discriminative information embedded in different layers. In view of this, this paper presents a new single-cell multi-fusion graph neural network (scMFGNN) for scRNA-seq clustering. Particularly, we utilize a multi-fusion graph neural network (MFGNN) for learning discriminative representations while preserving the structural information latent in multi-scale network layers. To cope with the high-dispersion, high-heterogeneity, and high-dimensionality of scRNA-seq data, a zero-inflated negative binomial (ZINB) module is incorporated into the network structure. Furthermore, the consistency between node representations and graph topological information is constrained to guide the joint learning process. It is noteworthy that scMFGNN can dynamically fuse multi-scale representations from multiple layers and meanwhile adaptively combine node representations and topological structural information from the same layer for representation learning and clustering. Experiments on multiple scRNA-seq datasets demonstrate the superiority of scMFGNN over the state-of-the-art. Code available: <span><span>https://github.com/youngcmm/scMFGNN</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"631 ","pages":"Article 129764"},"PeriodicalIF":5.5,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143487992","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Zero-shot detection of LLM-generated text via text reorder
IF 5.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-02-24 · DOI: 10.1016/j.neucom.2025.129829
Jingtao Sun, Zhanglong Lv
With the rapid advancement of large language model (LLM) technology, particularly the emergence of advanced models like ChatGPT, distinguishing between LLM-generated and human-written texts has become increasingly challenging. This phenomenon presents unprecedented challenges to academic integrity and authenticity, making the detection of LLM-generated content a pressing concern in scientific research. To detect texts generated by LLMs effectively and accurately, this study constructs a comprehensive dataset of medical paper introductions encompassing both human-written and LLM-generated content. Based on this dataset, a simple and efficient black-box, zero-shot detection method is proposed. The method builds upon the hypothesis that fundamental differences exist in the linguistic logical ordering of human-written and LLM-generated texts. Specifically, the method reorders the original text using dependency parse trees, calculates a similarity score (Rscore) between the reordered text and the original, and integrates log-likelihood features as auxiliary metrics. It then synthesizes the reordered-similarity and log-likelihood scores into a composite metric, establishing an effective classification threshold for discriminating between human-written and LLM-generated texts. The experimental results show that our approach not only effectively detects LLM-generated texts but also identifies LLM-polished abstracts, outperforming current state-of-the-art (SOTA) zero-shot detection methods.
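The reorder-and-compare step can be sketched as follows. The breadth-first dependency linearization, the spaCy pipeline (which assumes en_core_web_sm is installed), and the string-level similarity are stand-ins chosen for illustration, not the authors' exact reordering rule or Rscore definition.

```python
# Hypothetical sketch: a similarity score between a text and its
# dependency-parse-based reordering.
import difflib
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes this model is installed

def reorder_sentence(sent):
    # Breadth-first traversal from the root: one of many possible
    # dependency-based linearizations (an assumption, not the paper's rule).
    queue, out = [sent.root], []
    while queue:
        tok = queue.pop(0)
        out.append(tok.text)
        queue.extend(sorted(tok.children, key=lambda t: t.i))
    return " ".join(out)

def rscore(text: str) -> float:
    doc = nlp(text)
    reordered = " ".join(reorder_sentence(s) for s in doc.sents)
    return difflib.SequenceMatcher(None, text, reordered).ratio()

print(rscore("The detector reorders each sentence and compares it to the original."))
```

Per the abstract, such a score would then be combined with log-likelihood features from a language model and thresholded to classify a text.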
{"title":"Zero-shot detection of LLM-generated text via text reorder","authors":"Jingtao Sun ,&nbsp;Zhanglong Lv","doi":"10.1016/j.neucom.2025.129829","DOIUrl":"10.1016/j.neucom.2025.129829","url":null,"abstract":"<div><div>With the rapid advancement of large language model (LLM) technology, particularly with the emergence of advanced models like ChatGPT, distinguishing between LLM-generated and human-written texts has become increasingly challenging. This phenomenon presents unprecedented challenges to academic integrity and authenticity, making the detection of LLM-generated content a pressing concern in scientific research. To effectively and accurately detect texts generated by LLMs, this study constructs a comprehensive dataset of medical paper introductions, encompassing both human-written and LLM-generated content. Based on this dataset, a simple and efficient black-box, zero-shot detection method is proposed. The method builds upon the hypothesis that fundamental differences exist in the linguistic logical ordering between human-written and LLM-generated texts. Specifically, this method reorders the original text using dependency parse trees, calculates the similarity score (Rscore) between the reordered text and the original, and integrates log-likelihood features as auxiliary metrics. The approach synthesizes the reordered similarity and log-likelihood scores to derive a composite metric, establishing an effective classification threshold for discriminating between human-written and LLM-generated texts. The experimental results show that our approach not only effectively detects LLM-generated texts but also identifies LLM-polished abstracts, outperforming current state-of-the-art zero-shot detection methods (SOTA).</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"631 ","pages":"Article 129829"},"PeriodicalIF":5.5,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143508091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Bilateral-Aware and Multi-Scale Region Guided U-Net for precise breast lesion segmentation in ultrasound images
IF 5.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-02-24 · DOI: 10.1016/j.neucom.2025.129775
Yangyang Li, Xintong Hou, Xuanting Hao, Ronghua Shang, Licheng Jiao
Breast cancer ranks among the leading health risks for women globally, making early diagnosis and intervention paramount. Computer-based segmentation of breast lesions from ultrasound images serves as a crucial auxiliary tool for diagnosing and studying this disease. However, the effectiveness of breast tumor segmentation is deeply affected by the severe artifacts, low contrast, and diverse tumor shapes found in breast ultrasound images. The focus of this study is to devise an effective strategy that further alleviates these issues and achieves more precise lesion segmentation. Specifically, we designed a Bilateral-Aware and Multi-Scale Region Guided U-Net (BA-MRGU-Net) with a bilateral perception strategy to segment breast tumors. First, we devised a Foreground and Background Aware Module (FBAM), primarily composed of an Adaptive Spatial Selection Unit (ASSU) and a Background Suppression Unit (BSU). The ASSU helps the network capture spatial context information that is more relevant to lesions, while the BSU suppresses the feature responses of ultrasound artifacts and other tissues in the background. Through these two independent branches, the FBAM can effectively distinguish the foreground from the background. Subsequently, we developed a Multi-Scale Region Guided Module (MRGM) that utilizes feature maps across various scales to boost the network's perception of the lesion region. We conducted a range of experiments on two widely used datasets against several state-of-the-art algorithms. The results reveal that our approach achieves improvements of varying degrees on multiple evaluation metrics and delivers superior segmentation performance.
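As a loose illustration of suppressing background feature responses with a learned spatial gate (the spirit of a background-suppression unit, not the paper's actual BSU architecture), consider:

```python
# Illustrative gate: a per-pixel foreground mask attenuates background responses.
import torch
import torch.nn as nn

class SuppressionGate(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.mask = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=1),  # per-pixel foreground logit
            nn.Sigmoid(),                           # mask in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        m = self.mask(x)    # (N, 1, H, W) foreground probability
        return x * m        # background responses are down-weighted

feat = torch.randn(2, 64, 32, 32)
print(SuppressionGate(64)(feat).shape)  # torch.Size([2, 64, 32, 32])
```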
{"title":"Bilateral-Aware and Multi-Scale Region Guided U-Net for precise breast lesion segmentation in ultrasound images","authors":"Yangyang Li,&nbsp;Xintong Hou,&nbsp;Xuanting Hao,&nbsp;Ronghua Shang,&nbsp;Licheng Jiao","doi":"10.1016/j.neucom.2025.129775","DOIUrl":"10.1016/j.neucom.2025.129775","url":null,"abstract":"<div><div>Breast cancer ranks among the leading health risks for women globally, with the significance of early diagnosis and intervention being paramount. Using computers to segment breast lesions from ultrasound images serves as a crucial auxiliary tool for diagnosing and studying this disease. However, the effectiveness of breast tumor segmentation is deeply impacted by the severe artifacts, the low contrast, and the diverse tumor shapes found in breast ultrasound images. The focus of this study is to devise an effective strategy to further alleviate the aforementioned issues and achieve more precise lesion segmentation. Specifically, we designed a Bilateral-Aware and Multi-Scale Region Guided U-Net (BA-MRGU-Net) with a bilateral perception strategy to segment breast tumors. Initially, we devised a Foreground and Background Aware Module (FBAM), primarily composed of an Adaptive Spatial Selection Unit (ASSU) and a Background Suppression Unit (BSU). The ASSU can help the network capture spatial context information that is more relevant to lesions. Concurrently, the BSU suppresses the feature responses of ultrasound artifacts and other tissues in the background. The FBAM can effectively distinguish between the foreground and background through these two independent branches. Subsequently, we developed a Multi-Scale Region Guided Module (MRGM) to utilize the feature maps across various scales to boost the network’s perception of the lesion region. We executed a range of experiments utilizing two widely used datasets with several state-of-the-art algorithms. The results reveal that our approach achieves improvements of varying degrees on multiple evaluation metrics and has superior segmentation performance.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"632 ","pages":"Article 129775"},"PeriodicalIF":5.5,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143509213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Enhancing generalization in camera trap image recognition: Fine-tuning visual language models
IF 5.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-02-24 · DOI: 10.1016/j.neucom.2025.129826
Zihe Yang, Ye Tian, Lifeng Wang, Junguo Zhang
This study introduces a novel fine-tuning approach for enhancing the generalization capabilities of visual language models in the context of wildlife monitoring, particularly for camera trap image recognition. In this paper, we introduce Ecological Visual Language Models (Eco-VLMs), fine-tuned on an ecological subset of the ImageNet1K dataset (ImageNet1K-E), aimed at reducing the reliance on spurious correlations that degrades the performance of models like CLIP when applied to specialized domains. By employing text augmentation techniques and expanding species names with rich descriptors, Eco-VLM is optimized to extract more distinctive features from images, thereby improving its discriminative capability for wildlife features. Meanwhile, a random contrastive loss is proposed to improve the diversity of the training data and the generalization of Eco-VLMs. The proposed Eco-CLIP and Eco-SigLIP models are rigorously evaluated on various camera trap datasets and demonstrate superior performance, with average F1 scores improved by 4.44% and 3.79% compared to the standard CLIP and SigLIP models, respectively. Intrinsic evaluations further confirm that Eco-VLMs have acquired a broader ecological knowledge base, highlighting their enhanced generalization abilities. This research addresses the limitations of current visual language models in specialized ecological applications and underscores the potential of Eco-VLMs for improving wildlife monitoring efforts.
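The descriptor-expansion idea can be sketched as a prompt ensemble per species. The descriptor lists, prompt template, and the `encode_text` hook below are hypothetical placeholders rather than the paper's resources.

```python
# Sketch of descriptor-expanded class prompts for a CLIP-style model.
import torch

descriptors = {  # invented examples of "rich descriptors" per species
    "red fox": ["a bushy tail", "reddish-orange fur", "pointed ears"],
    "wild boar": ["coarse dark bristles", "a long snout", "a stocky body"],
}

def class_embedding(name, encode_text):
    """Average unit-normalized embeddings of descriptor-augmented prompts."""
    prompts = [f"a camera trap photo of a {name}, which has {d}."
               for d in descriptors[name]]
    embs = torch.stack([encode_text(p) for p in prompts])
    embs = embs / embs.norm(dim=-1, keepdim=True)  # normalize each prompt
    return embs.mean(dim=0)                        # prompt-ensemble centroid

# Stand-in encoder for demonstration; swap in a real CLIP/SigLIP text tower.
def fake_encode(text: str) -> torch.Tensor:
    return torch.randn(512)

print(class_embedding("red fox", fake_encode).shape)  # torch.Size([512])
```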
{"title":"Enhancing generalization in camera trap image recognition: Fine-tuning visual language models","authors":"Zihe Yang ,&nbsp;Ye Tian ,&nbsp;Lifeng Wang ,&nbsp;Junguo Zhang","doi":"10.1016/j.neucom.2025.129826","DOIUrl":"10.1016/j.neucom.2025.129826","url":null,"abstract":"<div><div>This study introduces a novel fine-tuning approach for enhancing the generalization capabilities of visual language models in the context of wildlife monitoring, particularly for camera trap image recognition. In this paper, we introduce Ecological Visual Language Models (Eco-VLMs), a model fine-tuned using an ecological subset of the ImageNet1K dataset (ImageNet1K-E), aimed at reducing the reliance on spurious correlations that affect the performance of models like CLIP when applied to specialized domains. By employing text augmentation techniques and expanding species names with rich descriptors, Eco-VLM is optimized to extract more distinctive features from images, thereby improving its discriminative capabilities for wildlife features. Meanwhile, random contrastive loss is proposed to improve the diversity of training data and the generalization of Eco-VLMs. The proposed Eco-CLIP and Eco-SigLIP model are rigorously evaluated against various camera trap datasets and demonstrates superior performance, with average F1 scores improved by 4.44 % and 3.79 % compared to the standard CLIP and SigLIP model. Intrinsic evaluations further confirm that Eco-VLMs have acquired a broader ecological knowledge base, highlighting its enhanced generalization abilities. This research contributes to the field by addressing the limitations of current visual language models in specialized ecological applications and underscores the potential of Eco-VLMs for improving wildlife monitoring efforts.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"634 ","pages":"Article 129826"},"PeriodicalIF":5.5,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143548129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
High-security image steganography integrating multi-scale feature fusion with residual attention mechanism
IF 5.5 · CAS Tier 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-02-24 · DOI: 10.1016/j.neucom.2025.129838
Jiaqi Liang, Wei Xie, Haotian Wu, Junfeng Zhao, Xianhua Song
Constructing a good cost function is crucial for minimizing embedding distortion in image steganography. Recently, deep learning-based adaptive cost learning in image steganography has achieved significant advancements. In GAN-based image steganography, the generator typically employs an encoder-decoder structure. However, the continual encoding process often results in a loss of detailed information, and even if the image resolution is restored through skip connections, the generator remains limited. To address this issue, this paper proposes a novel GAN structure named UMSA-GAN. First, we design a residual attention mechanism, Res-CBAM, integrated into the generator network, which enables it to focus on high-frequency regions in the cover image. Second, multi-scale feature information is fused using skip connections, which enables the generator to learn more shallow features. Finally, unlike most previous works, which utilized only Xu-Net as the discriminator, dual steganalyzers are introduced as the discriminator to further enhance performance. Extensive comparative experiments demonstrate that UMSA-GAN effectively learns features from cover images and generates better embedding probability maps. Compared to traditional and state-of-the-art GAN-based steganographic methods, UMSA-GAN exhibits superior security performance. In addition, the rationality and superiority of UMSA-GAN are further verified by a large number of ablation studies.
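CBAM itself is a published attention module; below is a compact CBAM-style block with a residual connection as one plausible reading of "Res-CBAM". The residual wiring is an assumption, not the authors' exact design.

```python
# CBAM-style channel + spatial attention with a residual connection.
import torch
import torch.nn as nn

class ResCBAM(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(  # shared MLP for channel attention
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
        )
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        # Channel attention from global avg- and max-pooled descriptors.
        avg = self.mlp(x.mean(dim=(2, 3), keepdim=True))
        mx = self.mlp(x.amax(dim=(2, 3), keepdim=True))
        x_ca = x * torch.sigmoid(avg + mx)
        # Spatial attention from channel-wise avg and max maps.
        s = torch.cat([x_ca.mean(dim=1, keepdim=True),
                       x_ca.amax(dim=1, keepdim=True)], dim=1)
        x_sa = x_ca * torch.sigmoid(self.spatial(s))
        return x + x_sa  # residual connection around the attended features

print(ResCBAM(64)(torch.randn(1, 64, 32, 32)).shape)  # torch.Size([1, 64, 32, 32])
```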
{"title":"High-security image steganography integrating multi-scale feature fusion with residual attention mechanism","authors":"Jiaqi Liang ,&nbsp;Wei Xie ,&nbsp;Haotian Wu ,&nbsp;Junfeng Zhao ,&nbsp;Xianhua Song","doi":"10.1016/j.neucom.2025.129838","DOIUrl":"10.1016/j.neucom.2025.129838","url":null,"abstract":"<div><div>Constructing a good cost function is crucial for minimizing embedding distortion in image steganography. Recently, deep learning-based adaptive cost learning in image steganography has achieved significant advancements. For GAN-based image steganography, an encoder-decoder structure is typically employed by the generator. However, the continual encoding process often results in a lack of detailed information. Even if the image resolution is restored through skip connections, the generator will still be limited. To address the issue, this paper proposes a novel GAN structure named UMSA-GAN. Firstly, we design a residual attention mechanism, Res-CBAM, integrated into the generator network, which enables focusing on high-frequency regions in the cover image. Secondly, multi-scale feature information is also fused using skip connections, which enables the generator to learn more shallow features. Finally, unlike most of the previous works that only utilized Xu-Net as the discriminator, dual steganalyzers are also introduced as the discriminator to further enhance performance. Extensive comparative experiments demonstrate that UMSA-GAN effectively learns features from the cover images and generates better embedding probability maps. Compared to traditional and state-of-the-art GAN-based steganographic methods, UMSA-GAN exhibits superior security performance. In addition, the rationality and superiority of UMSA-GAN are further verified by a large number of ablation studies.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"632 ","pages":"Article 129838"},"PeriodicalIF":5.5,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143511837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0