
Latest Publications in Neurocomputing

Review of neural network model acceleration techniques based on FPGA platforms
IF 5.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-31 | DOI: 10.1016/j.neucom.2024.128511

Neural network models, celebrated for their outstanding scalability and computational capabilities, have demonstrated remarkable performance across various fields such as vision, language, and multimodality. The rapid advancement of neural networks, fueled by the continued development of Internet technology and the increasing demand for intelligent edge devices, introduces new challenges, including large model parameter sizes and increased storage pressure. In this context, Field-Programmable Gate Arrays (FPGAs) have emerged as a preferred platform for accelerating neural network models, thanks to their exceptional performance, energy efficiency, flexibility, and scalability. Building FPGA-based neural network systems necessitates bridging significant differences in objectives, methods, and design spaces between model design and hardware design. This review article adopts a comprehensive analytical framework to thoroughly explore multidimensional implementation strategies, encompassing optimizations at the algorithmic and hardware levels as well as compiler optimization techniques. It focuses on methods for collaborative optimization between algorithms and hardware, identifies challenges in the collaborative design process, and proposes corresponding implementation strategies and key steps. Across these technological dimensions, the article provides in-depth technical analysis and discussion, aiming to offer valuable insights for research on optimizing and accelerating neural network models in edge computing environments.

Citations: 0
Neural architecture search for image super-resolution: A review on the emerging state-of-the-art
IF 5.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-31 | DOI: 10.1016/j.neucom.2024.128481

Nowadays, complex and expensive neural architectures are seen by many as a way to improve the performance of existing models in image recognition, voice recognition, translation, and other tasks. This perspective has driven increased interest in expert architecture engineering within deep learning. Fueled by this interest, neural architecture search originated as a promising way to automate the tedious process of constructing a deep neural network by hand. Over the last five years, we have seen an increasing number of works focusing on the impact of automating deep neural network design. The spotlight has recently turned from automatically discovering classification models to other, more complex tasks. Motivated by the demand for high-resolution images in real-world user-centered and expert computer vision applications, architecture search for super-resolution image restoration centers on approaches capable of automatically finding efficient and well-performing models. Here, we present a survey that, beyond giving an overview of modern approaches to automatic neural network design, focuses on collecting and studying neural architecture search approaches directed at super-resolution image restoration, and on future lines of research within this emerging area of study.

Citations: 0
Dual-loss nonlinear independent component estimation for augmenting explainable vibration samples of rotating machinery faults
IF 5.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-31 | DOI: 10.1016/j.neucom.2024.128508

Obtaining fault samples for diagnosing faults in rotating machinery in engineering applications can be costly. To address this challenge, fault sample augmentation methods are used to train fault diagnosis models. However, existing techniques mainly focus on the loss between augmented and real signals, overlooking the underlying vibration mechanism of rotating machinery faults. Addressing this limitation, a novel approach, called dual-loss nonlinear independent component estimation (DLNICE), is proposed to enhance understanding of fault features in vibration signals. The method integrates augmentation losses in both the time and frequency domains to enrich fault vibration samples. DLNICE effectively utilizes limited fault samples for augmentation by estimating nonlinear independent components, capturing key fault characteristics such as impulsiveness and cyclo-stationarity. The augmented fault samples therefore become more explainable for analyzing rotating machinery faults. Experimental evaluations on bearing and gearbox vibration samples confirm the effectiveness of DLNICE. Utilizing the augmented samples leads to an average accuracy of 86.27% in bearing fault diagnosis and 81.60% in gearbox fault diagnosis. The results demonstrate that DLNICE excels at augmenting high-quality vibration samples of rotating machinery faults.
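The abstract does not spell out the loss formulation, so the following is only a minimal PyTorch sketch of a dual-domain (time plus frequency) augmentation loss of the kind described; the function name, the FFT-magnitude comparison, and the `freq_weight` balancing factor are illustrative assumptions, not the paper's definitions.

```python
import torch
import torch.nn.functional as F

def dual_domain_loss(augmented, real, freq_weight=0.5):
    """Combine a time-domain and a frequency-domain reconstruction loss.

    augmented, real: (batch, signal_length) vibration signals.
    freq_weight is an illustrative balancing factor, not a value from the paper.
    """
    # Time-domain term: how closely the augmented waveform tracks the real one.
    time_loss = F.mse_loss(augmented, real)
    # Frequency-domain term: compare magnitude spectra so the augmented sample
    # keeps the fault-related spectral content (impulses, cyclic components).
    aug_spec = torch.abs(torch.fft.rfft(augmented, dim=-1))
    real_spec = torch.abs(torch.fft.rfft(real, dim=-1))
    freq_loss = F.mse_loss(aug_spec, real_spec)
    return time_loss + freq_weight * freq_loss

# Example with random stand-in signals.
aug, real = torch.randn(8, 2048), torch.randn(8, 2048)
print(dual_domain_loss(aug, real))
```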

Citations: 0
Improving AI-assisted video editing: Optimized footage analysis through multi-task learning
IF 5.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-30 | DOI: 10.1016/j.neucom.2024.128485

In recent years, AI-assisted video editing has shown promising applications. Accurately understanding and analyzing camera language is fundamental in video editing, guiding subsequent editing and production processes. However, many existing methods for camera language analysis overlook computational efficiency and deployment requirements in favor of improving classification accuracy. Consequently, they often fail to meet the demands of scenarios with limited computing power, such as mobile devices. To address this challenge, this paper proposes an efficient multi-task camera language analysis pipeline based on shared representations. This approach employs a multi-task learning architecture with hard parameter sharing, enabling different camera language classification tasks to utilize the same low-level feature extraction network and thereby implicitly learn feature representations of the footage. Subsequently, each classification sub-task independently learns the high-level semantic information corresponding to its camera language type. This method significantly reduces computational complexity and memory usage while facilitating efficient deployment on devices with limited computing power. Furthermore, to enhance performance, we introduce a dynamic task priority strategy and a conditional dataset downsampling strategy. The experimental results demonstrate that the proposed method achieves overall accuracy surpassing all previous methods. Moreover, on the 2-task dataset MovieShots, training time was reduced by 66.33%, inference cost by 59.85%, and memory usage by 31.95%; on the 4-task dataset AVE, training time was reduced by 95.34%, inference cost by 97.23%, and memory usage by 61.21%.
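As a way to picture hard parameter sharing, here is a minimal PyTorch sketch of a shared low-level backbone feeding independent per-task classification heads; the layer sizes and the task names ("scale", "movement") are placeholders, not the paper's configuration.

```python
import torch
import torch.nn as nn

class SharedBackboneMultiTask(nn.Module):
    """Hard parameter sharing: every camera-language sub-task reuses the same
    low-level feature extractor; only the classification heads are task-specific."""

    def __init__(self, num_classes_per_task):
        # num_classes_per_task: mapping of task name -> number of classes.
        super().__init__()
        # Shared low-level feature extraction network (placeholder CNN).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # One independent head per classification sub-task.
        self.heads = nn.ModuleDict({
            task: nn.Linear(64, n_cls) for task, n_cls in num_classes_per_task.items()
        })

    def forward(self, frames):
        features = self.backbone(frames)  # shared representation of the footage
        return {task: head(features) for task, head in self.heads.items()}

# Hypothetical sub-tasks: shot scale (5 classes) and camera movement (4 classes).
model = SharedBackboneMultiTask({"scale": 5, "movement": 4})
logits = model(torch.randn(2, 3, 224, 224))
print({task: out.shape for task, out in logits.items()})
```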

Citations: 0
A review of black-box adversarial attacks on image classification
IF 5.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-30 | DOI: 10.1016/j.neucom.2024.128512

In recent years, deep learning-based image classification models have been extensively studied in academia and widely applied in industry. However, deep learning is inherently vulnerable to adversarial attacks, posing security threats to image classification models in security-sensitive fields such as face recognition, medical image diagnosis, and traffic sign recognition. The security issues facing deep learning are even more serious for black-box adversarial attacks, which can be carried out even without information about the remote model. Despite growing attention to this issue, existing reviews typically analyze black-box adversarial attacks from only one perspective and focus on a single application field. This paper systematically reviews and discusses existing progress, presenting black-box adversarial attacks from multiple perspectives and systematically classifying existing methods. In addition, we categorize the applications of current black-box adversarial attacks and identify several promising directions for future research.
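To illustrate what "carried out even without information about the remote model" means in practice, here is a minimal sketch of one well-known score-based black-box attack style (a SimBA-like random coordinate search); it is an illustrative example rather than any specific method surveyed, and `predict_proba` is a hypothetical query interface returning class probabilities.

```python
import numpy as np

def simba_like_attack(predict_proba, image, true_label,
                      epsilon=0.05, max_queries=2000, rng=None):
    """Score-based black-box attack: perturb one random coordinate at a time by
    +/- epsilon and keep the change whenever the model's confidence in the true
    class drops. Only output probabilities are queried, never gradients or weights.
    """
    rng = rng or np.random.default_rng(0)
    adv = image.astype(np.float64).copy()
    best_p = predict_proba(adv)[true_label]
    for _ in range(max_queries):
        idx = tuple(rng.integers(s) for s in adv.shape)  # random pixel/channel
        for sign in (+1.0, -1.0):
            candidate = adv.copy()
            candidate[idx] = np.clip(candidate[idx] + sign * epsilon, 0.0, 1.0)
            p = predict_proba(candidate)[true_label]
            if p < best_p:               # keep the perturbation only if it helps
                adv, best_p = candidate, p
                break
        if np.argmax(predict_proba(adv)) != true_label:
            break                        # the model is now fooled
    return adv
```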

Citations: 0
Unidirectional and hierarchical on-chip interconnected architecture for large-scale hardware spiking neural networks
IF 5.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-30 | DOI: 10.1016/j.neucom.2024.128480

Spiking Neural Networks (SNNs) exhibit a strong capability to address spatiotemporal dynamic problems. Recent research has explored hardware SNN systems to solve spatiotemporal problems in real time. The Network-on-Chip (NoC) is an effective scheme for building large-scale hardware SNNs. However, in existing NoC-based hardware SNNs, the interconnections consume large area overhead and hardware power because of complex topologies and router structures. Therefore, in this work a novel Unidirectional and Hierarchical on-Chip Interconnected Architecture (UHCIA) is proposed to address this problem. The proposed UHCIA combines a novel hybrid topology of unidirectional multiple loops and rings and uses a deflection router technique. Experimental results show that, compared to other works, the UHCIA achieves ∼23.6X area reduction and ∼6.4X power reduction, with high system throughput and biological real-time computation.
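To make the deflection-routing idea concrete — packets are never buffered at a router, and any packet that cannot leave the ring in a given cycle simply keeps circulating — here is a toy Python model of a single unidirectional ring; the node count, packet count, and single ejection port per node are illustrative assumptions, not the UHCIA configuration.

```python
import random

class Packet:
    def __init__(self, src, dst):
        self.src, self.dst = src, dst
        self.hops = 0

def simulate_ring(num_nodes=8, num_packets=100, eject_ports=1, max_cycles=10_000):
    """Toy cycle model of deflection routing on a unidirectional ring: packets
    always advance one hop per cycle (no buffering); if more packets reach their
    destination than it can eject in a cycle, the extras are deflected and
    travel around the ring again."""
    in_flight = {n: [] for n in range(num_nodes)}
    for _ in range(num_packets):
        src, dst = random.randrange(num_nodes), random.randrange(num_nodes)
        in_flight[src].append(Packet(src, dst))
    delivered = []
    for _ in range(max_cycles):
        if not any(in_flight.values()):
            break
        next_flight = {n: [] for n in range(num_nodes)}
        for node, packets in in_flight.items():
            arrived = [p for p in packets if p.dst == node]
            passing = [p for p in packets if p.dst != node]
            delivered.extend(arrived[:eject_ports])   # ejected from the ring
            deflected = arrived[eject_ports:]         # lost the ejection port
            for p in passing + deflected:
                p.hops += 1
                next_flight[(node + 1) % num_nodes].append(p)
        in_flight = next_flight
    return delivered

if __name__ == "__main__":
    done = simulate_ring()
    print(f"delivered {len(done)} packets, "
          f"mean hops {sum(p.hops for p in done) / len(done):.2f}")
```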

Citations: 0
A comprehensive review of deep neural network interpretation using topological data analysis
IF 5.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-30 | DOI: 10.1016/j.neucom.2024.128513

Deep neural networks have achieved significant success across various fields, but their intrinsic black-box nature hinders further development. To address these interpretability challenges, topological data analysis has emerged as a promising tool for analyzing these complex models. In this work, we present a review of the emerging field of interpreting deep neural networks using topological data analysis. We organize the existing body of work into distinct analytical categories, highlighting interpretations based on the topology of data, network structural characteristics, network functional characteristics, and techniques derived from Mapper. The objective of this paper is to distill the research patterns of this area and point out future research directions.

Citations: 0
Rainforest: A three-stage distribution adaptation framework for unsupervised time series domain adaptation
IF 5.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-30 | DOI: 10.1016/j.neucom.2024.128507

Solving the unsupervised domain adaptation (UDA) task in time series is of great significance for practical applications, such as human activity recognition and machine fault diagnosis. Compared to UDA for computer vision, UDA in time series is more challenging due to the dynamics of time series data and the complex dependencies among different time steps. Existing UDA methods for time series fail to adequately capture the temporal dependencies, limiting their ability to learn domain-invariant temporal patterns. Furthermore, most UDA methods only focus on distribution adaptation on the backbone network without considering how the classifier adapts to the data distribution of the target domain. In this paper, we propose Rainforest, a three-stage UDA framework for time series. We first pre-train the backbone network through a self-supervised method called bidirectional autoregression, so that the model can comprehensively learn the temporal dependencies in time series. Next, we propose a novel meta-learning-based distribution adaptation method to achieve the joint alignment of the global and local distributions while encouraging the model to adaptively reduce the temporal dynamic differences among different domains. Finally, we design a pseudo-label-guided fine-tuning strategy to help the classifier estimate the data distribution of the target domain more accurately. Extensive experiments on four real-world time series datasets show that our Rainforest outperforms state-of-the-art methods, with an average improvement of 2.19% in accuracy and 2.41% in MF1-score.
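As a rough picture of the third stage, here is a minimal PyTorch sketch of pseudo-label-guided fine-tuning on unlabeled target-domain data; the confidence threshold and the filtering rule are assumptions made for illustration, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def pseudo_label_finetune_step(model, optimizer, target_batch, threshold=0.9):
    """One fine-tuning step on an unlabeled target-domain batch.

    target_batch: (batch, channels, length) time-series tensor without labels.
    Predictions above `threshold` confidence are kept as pseudo-labels; the rest
    of the batch is ignored so that noisy labels do not mislead the classifier.
    """
    model.eval()
    with torch.no_grad():
        probs = F.softmax(model(target_batch), dim=1)
        confidence, pseudo_labels = probs.max(dim=1)
        keep = confidence > threshold
    if not keep.any():
        return None  # nothing confident enough in this batch
    model.train()
    logits = model(target_batch[keep])
    loss = F.cross_entropy(logits, pseudo_labels[keep])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```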

Citations: 0
GAP: A group-based automatic pruning algorithm via convolution kernel fusion
IF 5.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-30 | DOI: 10.1016/j.neucom.2024.128488

In recent years, deploying and operating convolution neural networks on edge devices with limited computing capabilities has become increasingly challenging due to large network structures and computational cost. Currently, mainstream structured pruning algorithms mainly compress the network at the filter or layer level. However, these methods introduce too much human intervention at coarse granularities, which may lead to unpredictable performance after compression. In this paper, we propose a group-based automatic pruning algorithm (GAP) via kernel fusion to automatically search for the optimal pruning structure in a more fine-grained manner. Specifically, we first adopt a novel nonlinear dimensionality-reduction clustering algorithm to divide the filters of each convolution layer into groups of equal size. Afterwards, we encode the mutual distribution similarity of the kernels within each group, and the KL divergence is employed as an importance indicator to determine the retained kernel groups through weighted fusion. Subsequently, we introduce an intelligent searching module that automatically explores and optimizes the pruned structure of each layer. Finally, the pruned filters are permuted to form a dense group convolution and fine-tuned. Extensive experiments show that, on two image classification datasets and five advanced CNN models, our GAP algorithm outperforms most extant SOTA schemes, reduces artificial intervention, and enables efficient end-to-end training of compact models.
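The abstract does not specify how the kernel distributions are encoded, so the following is only one plausible PyTorch sketch of a KL-divergence-based importance score within a kernel group; softmax-normalizing each flattened kernel into a distribution is an assumption made here for illustration, not the paper's exact rule.

```python
import torch
import torch.nn.functional as F

def group_kl_importance(kernels):
    """Score each kernel in a group by its average KL divergence to the others.

    kernels: (group_size, k*k) flattened convolution kernels from one group.
    Each kernel is softmax-normalised into a distribution, and a kernel that
    diverges more from its group mates is treated as carrying more distinct
    information. This is an illustrative scoring rule, not the paper's exact one.
    """
    dists = F.softmax(kernels, dim=1)            # (g, d) per-kernel distributions
    log_p = dists.clamp_min(1e-12).log()
    # Pairwise KL(p_i || p_j) for every ordered pair (i, j) in the group.
    kl = (dists.unsqueeze(1) * (log_p.unsqueeze(1) - log_p.unsqueeze(0))).sum(-1)
    return kl.mean(dim=1)                        # average divergence per kernel

# Example: a group of 4 random 3x3 kernels.
group = torch.randn(4, 9)
print(group_kl_importance(group))
```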

Citations: 0
A fast strategy-solving method for adversarial team games utilizing warm starting
IF 5.5 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-30 | DOI: 10.1016/j.neucom.2024.128509

Adversarial team games (ATGs) have garnered significant attention in recent years, leading to the emergence of various solutions such as linear programming algorithms, multi-agent reinforcement learning, and game tree transformation. ATGs involve large-scale game trees, resulting in high computational cost. In this paper, we focus on expediting the solution of the team-maxmin equilibrium with correlation (TMECor), which can be considered the equilibrium that maximizes the team's payoff. To address this, we propose a transformation with seed strategies (TSS). TSS leverages reinforcement learning to calculate player strategies. We initialize the strategies of all players, referred to as seed strategies, and incorporate them into the multi-agent game tree during the transformation process. These seed strategies serve as the starting strategies for counterfactual regret minimization (CFR). CFR initializes the strategies and cumulative regrets of all players based on the seed strategies. By warm-starting the whole process, our method accelerates the solving of TMECor. We conducted nine experiments using Kuhn poker and Leduc Hold'em poker. The results demonstrate that TSS improves the solving speed of TMECor.
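To show what warm-starting CFR from a seed strategy can look like at a single information set, here is a minimal NumPy sketch; the `weight` parameter (how strongly the seed strategy is trusted) and the initialization rule are illustrative assumptions, and how TSS actually derives the seed strategies is not reproduced here.

```python
import numpy as np

def regret_matching(cumulative_regret):
    """Standard CFR regret matching: play actions in proportion to their positive
    cumulative regret, falling back to a uniform strategy when none is positive."""
    positive = np.maximum(cumulative_regret, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    return np.full_like(cumulative_regret, 1.0 / len(cumulative_regret))

def warm_start_infoset(seed_strategy, weight=10.0):
    """Initialise one information set so CFR starts from the seed strategy rather
    than from uniform. `weight` (how strongly the seed is trusted, measured in
    pseudo-iterations) is an illustrative choice, not a value from the paper."""
    cumulative_regret = weight * seed_strategy  # biases regret matching toward the seed
    strategy_sum = weight * seed_strategy       # makes the initial average strategy the seed
    return cumulative_regret, strategy_sum

# Example: a 3-action information set seeded from a previously computed strategy.
seed = np.array([0.6, 0.3, 0.1])
regret, strategy_sum = warm_start_infoset(seed)
print(regret_matching(regret))  # recovers the seed strategy [0.6, 0.3, 0.1]
```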

Citations: 0