
Neurocomputing: latest articles

CMGN: Text GNN and RWKV MLP-mixer combined with cross-feature fusion for fake news detection
IF 5.5 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-03-03 DOI: 10.1016/j.neucom.2025.129811
ShaoDong Cui, Kaibo Duan, Wen Ma, Hiroyuki Shinnou
With the rapid development of social media, the influence and harm of fake news have gradually increased, making accurate detection of fake news particularly important. Current fake news detection methods primarily rely on the main text of the news, neglecting the interrelationships between additional texts. We propose a cross-feature fusion network with additional text graph construction to address this issue and improve fake news detection. Specifically, we utilize a text graph neural network (GNN) to model the graph relationships of additional texts to enhance the model’s perception capabilities. Additionally, we employ the RWKV MLP-mixer to process the news text and design a cross-feature fusion mechanism to achieve mutual fusion of different features, thereby improving fake news detection. Experiments on the LIAR, FA-KES, IFND, and CHEF datasets demonstrate that our proposed model outperforms existing methods in fake news detection.
Neurocomputing, Volume 633, Article 129811
Citations: 0
H-SGANet: Hybrid sparse graph attention network for deformable medical image registration
IF 5.5 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-03-02 DOI: 10.1016/j.neucom.2025.129810
Yufeng Zhou, Wenming Cao
The integration of Convolutional Neural Networks (ConvNets) and Transformers has become a strong candidate for image registration, combining the strengths of both models and utilizing a large parameter space. However, this hybrid model, which treats brain MRI volumes as grid or sequence structures, struggles to accurately represent anatomical connectivity, diverse brain regions, and critical connections within the brain’s architecture. There are also concerns about the computational expense and GPU memory usage of this model. To address these issues, we propose a lightweight hybrid sparse graph attention network (H-SGANet). The network includes Sparse Graph Attention (SGA), a core mechanism based on Vision Graph Neural Networks (ViG) with predefined anatomical connections. The SGA module expands the model’s receptive field and integrates seamlessly into the network. To further enhance the hybrid network, Separable Self-Attention (SSA) is used as an advanced token mixer, combined with depth-wise convolution to form SSAFormer. This strategic integration is designed to more effectively extract long-range dependencies. As a hybrid ConvNet-ViG-Transformer model, H-SGANet offers three key benefits for volumetric medical image registration. It optimizes fixed and moving images simultaneously through a hybrid feature fusion layer and an end-to-end learning framework. Compared to VoxelMorph, a model with a similar parameter count, H-SGANet demonstrates significant performance enhancements of 3.5% and 1.5% in Dice score on the OASIS dataset and LPBA40 dataset, respectively. The code is publicly available at https://github.com/2250432015/H-SGANet/.
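For readers unfamiliar with the metric, the Dice score reported above measures the overlap between two label masks, here the warped moving segmentation and the fixed segmentation. The following is a minimal sketch for binary masks stored as flat sequences; the function name `dice_score` and the empty-mask convention are assumptions for illustration and are not taken from the H-SGANet repository.

```python
from typing import Sequence


def dice_score(mask_a: Sequence[int], mask_b: Sequence[int]) -> float:
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks of equal length."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    # Count voxels that are foreground in both masks.
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    # Sum of the foreground sizes of the two masks.
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total
```

A score of 1.0 means the two masks coincide and 0.0 means they share no foreground voxel, so the reported gains are improvements in anatomical overlap after registration.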
Neurocomputing, Volume 633, Article 129810
Citations: 0
Fairness in constrained spectral clustering
IF 5.5 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-03-01 DOI: 10.1016/j.neucom.2025.129815
Laxita Agrawal, V. Vijaya Saradhi, Teena Sharma
Semi-supervised clustering methods have gained significant attention in both theoretical research and real-world applications, including economics, finance, marketing, and healthcare. Among these methods, constrained spectral clustering enhances clustering quality by incorporating pairwise constraints, namely, must-link and cannot-link constraints, which guide the clustering process by specifying whether certain data points should or should not belong to the same cluster. However, traditional constrained spectral clustering methods may inadvertently propagate biases present in the data or constraints, leading to unequal representation of sensitive groups, such as different genders or racial groups, across clusters. This imbalance raises concerns about fairness, an issue that remains largely unexplored in constrained spectral clustering. To address this gap, this paper proposes a novel method named fair-constrained Spectral Clustering (fair-cSC). The proposed method integrates fairness into the must-link and cannot-link constraints by defining a fair constraint matrix, ensuring that pairwise relationships do not introduce bias against any particular group. Additionally, a balance constraint is incorporated to enforce fairness across input data points, promoting equal representation of sensitive groups within clusters. Comprehensive experiments on six benchmarked datasets, including ablation studies, demonstrate that the proposed fair-cSC method effectively enhances fairness while preserving clustering quality. Furthermore, the ablation study provides insights into the method’s performance under different settings, reinforcing its robustness and applicability in real-world scenarios.
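To make "equal representation of sensitive groups within clusters" concrete, the sketch below computes a balance score that is common in the fair-clustering literature: for each cluster, the ratio of its smallest to its largest sensitive-group count, with the minimum taken over clusters. The abstract does not state which measure fair-cSC uses, so this metric and the name `cluster_balance` are assumptions for illustration only.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def cluster_balance(labels: Sequence[Hashable], groups: Sequence[Hashable]) -> float:
    """Return the worst-case min/max group-count ratio over all clusters.

    1.0 means every cluster holds each sensitive group in equal numbers;
    0.0 means some cluster is missing at least one group entirely.
    """
    if len(labels) != len(groups) or not labels:
        raise ValueError("labels and groups must be non-empty and equally long")
    all_groups = set(groups)
    per_cluster: Dict[Hashable, Counter] = {}
    for cluster, group in zip(labels, groups):
        per_cluster.setdefault(cluster, Counter())[group] += 1
    balance = 1.0
    for counts in per_cluster.values():
        # Groups absent from a cluster contribute a count of zero.
        sizes = [counts.get(g, 0) for g in all_groups]
        balance = min(balance, min(sizes) / max(sizes))
    return balance
```

For example, `cluster_balance([0, 0, 0, 1, 1, 1], ["a", "a", "b", "a", "b", "b"])` gives 0.5, because each cluster holds the two groups in a 2:1 ratio.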
Neurocomputing, Volume 634, Article 129815
Citations: 0
Decoupled contrastive learning for multilingual multimodal medical pre-trained model
IF 5.5 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-03-01 DOI: 10.1016/j.neucom.2025.129809
Qiyuan Li, Chen Qiu, Haijiang Liu, Jinguang Gu, Dan Luo
Multilingual multimodal pre-training aims to facilitate the integration of conceptual representations across diverse languages and modalities within a shared, high-dimensional semantic space. This endeavor in healthcare faces challenges related to language diversity, suboptimal multimodal interactions, and an absence of coherent multilingual multimodal representations. In response to these challenges, we introduce a novel multilingual multimodal medical pre-training model. Initially, we employ a strategic augmentation of the medical corpus by expanding the MIMIC-CXR report dataset to 20 distinct languages using machine translation techniques. Subsequently, we develop a targeted label disambiguation technique to address the labeling noise within decoupled contrastive learning. In particular, it categorizes and refines uncertain phrases within the clinical reports based on disease type, promoting finer-grained semantic similarity and improving inter-modality interactions. Building on these proposals, we present a refined multilingual multimodal medical pre-trained model, significantly enhancing the understanding of medical multimodal data and adapting the model to multilingual medical contexts. Experiments reveal that our model outperforms other baselines in medical image classification and multilingual medical image–text retrieval by up to 13.78% and 12.6%, respectively.
Neurocomputing, Volume 633, Article 129809
Citations: 0
Multi-modal information fusion for multi-task end-to-end behavior prediction in autonomous driving
IF 5.5 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-03-01 DOI: 10.1016/j.neucom.2025.129857
Guo Baicang, Liu Hao, Yang Xiao, Cao Yuan, Jin Lisheng, Wang Yinlin
Behavior prediction in autonomous driving is increasingly achieved through end-to-end frameworks that predict vehicle states from multi-modal information, streamlining decision-making and enhancing robustness in time-varying road conditions. This study proposes a novel multi-modal information fusion-based, multi-task end-to-end model that integrates RGB images, depth maps, and semantic segmentation data, enhancing situational awareness and predictive precision. Utilizing a Vision Transformer (ViT) for comprehensive spatial feature extraction and a Residual-CNN-BiGRU structure for capturing temporal dependencies, the model fuses spatiotemporal features to predict vehicle speed and steering angle with high precision. Through comparative, ablation, and generalization tests on the Udacity and self-collected datasets, the proposed model achieves steering angle prediction errors of MSE 0.012 rad, RMSE 0.109 rad, and MAE 0.074 rad, and speed prediction errors of MSE 0.321 km/h, RMSE 0.567 km/h, and MAE 0.373 km/h, outperforming existing driving behavior prediction models. Key contributions of this study include the development of a channel difference attention mechanism and advanced spatiotemporal feature fusion techniques, which improve predictive accuracy and robustness. These methods effectively balance computational efficiency and predictive performance, contributing to practical advancements in driving behavior prediction.
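The three error measures quoted above are related: RMSE is the square root of MSE, which matches the reported pairs (the square root of 0.012 is about 0.11, and the square root of 0.321 is about 0.567). A minimal sketch of how such figures could be computed from predictions follows; the function name `regression_errors` and the sample values are hypothetical and are not data from the paper.

```python
import math
from typing import Dict, Sequence


def regression_errors(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Compute MSE, RMSE, and MAE between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and equally long")
    residuals = [p - t for t, p in zip(y_true, y_pred)]
    n = len(residuals)
    mse = sum(r * r for r in residuals) / n   # mean squared error
    mae = sum(abs(r) for r in residuals) / n  # mean absolute error
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae}


# Hypothetical steering angles in radians, for illustration only.
errors = regression_errors([0.00, 0.10, -0.05], [0.02, 0.07, -0.05])
```

Because MSE squares the residuals, it penalizes occasional large steering or speed errors more heavily than MAE does, which is one reason listings like this report both.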
Neurocomputing, Volume 634, Article 129857
Citations: 0
Research on self-adaptive grid point cloud down-sampling method based on plane fitting and Mahalanobis distance Gaussian weighting
IF 5.5 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-02-28 DOI: 10.1016/j.neucom.2025.129746
Hongfei Zu, Jing Zhu, Xinfeng Wang, Xiang Zhang, Ning Chen, Gangxiang Guo, Zhangwei Chen
This manuscript proposes a self-adaptive grid point cloud down-sampling method based on plane fitting, which reduces redundant data while better preserving the geometric features of the original model and maintaining high accuracy. The method first constructs initial voxel grids and divides them into high-density and low-density grids according to the point cloud density. For low-density grids, the boundary points are extracted first and the remaining areas are uniformly sampled. For high-density grids, a method based on Mahalanobis distance Gaussian weighting is proposed to estimate the normal vectors of points, and feature points are identified and retained by calculating the information entropy. Three models from the public dataset (the Cat, Bed_0355, and Fandisk models) were used as test subjects to compare the proposed method with two commonly used down-sampling methods: uniform sampling and voxel grid sampling. The results indicate that the new method better retains the geometric features of the original models, especially high-curvature and sharp parts, with smaller errors and fewer holes. Finally, the method was applied to the down-sampling of 3D scanning point clouds of two typical metal machine parts, a threaded joint and a sheet metal part. The measured results demonstrate that the method not only preserves the model features effectively but also maintains the accuracy of key geometric dimensions after down-sampling at a high reduction ratio; for example, the relative errors of thread tooth angles and hole inner diameters are less than 1%.
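As background for the comparison, the sketch below shows plain voxel grid down-sampling, one of the two baselines named in the abstract, in its usual form: points are binned into cubic cells and each occupied cell is replaced by the centroid of its points. It is not the proposed adaptive method, and the function name `voxel_grid_downsample` and the sample points are assumptions for illustration.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def voxel_grid_downsample(points: Sequence[Point], voxel_size: float) -> List[Point]:
    """Replace all points that fall in the same cubic voxel by their centroid."""
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    buckets: Dict[Tuple[int, int, int], List[Point]] = defaultdict(list)
    for p in points:
        # Integer voxel index along each axis; floor also handles negatives.
        key = (
            math.floor(p[0] / voxel_size),
            math.floor(p[1] / voxel_size),
            math.floor(p[2] / voxel_size),
        )
        buckets[key].append(p)
    sampled: List[Point] = []
    for key in sorted(buckets):  # sorted only to make the output order stable
        cell = buckets[key]
        n = len(cell)
        sampled.append((
            sum(p[0] for p in cell) / n,
            sum(p[1] for p in cell) / n,
            sum(p[2] for p in cell) / n,
        ))
    return sampled


# Hypothetical cloud: two points share a voxel, one sits in a neighbour.
cloud = [(0.1, 0.1, 0.1), (0.3, 0.3, 0.3), (1.5, 0.2, 0.2)]
reduced = voxel_grid_downsample(cloud, voxel_size=1.0)
```

Averaging within a cell is what smooths away sharp edges and high-curvature detail, which is the weakness the density-dependent, feature-preserving scheme described above is meant to address.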
Neurocomputing, Volume 634, Article 129746
Citations: 0
Cross-modal feature symbiosis for personalized meta-path generation in heterogeneous networks
IF 5.5 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-02-28 DOI: 10.1016/j.neucom.2025.129780
Xiaotong Wu, Liqing Qiu, Weidong Zhao
In heterogeneous graph neural networks (HGNNs), the capture of intricate relationships among various types of entities is essential to achieve advanced machine learning applications. Heterogeneous Information Networks (HINs), composed of interconnected multi-type nodes and edges, face significant challenges in managing semantic diversity and inherent heterogeneity. Traditional methods, which rely on manually designed meta-paths, struggle to adapt dynamically to personalized needs and often neglect the integration of structural and attribute features. To address these limitations, this paper introduces the Cross-Modal Symbiotic Meta-Path Generator (CSMPG) framework. CSMPG integrates two key modules: a Cross-Modal State Generation Module that encodes node structure and attribute information into task-aware state vectors and a Personalized Meta-Path Generation Module that dynamically generates and refines meta-paths using reinforcement learning. By leveraging downstream task feedback, CSMPG optimizes path selection to maximize performance. The framework effectively balances cross-modal feature integration and semantic diversity, uncovering impactful meta-paths that are often overlooked by traditional approaches. Experimental results demonstrate that CSMPG consistently enhances recommendation quality and significantly outperforms structure-only and predefined-path-based models.
Neurocomputing, Volume 633, Article 129780
Citations: 0
Few-shot medical relation extraction via prompt tuning enhanced pre-trained language model
IF 5.5 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-02-27 DOI: 10.1016/j.neucom.2025.129752
Guoxiu He, Chen Huang
Medical relation extraction is crucial for developing structured information to support intelligent healthcare systems. However, acquiring large volumes of labeled medical data is challenging due to the specialized nature of medical knowledge and privacy constraints. To address this, we propose a prompt-enhanced few-shot relation extraction (FSRE) model that leverages few-shot and prompt learning techniques to improve performance with minimal data. Our approach introduces a hard prompt concatenated to the original input, enabling contextually enriched learning. We calculate prototype representations by averaging the intermediate states of each relation class in the support set, and classify relations by finding the shortest distance between the query instance and class prototypes. We evaluate our model against existing deep learning based FSRE models using three biomedical datasets: the 2010 i2b2/VA challenge dataset, the CHEMPROT corpus, and the BioRED dataset, focusing on few-shot scenarios with limited training data. Our model demonstrates exceptional performance, achieving the highest accuracy across all datasets in most training configurations under a 3-way-5-shot condition and significantly surpassing the current state-of-the-art. Particularly, it achieves improvements ranging from 1.25% to 11.25% on the 2010 i2b2/VA challenge dataset, 3.4% to 20.2% on the CHEMPROT dataset, and 2.73% to 10.98% on the BioRED dataset compared to existing models. These substantial gains highlight the model’s robust generalization ability, enabling it to effectively handle previously unseen relations during testing. The demonstrated effectiveness of this approach underscores its potential for diverse medical applications, particularly in scenarios where acquiring extensive labeled data is challenging.
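The prototype step described above (average the support-set representations of each relation class, then assign a query to the nearest prototype) can be sketched as follows. The abstract does not name the distance function, so Euclidean distance is an assumption here, as are the function names, the two-dimensional vectors, and the relation labels, which stand in for the encoder's intermediate states.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def class_prototypes(support: Dict[str, List[Vector]]) -> Dict[str, List[float]]:
    """Average the support vectors of each relation class into one prototype."""
    prototypes: Dict[str, List[float]] = {}
    for label, vectors in support.items():
        if not vectors:
            raise ValueError(f"class {label!r} has no support examples")
        dim = len(vectors[0])
        prototypes[label] = [
            sum(v[i] for v in vectors) / len(vectors) for i in range(dim)
        ]
    return prototypes


def classify(query: Vector, prototypes: Dict[str, List[float]]) -> str:
    """Return the label whose prototype is closest to the query (Euclidean, assumed)."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# Hypothetical 2-way-2-shot support set with toy 2-D representations.
support_set = {
    "treats": [[1.0, 0.0], [3.0, 0.0]],
    "causes": [[0.0, 4.0], [0.0, 6.0]],
}
protos = class_prototypes(support_set)
predicted = classify([1.5, 0.5], protos)
```

Because classification depends only on distances to class means, new relation types can be handled at test time by supplying a few support examples, without retraining a classification head.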
Neurocomputing, Volume 633, Article 129752
Citations: 0
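The prototype-based classification described in the few-shot relation extraction abstract above (average the support-set representations of each relation class, then assign a query to the nearest prototype) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the 2-dimensional vectors stand in for the language model's intermediate states, the relation labels are hypothetical, Euclidean distance is an assumed choice of metric, and the toy episode is 3-way-2-shot rather than 3-way-5-shot to keep it short.

```python
from math import dist


def class_prototypes(support):
    """Average the support vectors of each class.

    support: dict mapping a label to a list of equal-length vectors.
    Returns a dict mapping each label to its mean vector (the prototype).
    """
    prototypes = {}
    for label, vectors in support.items():
        n = len(vectors)
        prototypes[label] = [sum(column) / n for column in zip(*vectors)]
    return prototypes


def classify(query, prototypes):
    """Return the label whose prototype is closest to the query vector."""
    return min(prototypes, key=lambda label: dist(query, prototypes[label]))


# Toy 3-way-2-shot episode; labels and vectors are made up for illustration.
support_set = {
    "treats": [[1.0, 0.0], [0.8, 0.2]],
    "causes": [[0.0, 1.0], [0.2, 0.8]],
    "no_relation": [[-1.0, -1.0], [-0.8, -1.2]],
}
prototypes = class_prototypes(support_set)
predicted = classify([0.7, 0.3], prototypes)
print(predicted)
```

Because classification only compares a query against class means, a relation type unseen during training can be handled at test time by supplying a handful of support examples for it, with no new classifier weights.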
Inverse reinforcement learning by expert imitation for the stochastic linear–quadratic optimal control problem
IF 5.5 Zone 2 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-02-27 DOI: 10.1016/j.neucom.2025.129758
Zhongshi Sun, Guangyan Jia
This article studies inverse reinforcement learning (IRL) for the linear–quadratic stochastic optimal control problem, where two agents are considered. A learner agent lacks knowledge of the expert agent’s cost function, but it reconstructs an underlying cost function by observing the expert agent’s states and controls, thereby imitating the expert agent’s optimal feedback control. We initially present a model-based IRL method, which consists of a policy correction and a policy update from the policy iteration in reinforcement learning, as well as a cost function weight reconstruction informed by the inverse optimal control. Afterward, under this scheme, we propose a model-free off-policy IRL method, which requires no system identification, only collecting behavior data from the learner agent and expert agent once during the iteration process. Moreover, the proofs of the method’s convergence, stability, and non-unique solutions are given. Finally, a numerical example and an inverse mean–variance portfolio optimization example are provided to validate the effectiveness of the presented method.
Citations: 0
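The interplay of policy iteration and cost-weight reconstruction mentioned in the inverse reinforcement learning abstract above can be illustrated on a much simpler problem. The sketch below is not the authors' method: it assumes a scalar, deterministic, discrete-time system x[t+1] = a*x[t] + b*u[t] with stage cost q*x^2 + r*u^2, whereas the paper treats the stochastic linear-quadratic setting. All function names and numeric values are hypothetical. It runs policy evaluation and policy improvement to obtain an "expert" feedback gain, then recovers a state weight q that makes that gain optimal for a different control weight r.

```python
def evaluate_policy(a, b, q, r, k):
    """Return p such that the policy u = -k*x has cost p * x**2."""
    closed_loop = a - b * k
    if abs(closed_loop) >= 1.0:
        raise ValueError("gain k is not stabilizing")
    return (q + r * k * k) / (1.0 - closed_loop ** 2)


def improve_policy(a, b, r, p):
    """Greedy gain with respect to the value function p * x**2."""
    return a * b * p / (r + b * b * p)


def policy_iteration(a, b, q, r, k0, iterations=20):
    """Alternate policy evaluation and improvement from a stabilizing k0."""
    k = k0
    for _ in range(iterations):
        p = evaluate_policy(a, b, q, r, k)
        k = improve_policy(a, b, r, p)
    return k


def reconstruct_state_weight(a, b, r, k_expert):
    """Inverse step: find q such that k_expert is optimal for (q, r)."""
    # Invert the gain formula k = a*b*p / (r + b*b*p) for p ...
    p = r * k_expert / (b * (a - b * k_expert))
    # ... then read q off the scalar Riccati equation.
    return p - a * a * p + (a * b * p) ** 2 / (r + b * b * p)


a, b = 1.0, 1.0
k_expert = policy_iteration(a, b, q=1.0, r=1.0, k0=0.5)

# The learner fixes a different control weight and still matches the expert.
q_hat = reconstruct_state_weight(a, b, r=2.0, k_expert=k_expert)
k_learner = policy_iteration(a, b, q=q_hat, r=2.0, k0=0.5)
print(k_expert, q_hat, k_learner)
```

In this toy setting the weights (q, r) = (1, 1) and (2, 2) produce the same feedback gain, which gives a small-scale picture of why the reconstructed cost function need not be unique.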
A dynamic-static feature fusion learning network for speech emotion recognition
IF 5.5 Zone 2 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-02-27 DOI: 10.1016/j.neucom.2025.129836
Peiyun Xue, Xiang Gao, Jing Bai, Zhenan Dong, Zhiyu Wang, Jiangshuai Xu
Speech is a paramount mode of human communication, and Speech Emotion Recognition (SER) contributes significantly to the quality and fluency of Human-Computer Interaction (HCI). Feature representation poses a persistent challenge in SER. A single feature can hardly represent speech emotion adequately, while directly concatenating multiple features may overlook their complementary nature and introduce interference from redundant information. To address these difficulties, this paper proposes a Multi-feature Learning network based on Dynamic-Static feature Fusion (ML-DSF) to obtain an effective hybrid feature representation for SER. First, a Time-Frequency domain Self-Calibration Module (TFSC) is proposed to help traditional convolutional neural networks extract static image features from Log-Mel spectrograms. Then, a Lightweight Temporal Convolutional Network (L-TCNet) is used to acquire multi-scale dynamic temporal causal knowledge from the Mel Frequency Cepstrum Coefficients (MFCC). Finally, both groups of extracted features are fed into a connection attention module, optimized by Principal Component Analysis (PCA), which facilitates emotion classification by reducing redundant information and enhancing the complementary information between features. To ensure the independence of feature extraction, this paper adopts a training separation strategy. On two public datasets, the proposed model achieves a Weighted Accuracy (WA) of 93.33% and an Unweighted Accuracy (UA) of 93.12% on the RAVDESS dataset, and 94.95% WA and 94.56% UA on the EmoDB dataset. These results outperform State-Of-The-Art (SOTA) methods. The effectiveness of each module is validated by ablation experiments, and a generalization analysis is carried out on cross-corpus SER tasks.
Citations: 0
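The speech emotion recognition abstract above reports both Weighted Accuracy (WA) and Unweighted Accuracy (UA). The abstract does not define them, so the sketch below assumes the convention common in SER work: WA is the overall fraction of correctly classified utterances, and UA is the mean of the per-class recalls, which gives every emotion class equal weight regardless of how many utterances it has. The labels are toy data.

```python
from collections import Counter


def weighted_accuracy(y_true, y_pred):
    """Overall accuracy: correct predictions divided by all samples."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Imbalanced toy example: three "angry" utterances, one "happy" utterance.
y_true = ["angry", "angry", "angry", "happy"]
y_pred = ["angry", "angry", "angry", "sad"]

wa = weighted_accuracy(y_true, y_pred)    # 3 of 4 correct
ua = unweighted_accuracy(y_true, y_pred)  # recalls: angry 1.0, happy 0.0
print(wa, ua)
```

On imbalanced data the two metrics can diverge, as here (0.75 versus 0.5), which is one reason SER papers tend to report both.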