
Neural Networks: Latest Publications

Supporting vision-language model few-shot inference with confounder-pruned knowledge prompt.
IF 6.0 | Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-18 | DOI: 10.1016/j.neunet.2025.107173
Jiangmeng Li, Wenyi Mo, Fei Song, Chuxiong Sun, Wenwen Qiang, Bing Su, Changwen Zheng

Vision-language models are pre-trained by aligning image-text pairs in a common space to deal with open-set visual concepts. Recent works adopt fixed or learnable prompts, i.e., classification weights synthesized from natural language descriptions of task-relevant categories, to reduce the gap between the pre-training and inference phases. However, which prompts improve inference performance, and how, remains unclear. In this paper, we explicitly clarify the importance of incorporating semantic information into prompts, whereas existing prompting methods generate prompts without sufficiently exploring the semantic information of textual labels. Manually constructing prompts with rich semantics requires domain expertise and is extremely time-consuming. To cope with this issue, we propose a knowledge-aware prompt learning method, namely Confounder-pruned Knowledge Prompt (CPKP), which retrieves an ontology knowledge graph by treating the textual label as a query to extract task-relevant semantic information. CPKP further introduces a double-tier confounder-pruning procedure to refine the derived semantic information. Adhering to the individual causal effect principle, the graph-tier confounders are gradually identified and phased out. The feature-tier confounders are eliminated by following the maximum entropy principle in information theory. Empirically, the evaluations demonstrate the effectiveness of CPKP in few-shot inference, e.g., with only two shots, CPKP outperforms the manual-prompt method by 4.64% and the learnable-prompt method by 1.09% on average.
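As a rough illustration of the knowledge-prompt idea (not the authors' CPKP implementation), the sketch below treats each textual label as a query into a toy, hand-written ontology, appends the retrieved concepts to a prompt template, and scores a mock image feature against the prompt features. The ontology, the hashed stand-in encoder, and the class names are all hypothetical placeholders for a pre-trained vision-language model.

```python
import numpy as np

ontology = {  # toy knowledge graph: label -> related concepts (hypothetical)
    "sparrow": ["bird", "small body", "short beak", "brown feathers"],
    "airliner": ["aircraft", "jet engines", "fixed wings", "fuselage"],
}

def build_prompt(label):
    # Treat the textual label as a query and append the retrieved concepts.
    concepts = ", ".join(ontology.get(label, []))
    return f"a photo of a {label}, which has {concepts}"

def encode_text(text, dim=64):
    # Stand-in text encoder: deterministic hashed bag-of-words, a placeholder
    # for the text tower of a pre-trained vision-language model.
    vec = np.zeros(dim)
    for tok in text.lower().replace(",", " ").split():
        vec[hash(tok) % dim] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-8)

class_names = ["sparrow", "airliner"]
prompt_feats = np.stack([encode_text(build_prompt(c)) for c in class_names])

image_feat = encode_text("small brown feathers short beak")  # mock image feature
scores = prompt_feats @ image_feat            # cosine similarities
print(class_names[int(np.argmax(scores))])    # predicted class
```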

{"title":"Supporting vision-language model few-shot inference with confounder-pruned knowledge prompt.","authors":"Jiangmeng Li, Wenyi Mo, Fei Song, Chuxiong Sun, Wenwen Qiang, Bing Su, Changwen Zheng","doi":"10.1016/j.neunet.2025.107173","DOIUrl":"https://doi.org/10.1016/j.neunet.2025.107173","url":null,"abstract":"<p><p>Vision-language models are pre-trained by aligning image-text pairs in a common space to deal with open-set visual concepts. Recent works adopt fixed or learnable prompts, i.e., classification weights are synthesized from natural language descriptions of task-relevant categories, to reduce the gap between tasks during the pre-training and inference phases. However, how and what prompts can improve inference performance remains unclear. In this paper, we explicitly clarify the importance of incorporating semantic information into prompts, while existing prompting methods generate prompts without sufficiently exploring the semantic information of textual labels. Manually constructing prompts with rich semantics requires domain expertise and is extremely time-consuming. To cope with this issue, we propose a knowledge-aware prompt learning method, namely Confounder-pruned Knowledge Prompt (CPKP), which retrieves an ontology knowledge graph by treating the textual label as a query to extract task-relevant semantic information. CPKP further introduces a double-tier confounder-pruning procedure to refine the derived semantic information. Adhering to the individual causal effect principle, the graph-tier confounders are gradually identified and phased out. The feature-tier confounders are eliminated by following the maximum entropy principle in information theory. Empirically, the evaluations demonstrate the effectiveness of CPKP in few-shot inference, e.g., with only two shots, CPKP outperforms the manual-prompt method by 4.64% and the learnable-prompt method by 1.09% on average.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"185 ","pages":"107173"},"PeriodicalIF":6.0,"publicationDate":"2025-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143043011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
LGS-KT: Integrating logical and grammatical skills for effective programming knowledge tracing.
IF 6.0 | Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-18 | DOI: 10.1016/j.neunet.2025.107164
Xinjie Sun, Qi Liu, Kai Zhang, Shuanghong Shen, Yan Zhuang, Yuxiang Guo

Knowledge tracing (KT) estimates students' mastery of knowledge concepts or skills by analyzing their historical interactions. Although general KT methods have effectively assessed students' knowledge states, specific measurements of students' programming skills remain insufficient. Existing studies mainly rely on exercise outcomes and do not fully utilize behavioral data from the programming process. Therefore, we propose a Logical and Grammar Skills Knowledge Tracing (LGS-KT) model to enhance programming education. This model integrates static analysis and dynamic monitoring (such as CPU and memory consumption) to evaluate code elements, providing a thorough assessment of code quality. By analyzing students' multiple iterations on the same programming problem, we construct a reweighted logical skill evolution graph to assess the development of students' logical skills. Additionally, to enhance the interactions among representations with similar grammatical skills, we develop a grammatical skills interaction graph based on the similarity of knowledge concepts. This approach significantly improves the accuracy of inferring students' programming grammatical skill states. The LGS-KT model demonstrates superior performance in predicting student outcomes. Our research highlights the potential application of a KT model that integrates logical and grammatical skills in programming exercises. To support reproducible research, we have published the data and code at https://github.com/xinjiesun-ustc/LGS-KT, encouraging further innovation in this field.
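The grammatical skills interaction graph described above can be pictured with a small sketch: connect knowledge concepts whose embeddings are similar and row-normalize the resulting adjacency for use in a graph network. This is an assumption-laden illustration (random placeholder embeddings, an arbitrary threshold), not the released LGS-KT code at the link above.

```python
import numpy as np

rng = np.random.default_rng(0)
num_concepts, dim = 6, 16
concept_emb = rng.normal(size=(num_concepts, dim))   # hypothetical concept embeddings

# Cosine similarity between every pair of knowledge concepts.
norm = concept_emb / np.linalg.norm(concept_emb, axis=1, keepdims=True)
sim = norm @ norm.T

threshold = 0.2
adj = (sim > threshold).astype(float)     # edges between similar concepts
np.fill_diagonal(adj, 0.0)

# Row-normalized adjacency, usable by a graph neural network layer.
deg = adj.sum(axis=1, keepdims=True) + 1e-8
adj_norm = adj / deg
print(adj_norm.round(2))
```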

{"title":"LGS-KT: Integrating logical and grammatical skills for effective programming knowledge tracing.","authors":"Xinjie Sun, Qi Liu, Kai Zhang, Shuanghong Shen, Yan Zhuang, Yuxiang Guo","doi":"10.1016/j.neunet.2025.107164","DOIUrl":"https://doi.org/10.1016/j.neunet.2025.107164","url":null,"abstract":"<p><p>Knowledge tracing (KT) estimates students' mastery of knowledge concepts or skills by analyzing their historical interactions. Although general KT methods have effectively assessed students' knowledge states, specific measurements of students' programming skills remain insufficient. Existing studies mainly rely on exercise outcomes and do not fully utilize behavioral data during the programming process. Therefore, we integrate a Logical and Grammar Skills Knowledge Tracing (LGS-KT) model to enhance programming education. This model integrates static analysis and dynamic monitoring (such as CPU and memory consumption) to evaluate code elements, providing a thorough assessment of code quality. By analyzing students' multiple iterations on the same programming problem, we constructed a reweighted logical skill evolution graph to assess the development of students' logical skills. Additionally, to enhance the interactions among representations with similar grammatical skills, we developed a grammatical skills interaction graph based on the similarity of knowledge concepts. This approach significantly improves the accuracy of inferring students' programming grammatical skill states. The LGS-KT model has demonstrated superior performance in predicting student outcomes. Our research highlights the potential application of a KT model that integrates logical and grammatical skills in programming exercises. To support reproducible research, we have published the data and code at https://github.com/xinjiesun-ustc/LGS-KT, encouraging further innovation in this field.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"185 ","pages":"107164"},"PeriodicalIF":6.0,"publicationDate":"2025-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143043042","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
DCTCNet: Sequency discrete cosine transform convolution network for visual recognition.
IF 6.0 | Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-18 | DOI: 10.1016/j.neunet.2025.107143
Jiayong Bao, Jiangshe Zhang, Chunxia Zhang, Lili Bao

The discrete cosine transform (DCT) has been widely used in computer vision tasks due to its high compression ratio and high-quality visual presentation. However, the conventional DCT is affected by the size of the transform region, which results in blocking effects. Eliminating these blocking effects so that the transform can efficiently serve vision tasks is therefore significant and challenging. In this paper, we introduce the All Phase Sequency DCT (APSeDCT) into convolutional networks to extract multi-frequency information from deep features. Because APSeDCT is equivalent to a convolutional operation, we construct a corresponding convolution module, called APSeDCT Convolution (APSeDCTConv), whose transferability is similar to that of vanilla convolution. We then propose an augmented convolutional operator, called MultiConv, built with APSeDCTConv. By replacing the last three bottleneck blocks of ResNet with MultiConv, our approach not only reduces the computational cost and the number of parameters but also performs strongly on classification, object detection, and instance segmentation tasks. Extensive experiments show that APSeDCTConv augmentation leads to consistent performance improvements in image classification on ImageNet across various models and scales, including ResNet, Res2Net, and ResNeXt, and yields 0.5%-1.1% and 0.4%-0.7% AP improvements for object detection and instance segmentation, respectively, on the COCO benchmark compared to the baseline.
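The equivalence between a block DCT and a fixed-kernel strided convolution, which the module above builds on, can be sketched as follows. This is a plain blockwise 2-D DCT written as an explicit projection onto DCT basis kernels, not the paper's APSeDCT or MultiConv implementation.

```python
import numpy as np

def dct_basis(n=4):
    # n x n orthonormal DCT-II basis vectors; kernel (u, v) is their outer product.
    x = np.arange(n)
    basis = np.array([np.cos(np.pi * (2 * x + 1) * u / (2 * n)) for u in range(n)])
    basis[0] *= 1.0 / np.sqrt(2.0)
    return basis * np.sqrt(2.0 / n)

def blockwise_dct(image, n=4):
    # Equivalent to a convolution with n*n fixed DCT kernels and stride n:
    # each n x n block is projected onto every 2-D DCT basis function.
    b = dct_basis(n)
    h, w = image.shape
    out = np.zeros((n * n, h // n, w // n))
    for u in range(n):
        for v in range(n):
            kernel = np.outer(b[u], b[v])
            for i in range(h // n):
                for j in range(w // n):
                    block = image[i * n:(i + 1) * n, j * n:(j + 1) * n]
                    out[u * n + v, i, j] = np.sum(block * kernel)
    return out   # (frequency channels, H/n, W/n)

image = np.random.default_rng(0).random((8, 8))
coeffs = blockwise_dct(image, n=4)
print(coeffs.shape)   # (16, 2, 2)
```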

{"title":"DCTCNet: Sequency discrete cosine transform convolution network for visual recognition.","authors":"Jiayong Bao, Jiangshe Zhang, Chunxia Zhang, Lili Bao","doi":"10.1016/j.neunet.2025.107143","DOIUrl":"https://doi.org/10.1016/j.neunet.2025.107143","url":null,"abstract":"<p><p>The discrete cosine transform (DCT) has been widely used in computer vision tasks due to its ability of high compression ratio and high-quality visual presentation. However, conventional DCT is usually affected by the size of transform region and results in blocking effect. Therefore, eliminating the blocking effects to efficiently serve for vision tasks is significant and challenging. In this paper, we introduce All Phase Sequency DCT (APSeDCT) into convolutional networks to extract multi-frequency information of deep features. Due to the fact that APSeDCT can be equivalent to convolutional operation, we construct corresponding convolution module called APSeDCT Convolution (APSeDCTConv) that has great transferability similar to vanilla convolution. Then we propose an augmented convolutional operator called MultiConv with APSeDCTConv. By replacing the last three bottleneck blocks of ResNet with MultiConv, our approach not only reduces the computational costs and the number of parameters, but also exhibits great performance in classification, object detection and instance segmentation tasks. Extensive experiments show that APSeDCTConv augmentation leads to consistent performance improvements in image classification on ImageNet across various different models and scales, including ResNet, Res2Net and ResNext, and achieving 0.5%-1.1% and 0.4%-0.7% AP performance improvements for object detection and instance segmentation, respectively, on the COCO benchmark compared to the baseline.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"185 ","pages":"107143"},"PeriodicalIF":6.0,"publicationDate":"2025-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143030236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
DGMSCL: A dynamic graph mixed supervised contrastive learning approach for class imbalanced multivariate time series classification.
IF 6.0 | Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-17 | DOI: 10.1016/j.neunet.2025.107131
Lipeng Qian, Qiong Zuo, Dahu Li, Hong Zhu

In the Imbalanced Multivariate Time Series Classification (ImMTSC) task, minority-class instances typically correspond to critical events, such as system faults in power grids or abnormal health occurrences in medical monitoring. Despite being rare and random, these events are highly significant. The dynamic spatial-temporal relationships between minority-class instances and other instances make them more prone to interference from neighboring instances during classification. Increasing the number of minority-class samples during training often results in overfitting to a single pattern of the minority class. Contrastive learning ensures that majority-class instances learn similar features in the representation space. However, it does not effectively aggregate features from neighboring minority-class instances, hindering its ability to properly represent these instances in the ImMTS dataset. Therefore, we propose a dynamic graph-based mixed supervised contrastive learning method (DGMSCL) that effectively fits minority-class features without increasing their number, while also separating them from other instances in the representation space. First, it reconstructs the input sequence into dynamic graphs and employs a hierarchical attention graph neural network (HAGNN) to generate a discriminative embedding representation between instances. Based on this, we introduce a novel mixed contrast loss, which includes weight-augmented inter-graph supervised contrast (WAIGC) and context-based minority class-aware contrast (MCAC). It adjusts the sample weights based on their quantity and intrinsic characteristics, placing greater emphasis on the minority-class loss to produce more effective gradient gains during training. Additionally, it separates minority-class instances from adjacent transitional instances in the representation space, enhancing their representational capacity. Extensive experiments across various scenarios and datasets with differing degrees of imbalance demonstrate that DGMSCL consistently outperforms existing baseline models. Specifically, DGMSCL achieves higher overall classification accuracy, as evidenced by significantly improved average F1-score, G-mean, and kappa coefficient across multiple datasets. Moreover, classification results on real-world power data show that DGMSCL generalizes well to real-world applications.
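As a hedged illustration of the reweighting idea behind WAIGC (not the paper's exact loss), the sketch below computes a supervised contrastive loss in which each anchor is weighted inversely to its class frequency, so minority-class anchors contribute larger gradients.

```python
import numpy as np

def weighted_supcon_loss(z, labels, temperature=0.1):
    # Supervised contrastive loss with per-anchor weights that up-weight
    # minority classes (weights inversely proportional to class frequency).
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = len(labels)
    counts = np.bincount(labels)
    weights = 1.0 / counts[labels]
    weights = weights / weights.sum() * n          # normalize to mean 1

    loss = 0.0
    for i in range(n):
        mask = (labels == labels[i])
        mask[i] = False                            # positives exclude the anchor
        if not mask.any():
            continue
        logits = sim[i] - sim[i].max()             # numerical stability
        exp = np.exp(logits)
        exp[i] = 0.0                               # denominator excludes the anchor
        log_prob = logits - np.log(exp.sum())
        loss += -weights[i] * log_prob[mask].mean()
    return loss / n

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
labels = np.array([0, 0, 0, 0, 0, 0, 1, 1])        # imbalanced: class 1 is the minority
print(weighted_supcon_loss(z, labels))
```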

{"title":"DGMSCL: A dynamic graph mixed supervised contrastive learning approach for class imbalanced multivariate time series classification.","authors":"Lipeng Qian, Qiong Zuo, Dahu Li, Hong Zhu","doi":"10.1016/j.neunet.2025.107131","DOIUrl":"https://doi.org/10.1016/j.neunet.2025.107131","url":null,"abstract":"<p><p>In the Imbalanced Multivariate Time Series Classification (ImMTSC) task, minority-class instances typically correspond to critical events, such as system faults in power grids or abnormal health occurrences in medical monitoring. Despite being rare and random, these events are highly significant. The dynamic spatial-temporal relationships between minority-class instances and other instances make them more prone to interference from neighboring instances during classification. Increasing the number of minority-class samples during training often results in overfitting to a single pattern of the minority class. Contrastive learning ensures that majority-class instances learn similar features in the representation space. However, it does not effectively aggregate features from neighboring minority-class instances, hindering its ability to properly represent these instances in the ImMTS dataset. Therefor, we propose a dynamic graph-based mixed supervised contrastive learning method (DGMSCL) that effectively fits minority-class features without increasing their number, while also separating them from other instances in the representation space. First, it reconstructs the input sequence into dynamic graphs and employs a hierarchical attention graph neural network (HAGNN) to generate a discriminative embedding representation between instances. Based on this, we introduce a novel mixed contrast loss, which includes weight-augmented inter-graph supervised contrast (WAIGC) and context-based minority class-aware contrast (MCAC). It adjusts the sample weights based on their quantity and intrinsic characteristics, placing greater emphasis on minority-class loss to produce more effective gradient gains during training. Additionally, it separates minority-class instances from adjacent transitional instances in the representation space, enhancing their representational capacity. Extensive experiments across various scenarios and datasets with differing degrees of imbalance demonstrate that DGMSCL consistently outperforms existing baseline models. Specifically, DGMSCL achieves higher overall classification accuracy, as evidenced by significantly improved average F1-score, G-mean, and kappa coefficient across multiple datasets. Moreover, classification results on a real-world power data show that DGMSCL generalizes well to real-world application.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"185 ","pages":"107131"},"PeriodicalIF":6.0,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143043035","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
When low-light meets flares: Towards Synchronous Flare Removal and Brightness Enhancement.
IF 6.0 | Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-17 | DOI: 10.1016/j.neunet.2025.107149
Jiahuan Ren, Zhao Zhang, Suiyi Zhao, Jicong Fan, Zhongqiu Zhao, Yang Zhao, Richang Hong, Meng Wang

Low-light image enhancement (LLIE) aims to improve the visibility and illumination of low-light images. However, real-world low-light images are usually accompanied by flares caused by light sources, which make it difficult to discern the content of dark images. Current LLIE and nighttime flare removal methods face challenges in handling these flared low-light images effectively: (1) flares in dark images disturb the content of images and cause uneven lighting, potentially resulting in overexposure or chromatic aberration; (2) the slight noise in low-light images may be amplified during enhancement, leading to speckle noise and blur in the enhanced images; (3) nighttime flare removal methods usually ignore the detailed information in dark regions, which may cause inaccurate representation. To tackle this challenging yet meaningful problem, we propose a novel image enhancement task called Flared Low-Light Image Enhancement (FLLIE). We first synthesize several flared low-light datasets as the training/inference data, based on which we develop a novel Fourier transform-based deep FLLIE network termed Synchronous Flare Removal and Brightness Enhancement (SFRBE). Specifically, a Residual Directional Fourier Block (RDFB) is introduced that learns in the frequency domain to extract accurate global information and capture detailed features from multiple directions. Extensive experiments on three flared low-light datasets and some real flared low-light images demonstrate the effectiveness of SFRBE for FLLIE.
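A toy example of why operating in the Fourier domain captures global information: scaling the Fourier amplitude of an image changes its global brightness and contrast while the preserved phase keeps the spatial structure. The snippet below is a generic frequency-domain manipulation, not the RDFB block or the SFRBE network.

```python
import numpy as np

def fourier_enhance(image, gain=2.0):
    # Amplify the Fourier amplitude (global brightness/contrast) while
    # preserving the phase, which carries most of the spatial structure.
    freq = np.fft.fft2(image)
    amplitude, phase = np.abs(freq), np.angle(freq)
    enhanced = np.fft.ifft2(gain * amplitude * np.exp(1j * phase)).real
    return np.clip(enhanced, 0.0, 1.0)

dark = np.random.default_rng(0).random((32, 32)) * 0.2   # mock low-light image
bright = fourier_enhance(dark, gain=2.5)
print(dark.mean(), bright.mean())
```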

{"title":"When low-light meets flares: Towards Synchronous Flare Removal and Brightness Enhancement.","authors":"Jiahuan Ren, Zhao Zhang, Suiyi Zhao, Jicong Fan, Zhongqiu Zhao, Yang Zhao, Richang Hong, Meng Wang","doi":"10.1016/j.neunet.2025.107149","DOIUrl":"https://doi.org/10.1016/j.neunet.2025.107149","url":null,"abstract":"<p><p>Low-light image enhancement (LLIE) aims to improve the visibility and illumination of low-light images. However, real-world low-light images are usually accompanied with flares caused by light sources, which make it difficult to discern the content of dark images. In this case, current LLIE and nighttime flare removal methods face challenges in handling these flared low-light images effectively: (1) Flares in dark images will disturb the content of images and cause uneven lighting, potentially resulting in overexposure or chromatic aberration; (2) the slight noise in low-light images may be amplified during the process of enhancement, leading to speckle noise and blur in the enhanced images; (3) the nighttime flare removal methods usually ignore the detailed information in dark regions, which may cause inaccurate representation. To tackle the above challenges yet meaningful problems well, we propose a novel image enhancement task called Flared Low-Light Image Enhancement (FLLIE). We first synthesize several flared low-light datasets as the training/inference data, based on which we develop a novel Fourier transform-based deep FLLIE network termed Synchronous Flare Removal and Brightness Enhancement (SFRBE). Specifically, a Residual Directional Fourier Block (RDFB) is introduced that learns in the frequency domain to extract accurate global information and capture detailed features from multiple directions. Extensive experiments on three flared low-light datasets and some real flared low-light images demonstrate the effectiveness of SFRBE for FLLIE.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"185 ","pages":"107149"},"PeriodicalIF":6.0,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143042963","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Out-of-Distribution Detection via outlier exposure in federated learning.
IF 6.0 | Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-17 | DOI: 10.1016/j.neunet.2025.107141
Gu-Bon Jeong, Dong-Wan Choi

Among various out-of-distribution (OOD) detection methods in neural networks, outlier exposure (OE) using auxiliary data has been shown to achieve practical performance. However, existing OE methods are typically assumed to run in a centralized manner, and thus are not feasible for a standard federated learning (FL) setting where each client has low computing power and cannot collect a variety of auxiliary samples. To address this issue, we propose a practical yet realistic OE scenario in FL where only the central server has a large amount of outlier data and a relatively small amount of in-distribution (ID) data is given to each client. For this scenario, we introduce an effective OE-based OOD detection method, called internal separation & backstage collaboration, which makes the best use of many auxiliary outlier samples without sacrificing the ultimate goal of FL, that is, privacy preservation as well as collaborative training performance. The most challenging part is how to achieve, in our scenario, the same effect as joint centralized training with outliers and ID samples. Our main strategy (internal separation) is to jointly train the feature vectors of an internal layer with outliers in the back layers of the global model, while ensuring privacy preservation. We also suggest a collaborative approach (backstage collaboration) where multiple back layers are trained together to detect OOD samples. Our extensive experiments demonstrate that our method shows remarkable detection performance compared to baseline approaches in the proposed OE scenario.
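For reference, the commonly used centralized outlier-exposure objective that this work adapts to FL combines cross-entropy on ID samples with a term pushing outlier predictions towards the uniform distribution. The sketch below uses made-up logits and is not the paper's internal-separation training scheme.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def oe_objective(id_logits, id_labels, ood_logits, lam=0.5):
    # Cross-entropy on in-distribution samples plus a cross-entropy-to-uniform
    # term on exposed outliers (the standard OE formulation).
    p_id = softmax(id_logits)
    ce = -np.log(p_id[np.arange(len(id_labels)), id_labels] + 1e-12).mean()
    p_ood = softmax(ood_logits)
    uniform_ce = -np.log(p_ood + 1e-12).mean()     # mean over classes = CE to uniform
    return ce + lam * uniform_ce

rng = np.random.default_rng(0)
print(oe_objective(rng.normal(size=(4, 10)), np.array([1, 3, 5, 7]),
                   rng.normal(size=(6, 10))))
```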

Citations: 0
Reducing bias in source-free unsupervised domain adaptation for regression.
IF 6.0 | Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-17 | DOI: 10.1016/j.neunet.2025.107161
Qianshan Zhan, Xiao-Jun Zeng, Qian Wang

Due to data privacy and storage concerns, Source-Free Unsupervised Domain Adaptation (SFUDA) focuses on improving an unlabelled target domain by leveraging a pre-trained source model without access to source data. While existing studies attempt to train target models by mitigating biases induced by noisy pseudo labels, they often lack theoretical guarantees for fully reducing biases and have predominantly addressed classification tasks rather than regression ones. To address these gaps, our analysis delves into the generalisation error bound of the target model, aiming to understand the intrinsic limitations of pseudo-label-based SFUDA methods. Theoretical results reveal that biases influencing generalisation error extend beyond the commonly highlighted label inconsistency bias, which denotes the mismatch between pseudo labels and ground truths, and the feature-label mapping bias, which represents the difference between the proxy target regressor and the real target regressor. Equally significant is the feature misalignment bias, indicating the misalignment between the estimated and real target feature distributions. This factor is frequently neglected or not explicitly addressed in current studies. Additionally, the label inconsistency bias can be unbounded in regression due to the continuous label space, further complicating SFUDA for regression tasks. Guided by these theoretical insights, we propose a Bias-Reduced Regression (BRR) method for SFUDA in regression. This method incorporates Feature Distribution Alignment (FDA) to reduce the feature misalignment bias, Hybrid Reliability Evaluation (HRE) to reduce the feature-label mapping bias and pseudo label updating to mitigate the label inconsistency bias. Experiments demonstrate the superior performance of the proposed BRR, and the effectiveness of FDA and HRE in reducing biases for regression tasks in SFUDA.
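One common way to reduce feature misalignment of the kind FDA targets is to match first- and second-order statistics of two feature sets (a CORAL-style penalty). The sketch below illustrates that generic idea with random placeholder features; it is not the paper's FDA component, which is more elaborate.

```python
import numpy as np

def stat_alignment_loss(feats_a, feats_b):
    # Penalize differences between the means and covariances of two feature
    # distributions; minimizing this pulls the distributions together.
    mean_diff = np.sum((feats_a.mean(axis=0) - feats_b.mean(axis=0)) ** 2)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    cov_diff = np.sum((cov_a - cov_b) ** 2) / (4.0 * feats_a.shape[1] ** 2)
    return mean_diff + cov_diff

rng = np.random.default_rng(0)
estimated = rng.normal(size=(64, 8))            # e.g., features of pseudo-labeled samples
observed = rng.normal(loc=0.5, size=(64, 8))    # e.g., features of all target samples
print(stat_alignment_loss(estimated, observed))
```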

{"title":"Reducing bias in source-free unsupervised domain adaptation for regression.","authors":"Qianshan Zhan, Xiao-Jun Zeng, Qian Wang","doi":"10.1016/j.neunet.2025.107161","DOIUrl":"https://doi.org/10.1016/j.neunet.2025.107161","url":null,"abstract":"<p><p>Due to data privacy and storage concerns, Source-Free Unsupervised Domain Adaptation (SFUDA) focuses on improving an unlabelled target domain by leveraging a pre-trained source model without access to source data. While existing studies attempt to train target models by mitigating biases induced by noisy pseudo labels, they often lack theoretical guarantees for fully reducing biases and have predominantly addressed classification tasks rather than regression ones. To address these gaps, our analysis delves into the generalisation error bound of the target model, aiming to understand the intrinsic limitations of pseudo-label-based SFUDA methods. Theoretical results reveal that biases influencing generalisation error extend beyond the commonly highlighted label inconsistency bias, which denotes the mismatch between pseudo labels and ground truths, and the feature-label mapping bias, which represents the difference between the proxy target regressor and the real target regressor. Equally significant is the feature misalignment bias, indicating the misalignment between the estimated and real target feature distributions. This factor is frequently neglected or not explicitly addressed in current studies. Additionally, the label inconsistency bias can be unbounded in regression due to the continuous label space, further complicating SFUDA for regression tasks. Guided by these theoretical insights, we propose a Bias-Reduced Regression (BRR) method for SFUDA in regression. This method incorporates Feature Distribution Alignment (FDA) to reduce the feature misalignment bias, Hybrid Reliability Evaluation (HRE) to reduce the feature-label mapping bias and pseudo label updating to mitigate the label inconsistency bias. Experiments demonstrate the superior performance of the proposed BRR, and the effectiveness of FDA and HRE in reducing biases for regression tasks in SFUDA.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"185 ","pages":"107161"},"PeriodicalIF":6.0,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143043005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
CPJN: News recommendation with a content and popularity joint network.
IF 6.0 | Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-17 | DOI: 10.1016/j.neunet.2025.107177
Zixuan Chen, Songqiao Han, Hailiang Huang, You Wu

Users may click on a news item because they are interested in its content or because the news contains important information and is very popular. Modeling these two aspects is crucial for accurate news recommendation. Most existing studies focus on capturing users' preferences towards news content; they are therefore limited in investigating users' preferences towards news popularity in depth and in capturing content and popularity preferences independently. In this article, we further improve recommendation performance by proposing a news recommendation model with a content and popularity joint network (CPJN). The CPJN contains a content-based network, a popularity-based network, and an adaptive combination network. The content-based network generates a user's preference feature towards news content by eliminating popularity bias from the important information extracted from user side information (such as city and age) and uses this debiased information to enhance the user's preference representation towards news content. The popularity-based network generates a user's preference feature towards news popularity by eliminating content bias, aided by news side information (such as category and author). Furthermore, since users exhibit differing degrees of sensitivity towards news popularity, we propose an adaptive combination network to integrate these two preferences for recommendation. Extensive experiments on two real-world datasets demonstrate the effectiveness of CPJN. Compared to the state-of-the-art baseline, CPJN achieves average improvements of 1.493% in accuracy and 1.502% in recall.
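Because users differ in how sensitive they are to popularity, an adaptive combination can be as simple as a learned per-user gate that blends the two preference scores. The sketch below is a hypothetical gating function with invented parameters, not the CPJN architecture.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def adaptive_combine(content_score, popularity_score, user_feat, w, b):
    # A per-user scalar gate decides how sensitive this user is to popularity,
    # then blends the content-based and popularity-based preference scores.
    gate = sigmoid(user_feat @ w + b)          # in (0, 1)
    return gate * content_score + (1.0 - gate) * popularity_score

rng = np.random.default_rng(0)
user_feat = rng.normal(size=8)                 # hypothetical user representation
w, b = rng.normal(size=8), 0.0                 # hypothetical gate parameters
print(adaptive_combine(0.9, 0.3, user_feat, w, b))
```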

{"title":"CPJN: News recommendation with a content and popularity joint network.","authors":"Zixuan Chen, Songqiao Han, Hailiang Huang, You Wu","doi":"10.1016/j.neunet.2025.107177","DOIUrl":"https://doi.org/10.1016/j.neunet.2025.107177","url":null,"abstract":"<p><p>Users may click on a news because they are interested in its content or because the news contains important information and is very popular. Modeling these two aspects is crucial for accurate news recommendation. Most existing studies focused on capturing users' preferences towards news content, and thus they are limited in investigating in depth users' preferences towards news popularity and independently capturing user content and popularity preferences. In this article, we further improve recommendation performance by proposing a news recommendation with content and popularity joint network (CPJN) model. The CPJN contains a content-based network, a popularity-based network, and an adaptive combination network. The content-based network generates a users' preference feature towards news content by eliminating popularity bias in important information extracted from user side information (such as city and age) and uses the information with the eliminated popularity bias to enhance users' preference representation towards news content. The popularity-based network generates a user preference feature towards news popularity by eliminating content bias that is enhanced through news side information (such as category and author). Furthermore, since users exhibit differing degrees of sensitivity towards news popularity, we propose an adaptive combination network to integrate these two preferences for recommendation. Extensive experiments on two real-world datasets demonstrate the effectiveness of CPJN. Compared to the state-of-the-art baseline, CPJN achieved average improvements of 1.493 % in accuracy rate and 1.502 % in recall rate.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"185 ","pages":"107177"},"PeriodicalIF":6.0,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143043027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
On latent dynamics learning in nonlinear reduced order modeling.
IF 6.0 | Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-17 | DOI: 10.1016/j.neunet.2025.107146
Nicola Farenga, Stefania Fresca, Simone Brivio, Andrea Manzoni

In this work, we present the novel mathematical framework of latent dynamics models (LDMs) for reduced order modeling of parameterized nonlinear time-dependent PDEs. Our framework casts this latter task as a nonlinear dimensionality reduction problem, while constraining the latent state to evolve according to an unknown dynamical system. A time-continuous setting is employed to derive error and stability estimates for the LDM approximation of the full order model (FOM) solution. We analyze the impact of using an explicit Runge-Kutta scheme in the time-discrete setting, resulting in the ΔLDM formulation, and further explore the learnable setting, ΔLDMθ, where deep neural networks approximate the discrete LDM components while providing a bounded approximation error with respect to the FOM. Moreover, we extend the concept of a parameterized Neural ODE - a possible way to build data-driven dynamical systems with varying input parameters - to a convolutional architecture, in which the input-parameter information is injected by means of an affine modulation mechanism, and we design a convolutional autoencoder neural network able to retain spatial coherence, thus enhancing interpretability at the latent level. Numerical experiments, including the Burgers' and the advection-diffusion-reaction equations, demonstrate the framework's ability to obtain a time-continuous approximation of the FOM solution, making it possible to query the LDM approximation at any given time instance while retaining a prescribed level of accuracy. Our findings highlight the remarkable potential of the proposed LDMs, representing a mathematically rigorous framework to enhance the accuracy and approximation capabilities of reduced order modeling for time-dependent parameterized PDEs.
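A minimal sketch of the time-discrete setting: integrate a parameterized latent ODE dz/dt = f(z; μ) with an explicit Runge-Kutta scheme (RK4 here). The linear right-hand side is a stand-in for the neural network used in the learnable ΔLDMθ setting, and the encoder/decoder mapping to the FOM solution is omitted.

```python
import numpy as np

def latent_rhs(z, mu):
    # Hypothetical parameterized latent dynamics dz/dt = f(z; mu); in the
    # learnable setting this right-hand side would be a neural network.
    A = np.array([[0.0, 1.0], [-mu, -0.1]])
    return A @ z

def rk4_step(f, z, mu, dt):
    # One explicit fourth-order Runge-Kutta step in the latent space.
    k1 = f(z, mu)
    k2 = f(z + 0.5 * dt * k1, mu)
    k3 = f(z + 0.5 * dt * k2, mu)
    k4 = f(z + dt * k3, mu)
    return z + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

z, mu, dt = np.array([1.0, 0.0]), 2.0, 0.01
trajectory = [z]
for _ in range(500):                      # roll the latent state forward in time
    z = rk4_step(latent_rhs, z, mu, dt)
    trajectory.append(z)
print(np.stack(trajectory).shape)         # (501, 2) latent trajectory
```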

{"title":"On latent dynamics learning in nonlinear reduced order modeling.","authors":"Nicola Farenga, Stefania Fresca, Simone Brivio, Andrea Manzoni","doi":"10.1016/j.neunet.2025.107146","DOIUrl":"https://doi.org/10.1016/j.neunet.2025.107146","url":null,"abstract":"<p><p>In this work, we present the novel mathematical framework of latent dynamics models (LDMs) for reduced order modeling of parameterized nonlinear time-dependent PDEs. Our framework casts this latter task as a nonlinear dimensionality reduction problem, while constraining the latent state to evolve accordingly to an unknown dynamical system. A time-continuous setting is employed to derive error and stability estimates for the LDM approximation of the full order model (FOM) solution. We analyze the impact of using an explicit Runge-Kutta scheme in the time-discrete setting, resulting in the ΔLDM formulation, and further explore the learnable setting, ΔLDM<sub>θ</sub>, where deep neural networks approximate the discrete LDM components, while providing a bounded approximation error with respect to the FOM. Moreover, we extend the concept of parameterized Neural ODE - a possible way to build data-driven dynamical systems with varying input parameters - to be a convolutional architecture, where the input parameters information is injected by means of an affine modulation mechanism, while designing a convolutional autoencoder neural network able to retain spatial-coherence, thus enhancing interpretability at the latent level. Numerical experiments, including the Burgers' and the advection-diffusion-reaction equations, demonstrate the framework's ability to obtain a time-continuous approximation of the FOM solution, thus being able to query the LDM approximation at any given time instance while retaining a prescribed level of accuracy. Our findings highlight the remarkable potential of the proposed LDMs, representing a mathematically rigorous framework to enhance the accuracy and approximation capabilities of reduced order modeling for time-dependent parameterized PDEs.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"185 ","pages":"107146"},"PeriodicalIF":6.0,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143061436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A new pipeline with ultimate search efficiency for neural architecture search.
IF 6.0 | Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-17 | DOI: 10.1016/j.neunet.2025.107163
Wenbo Liu, Xiaoyun Qiao, Chunyu Zhao, Tao Deng, Fei Yan

We present a novel neural architecture search pipeline designed to enhance search efficiency through optimized data and algorithms. Leveraging dataset distillation techniques, our pipeline condenses large-scale target datasets into more streamlined proxy datasets, effectively reducing the computational overhead associated with identifying optimal neural architectures. To accommodate diverse approaches to synthetic dataset utilization, our pipeline comprises two distinct schemes. Scheme 1 involves constructing rich data from various Bases |B|, while Scheme 2 focuses on establishing high-quality relationship mappings within the data. Models generated through Scheme 1 exhibit outstanding scalability, demonstrating superior performance when transferred to larger, more complex tasks. Despite utilizing less data, Scheme 2 maintains performance levels without degradation on the source dataset. Furthermore, our research extends to the inherent challenges present in DARTS-derived algorithms, particularly in the selection of candidate operations based on architectural parameters. We identify architectural-parameter disparities across different edges, highlighting the occurrence of "Selection Errors" during the model generation process, and propose an enhanced search algorithm. Our proposed algorithm comprises three components (attention, regularization, and normalization) that aid in the rapid identification of high-quality models using data generated from proxy datasets. Experimental results demonstrate a significant reduction in search time, with high-quality models generated in as little as two minutes using our proposed pipeline. Through comprehensive experimentation, we meticulously validate the efficacy of both schemes and algorithms.
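To see the kind of architectural-parameter disparity behind the reported "Selection Errors", compare softmax confidences of candidate operations on two edges whose parameters live at different scales; a per-edge standardization, loosely in the spirit of the proposed normalization component, puts them on a comparable footing. The parameter values and operation names below are invented for illustration.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

ops = ["skip_connect", "sep_conv_3x3", "dil_conv_3x3", "max_pool_3x3"]

# Hypothetical architectural parameters for two edges of a DARTS-style cell.
# Their magnitudes differ a lot, so the derived confidences are not directly
# comparable across edges (the kind of disparity behind "Selection Errors").
alpha = {
    "edge_0": np.array([0.10, 0.30, 0.20, 0.05]),
    "edge_1": np.array([2.00, 6.00, 4.00, 1.00]),
}

for name, a in alpha.items():
    raw = softmax(a)                                   # confidences at the raw scale
    std = softmax((a - a.mean()) / (a.std() + 1e-8))   # per-edge standardization
    print(name, "raw:", raw.round(2), "normalized:", std.round(2),
          "selected:", ops[int(np.argmax(std))])
```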

{"title":"A new pipeline with ultimate search efficiency for neural architecture search.","authors":"Wenbo Liu, Xiaoyun Qiao, Chunyu Zhao, Tao Deng, Fei Yan","doi":"10.1016/j.neunet.2025.107163","DOIUrl":"https://doi.org/10.1016/j.neunet.2025.107163","url":null,"abstract":"<p><p>We present a novel neural architecture search pipeline designed to enhance search efficiency through optimized data and algorithms. Leveraging dataset distillation techniques, our pipeline condenses large-scale target datasets into more streamlined proxy datasets, effectively reducing the computational overhead associated with identifying optimal neural architectures. To accommodate diverse approaches to synthetic dataset utilization, our pipeline comprises two distinct schemes. Scheme 1 involves constructing rich data from various Bases |B|, while Scheme 2 focuses on establishing high-quality relationship mappings within the data. Models generated through Scheme 1 exhibit outstanding scalability, demonstrating superior performance when transferred to larger, more complex tasks. Despite utilizing fewer data, Scheme 2 maintains performance levels without degradation on the source dataset. Furthermore, our research extends to the inherent challenges present in DARTS-derived algorithms, particularly in the selection of candidate operations based on architectural parameters. We identify architectural parameter disparities across different edges, highlighting the occurrence of \"Selection Errors\" during the model generation process, and propose an enhanced search algorithm. Our proposed algorithm comprises three components-attention, regularization, and normalization-aiding in the rapid identification of high-quality models using data generated from proxy datasets. Experimental results demonstrate a significant reduction in search time, with high-quality models generated in as little as two minutes using our proposed pipeline. Through comprehensive experimentation, we meticulously validate the efficacy of both schemes and algorithms.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"185 ","pages":"107163"},"PeriodicalIF":6.0,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143076079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0