Pub Date: 2026-01-27 | DOI: 10.1016/j.neunet.2026.108601
Dingyao Chen, Xiao Teng, Xingyu Shen, Xun Yang, Long Lan
Knowledge distillation (KD) is an effective strategy to transfer learned representations from a pre-trained teacher model to a smaller student model. Current methods for knowledge transfer from convolutional neural networks (CNNs) to vision transformers (ViTs) mainly align output logits. However, such approaches often overlook the rich semantic structures encoded in CNN features, thereby restricting ViTs from effectively inheriting the inductive biases inherent in convolutional architectures. To this end, this paper proposes a Feature-based CNN-to-ViT Structural Knowledge Distillation framework, dubbed FSKD, which combines the semantic structural knowledge embedded in CNN (teacher) features with the strength of ViT (student) in capturing long-range dependencies. Specifically, this framework includes a feature alignment module to bridge the representational gap between CNN and ViT features, and it incorporates a global feature alignment loss. Additionally, we develop patch-wise and attention-wise distillation losses to transfer inter-patch similarity and attention distribution, facilitating semantic structural knowledge transfer from CNNs to ViTs. Experimental results demonstrate that the proposed method considerably enhances ViT performance in visual recognition tasks, particularly in scenarios with limited data. Code is available on GitHub.
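The patch-wise distillation idea described in the abstract can be sketched minimally as follows. This is an illustrative NumPy sketch assuming an MSE penalty between cosine inter-patch similarity matrices; the function names and the exact loss form are assumptions, and the paper's formulation may differ:

```python
import numpy as np

def inter_patch_similarity(feats):
    """Cosine similarity between every pair of patch features.

    feats: (num_patches, dim) array of patch embeddings.
    Returns a (num_patches, num_patches) similarity matrix.
    """
    normed = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    return normed @ normed.T

def patchwise_distill_loss(teacher_feats, student_feats):
    """MSE between teacher and student inter-patch similarity structures.

    Transfers the *relational* structure of CNN features rather than the
    raw feature values, so the student only needs to match how patches
    relate to one another.
    """
    s_teacher = inter_patch_similarity(teacher_feats)
    s_student = inter_patch_similarity(student_feats)
    return float(np.mean((s_teacher - s_student) ** 2))
```

Because the similarity matrices are built from normalized features, the loss is invariant to the overall scale of either feature map, which is one reason similarity-based transfer can bridge heterogeneous CNN/ViT feature spaces.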
Title: Distilling structural knowledge from CNNs to vision transformers for data-efficient visual recognition (Neural Networks, vol. 199, Article 108601)
Pub Date: 2026-01-25 | DOI: 10.1016/j.neunet.2026.108642
Yao Liang, Yuwei Wang, Yang Li, Yi Zeng
Parameter-efficient fine-tuning (PEFT) reduces the compute and memory demands of adapting large language models, yet standard low-rank adapters (e.g., LoRA) can lag full fine-tuning in performance and stability because they restrict updates to a fixed rank-r subspace. We propose Matrix-Transformation based Low-Rank Adaptation (MTLoRA), a brain-inspired extension that inserts a learnable r × r transformation T into the low-rank update (ΔW = BTA). By endowing the subspace with data-adapted geometry (e.g., rotations, scalings, and shears), MTLoRA reparameterizes the rank-r hypothesis class, improving its conditioning and inductive bias at negligible O(r^2) overhead, and recovers LoRA when T = I_r. We instantiate four structures for T: SHIM (T = C), ICFM (T = CC⊤), CTCM (T = CD), and DTSM (T = C + D), providing complementary inductive biases (change of basis, PSD metric, staged mixing, dual superposition). An optimization analysis shows that T acts as a learned preconditioner within the subspace, yielding spectral-norm step-size bounds and operator-norm variance contraction that stabilize training. Empirically, MTLoRA delivers consistent gains while preserving PEFT efficiency: on GLUE (General Language Understanding Evaluation) with DeBERTaV3-base, MTLoRA improves the average over LoRA by +2.0 points (86.9 → 88.9) and matches AdaLoRA (88.9) without any pruning schedule; on natural language generation with GPT-2 Medium, it raises BLEU on DART by +0.95 and on WebNLG by +0.56; and in multimodal instruction tuning with LLaVA-1.5-7B, DTSM attains the best average (69.91) with ~4.7% trainable parameters, outperforming full fine-tuning and strong PEFT baselines. These results indicate that learning geometry inside the low-rank subspace improves both effectiveness and stability, making MTLoRA a practical, plug-compatible alternative to LoRA for large-model fine-tuning.
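The core reparameterization ΔW = BTA can be sketched in a few lines. This is a minimal NumPy illustration of the update's shape and of how plain LoRA is recovered at T = I_r; the function name is an assumption, not the paper's API:

```python
import numpy as np

def mtlora_delta(B, A, T=None):
    """Low-rank weight update ΔW = B T A with a learnable r×r matrix T.

    B: (d_out, r), A: (r, d_in). With T = I_r this reduces exactly to the
    plain LoRA update ΔW = B A; a general T adds only r^2 parameters.
    """
    r = B.shape[1]
    if T is None:
        T = np.eye(r)  # identity recovers plain LoRA
    return B @ T @ A
```

The design point is that T lives entirely inside the rank-r subspace, so the overhead is O(r^2) regardless of the (much larger) dimensions d_out and d_in.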
Title: Matrix-Transformation based Low-Rank Adaptation (MTLoRA): A brain-inspired method for parameter-efficient fine-tuning (Neural Networks, vol. 199, Article 108642)
Pub Date: 2026-01-24 | DOI: 10.1016/j.neunet.2026.108641
Georg Kohl, Li-Wei Chen, Nils Thuerey
Simulating turbulent flows is crucial for a wide range of applications, and machine learning-based solvers are gaining increasing relevance. However, achieving temporal stability when generalizing to longer rollout horizons remains a persistent challenge for learned PDE solvers. In this work, we analyze whether fully data-driven fluid solvers that use an autoregressive rollout based on conditional diffusion models are a viable option to address this challenge. We investigate accuracy, posterior sampling, spectral behavior, and temporal stability, while requiring that methods generalize to flow parameters beyond the training regime. To benchmark the performance of various flow prediction approaches quantitatively and qualitatively, three challenging 2D scenarios are employed: incompressible flow, transonic flow, and isotropic turbulence. We find that even simple diffusion-based approaches can outperform multiple established flow prediction methods in terms of accuracy and temporal stability, while being on par with state-of-the-art stabilization techniques such as unrolling at training time. Such traditional architectures are superior in terms of inference speed; however, the probabilistic nature of diffusion approaches allows for inferring multiple predictions that align with the statistics of the underlying physics. Overall, our benchmark contains three carefully chosen data sets that are suitable for probabilistic evaluation alongside various established flow prediction architectures.
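The autoregressive rollout at the heart of this setup can be sketched generically: a learned single-step predictor is applied repeatedly, with each output fed back as the next condition. The toy step function below is a stand-in assumption for one conditional diffusion sampling pass, not the benchmarked models:

```python
import numpy as np

def autoregressive_rollout(step_fn, state0, n_steps, rng):
    """Roll out a learned single-step simulator autoregressively.

    step_fn(state, rng) -> next_state plays the role of one conditional
    sampling pass; each prediction becomes the next input, so both errors
    and (for diffusion models) sampled variability accumulate over time.
    """
    trajectory = [state0]
    state = state0
    for _ in range(n_steps):
        state = step_fn(state, rng)
        trajectory.append(state)
    return np.stack(trajectory)

def toy_step(state, rng):
    """Hypothetical stand-in: damped dynamics plus noise."""
    return 0.9 * state + 0.01 * rng.normal(size=state.shape)
```

Running the rollout twice with different random streams yields different but statistically similar trajectories, which is exactly the posterior-sampling property the benchmark evaluates.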
Title: Benchmarking autoregressive conditional diffusion models for turbulent flow simulation (Neural Networks, vol. 199, Article 108641)
Pub Date: 2026-01-22 | DOI: 10.1016/j.neunet.2026.108615
Xinxin Li, Juan Zhang, Da Li, Xingyu Liu, Jin Xu, Junping Yin
Automatically discovering mathematical expressions that precisely depict natural phenomena is a challenging problem, and Symbolic Regression (SR) is one of the most widely used techniques for it. Mainstream SR algorithms search for an optimal symbolic tree, but the increasing complexity of the tree structure often limits their performance. Inspired by neural networks, symbolic networks have emerged as a promising new paradigm. However, existing symbolic networks still face certain challenges: binary nonlinear operators { × , ÷} cannot be naturally extended to multivariate forms, and training with a fixed architecture often leads to higher complexity and overfitting. In this work, we propose a Unified Symbolic Network (UniSymNet) that unifies nonlinear binary operators into nested unary operators, thereby transforming them into multivariate operators. The capability of the proposed UniSymNet is established by rigorous theoretical proof, resulting in lower complexity and stronger expressivity. Unlike conventional neural network training, we design a bi-level optimization framework: the outer level pre-trains a Transformer with a sparse label encoding scheme to guide UniSymNet structure selection, while the inner level employs objective-specific strategies to optimize network parameters. This allows for flexible adaptation of UniSymNet structures to different data, leading to reduced expression complexity. UniSymNet is evaluated on low-dimensional Standard Benchmarks and high-dimensional SRBench, and shows an excellent symbolic solution rate, high fitting accuracy, and relatively low expression complexity.
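One concrete way to unify { ×, ÷ } into nested unary operators is the classical log-exp identity, which is multivariate by construction. This sketch illustrates that idea for positive inputs; it is an assumed illustration of the principle, and the paper's actual operator construction may differ:

```python
import numpy as np

def product_via_unary(xs):
    """x1 * x2 * ... * xn = exp(log x1 + ... + log xn), positive inputs.

    The binary operator × becomes a sum wrapped in unary exp/log nodes,
    so a single network node can multiply any number of inputs.
    """
    xs = np.asarray(xs, dtype=float)
    return float(np.exp(np.sum(np.log(xs))))

def divide_via_unary(x, y):
    """x / y = exp(log x - log y), for positive x and y."""
    return float(np.exp(np.log(x) - np.log(y)))
```

Replacing binary × and ÷ nodes with sums of unary log terms is what lets a fixed-width layer act as a multivariate operator instead of a tree of pairwise products.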
Title: UniSymNet: A Unified Symbolic Network with Sparse Encoding and Bi-level Optimization (Neural Networks, vol. 199, Article 108615)
Pub Date: 2026-01-22 | DOI: 10.1016/j.neunet.2026.108620
Zongshen Mu, Yujie Wan, Yueting Zhuang, Jie Tan, Hong Cheng, Yueyang Wang
Financial sentiment analysis (FSA) refers to the task of classifying textual content into predefined sentiment categories to analyze its potential impact on financial market fluctuations. However, directly applying pre-trained large language models (LLMs) to FSA still poses significant challenges. Existing approaches fail to align with domain-specific objectives and struggle to adapt to customized financial data schemas. Moreover, these LLMs predict a stock's movement primarily from that stock's own information, failing to account for cross-impact among related stocks. In this paper, we propose a novel framework that synergizes an LLM with a Graph Neural Network (GNN) to model stock price dynamics, leveraging stock sentiment signals extracted from financial news. Specifically, we employ the open-source Llama-3-8B model as the backbone, then enhance its sensitivity to financial sentiment patterns through supervised fine-tuning (SFT) and direct preference optimization (DPO) techniques. Leveraging the sentiment outputs from the fine-tuned LLM, we design a GNN to enhance stock representations and model cross-asset dependencies via two types of text-attributed graphs, which dynamically encode time-varying price correlations. Experiments on the Chinese A-share market demonstrate that financial sentiment significantly influences stock price variations.
Our framework outperforms previous baselines and exhibits an average improvement of 50% in Sharpe ratio.
Title: Exploring financial sentiment analysis via fine-tuning large language model and attributed graph neural network (Neural Networks, vol. 199, Article 108620)
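The price-correlation graphs described in the abstract above can be sketched as follows: build an adjacency matrix from return correlations over a window, then run one mean-aggregation message-passing step. This is an illustrative NumPy sketch under assumed thresholding and aggregation choices, not the paper's GNN:

```python
import numpy as np

def correlation_graph(returns, threshold=0.5):
    """Link stocks whose return series are strongly correlated.

    returns: (n_days, n_stocks). Returns a binary adjacency matrix
    (no self-loops); recomputing it per window makes the graph
    time-varying, as in the abstract.
    """
    corr = np.corrcoef(returns.T)               # (n_stocks, n_stocks)
    adj = (np.abs(corr) >= threshold).astype(float)
    np.fill_diagonal(adj, 0.0)
    return adj

def propagate(features, adj):
    """One mean-aggregation message-passing step over the graph."""
    deg = np.maximum(adj.sum(axis=1, keepdims=True), 1.0)
    return (adj @ features) / deg
```

After one propagation step, each stock's representation mixes in information from correlated assets, which is the cross-impact the LLM-only baselines miss.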
Pub Date: 2026-01-22 | DOI: 10.1016/j.neunet.2026.108606
Shanzhi Gu, Zhaoyang Qu, Ruotong Geng, Mingyang Geng, Shangwen Wang, Chuanfu Xu, Haotian Wang, Zhipeng Lin, Dezun Dong
Large Language Models for Code (LLMs4Code) have achieved strong performance in code generation, but recent studies reveal that they may memorize and leak sensitive information contained in training data, posing serious privacy risks. To address this gap, this work presents the first comprehensive empirical study on applying machine unlearning to mitigate sensitive information leakage in LLMs4Code. We first construct a dedicated benchmark that includes: (i) a synthetic forget set containing diverse forms of personal information, and (ii) a retain set designed to evaluate whether code-generation capability is preserved after unlearning. Using this benchmark, we systematically assess three representative unlearning algorithms (GA, GA+GD, GA+KL) across three widely used open-source LLMs4Code models (AIXCoder-7B, CodeLlama-7B, CodeQwen-7B). Experimental results demonstrate that machine unlearning can substantially reduce direct memorization-based leakage: on average, the direct leak rate drops by more than 50% while over 91% of the original code-generation performance is retained. Moreover, by analyzing post-unlearning outputs, we uncover a consistent shift from direct to indirect leakage, revealing an underexplored vulnerability that persists even when the target data has been successfully forgotten. Our findings show that machine unlearning is a feasible and effective solution for enhancing privacy protection in LLMs4Code, while also highlighting the need for future techniques capable of mitigating both direct and indirect leakage simultaneously.
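The GA+GD family of unlearning updates can be sketched at the level of a single parameter step: ascend the loss on the forget set while descending it on the retain set. This is a schematic sketch of the update rule only (the coefficient alpha and the combination are assumed conventions), not the benchmarked training loop:

```python
import numpy as np

def unlearning_step(theta, grad_forget, grad_retain, lr=0.05, alpha=1.0):
    """One GA+GD update on parameters theta.

    - Gradient *descent* on the retain loss (-lr * grad_retain) preserves
      code-generation capability.
    - Gradient *ascent* on the forget loss (+lr * alpha * grad_forget)
      pushes the model away from memorized sensitive sequences.
    """
    return theta - lr * grad_retain + lr * alpha * grad_forget
```

Plain GA corresponds to alpha > 0 with grad_retain set to zero; GA+KL replaces the retain-loss gradient with the gradient of a KL term toward the original model's output distribution.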
Title: Mitigating sensitive information leakage in LLMs4Code through machine unlearning (Neural Networks, vol. 198, Article 108606)
Pub Date: 2026-01-21 | DOI: 10.1016/j.neunet.2026.108630
Deyu Chen, Caicai Guo, Qiyuan Li, Jinguang Gu, Meiyi Xie, Hong Zhu
Lifelong knowledge graph embedding (KGE) methods aim to learn new knowledge continuously while retaining old knowledge. This line of work has received much attention for its potential to enable knowledge retention and transfer and to reduce training costs as knowledge graphs grow in scale and flexibility. However, embedding space drift under different contexts is a crucial cause of catastrophic forgetting and inefficient learning of new facts, and existing work ignores this perspective. To address these issues, we propose a novel lifelong KGE framework that considers learning new facts and preserving old facts from a unified perspective. We propose a diffusion-based embedding method that captures the contextual variation of entity representations and obtains transferable embeddings. To handle the drift of the embedding space and balance learning efficiency, we adopt a reconstruction and generation strategy based on contrastive learning. To avoid catastrophic forgetting and maintain the stability of the embedding distribution, we propose an effective distribution regularization method. We conduct extensive experiments on seven benchmark datasets with different construction strategies and incremental speeds.
Experimental results show that our proposed framework outperforms existing lifelong KGE methods.
Title: Lifelong knowledge graph embedding via diffusion model (Neural Networks, vol. 199, Article 108630)
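A distribution regularizer of the kind described in the abstract above can be sketched by penalizing drift in the first two moments of the embedding matrix between snapshots. This moment-matching form is an assumption for illustration; the paper's regularizer may use a different divergence:

```python
import numpy as np

def distribution_reg(old_emb, new_emb):
    """Penalize drift between old and new embedding distributions.

    old_emb, new_emb: (n_entities, dim) embedding matrices from two
    learning stages. Matches means and covariances, so the new space can
    move individual entities while keeping its overall geometry stable.
    """
    mean_gap = np.sum((old_emb.mean(axis=0) - new_emb.mean(axis=0)) ** 2)
    cov_gap = np.sum((np.cov(old_emb.T) - np.cov(new_emb.T)) ** 2)
    return float(mean_gap + cov_gap)
```

Added to the task loss with a small weight, this term leaves per-entity updates free but discourages wholesale rotation or rescaling of the embedding space, the drift blamed for catastrophic forgetting.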
Pub Date: 2026-01-21 | DOI: 10.1016/j.neunet.2026.108631
Nianyi Wang, Shuai Zheng, Yu Chen, Hai Zhao, Zhou Fang
Learning-based fluid simulation has emerged as an efficient alternative to traditional Navier-Stokes solvers. However, existing neural methods that build upon Smoothed Particle Hydrodynamics (SPH) predominantly rely on local particle interactions, which induces instability in complex scenarios due to error accumulation. To address this, we introduce FluidFormer, a novel architecture that establishes a hierarchical local-global modeling paradigm. The core of our model is the Fluid Attention Block (FAB), a co-design that orchestrates continuous convolution for locality with self-attention for global correction of long-range hydrodynamic phenomena. Embedded in a dual-pipeline network, our approach seamlessly fuses inductive physical biases with structured global reasoning. Extensive experiments show that FluidFormer achieves state-of-the-art performance, with significantly improved stability and generalization in challenging fluid scenes, demonstrating its potential as a robust simulator for complex physical systems.
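The local-plus-global structure of a block like the FAB can be sketched schematically: a local neighborhood mixer (standing in for continuous convolution) combined residually with plain softmax self-attention over all particles. Every detail here (index-order neighborhoods, the residual fusion) is an assumption for illustration, not the published architecture:

```python
import numpy as np

def local_average(x, radius=1):
    """Local mixing: average each particle's feature with nearby particles
    (here simply neighbors in index order, a stand-in for a spatial
    continuous-convolution kernel)."""
    n = len(x)
    out = np.zeros_like(x)
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out[i] = x[lo:hi].mean(axis=0)
    return out

def global_attention(x):
    """Global mixing: plain softmax self-attention over all particles."""
    scores = x @ x.T / np.sqrt(x.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ x

def fluid_attention_block(x):
    """Residual fusion of a local term and a global corrective term."""
    return x + local_average(x) + global_attention(x)
```

The point of the fusion is that the attention path can inject long-range corrections that a purely local SPH-style operator cannot represent, which is the stability argument made in the abstract.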
Title: FluidFormer: Transformer with continuous convolution for particle-based fluid simulation (Neural Networks, vol. 198, Article 108631)
Pub Date : 2026-01-21DOI: 10.1016/j.neunet.2026.108618
Xiaobo Li , Xiaodi Hou , Shilong Wang , Hongfei Lin , Yijia Zhang
Drug recommendation systems have garnered considerable interest in healthcare, striving to offer precise and customized drug prescriptions that align with patients’ specific health needs. However, existing methods primarily focus on modeling temporal dependencies between visits for patients with multiple encounters, often neglecting the challenge of data sparsity in single-visit patients. To address the above limitation, we propose a novel Relation-aware Pre-trained Network with hierarchical aggregation mechanism for drug recommendation (RPNet), which employs a pre-training and fine-tuning framework to enhance drug recommendation in cold-start scenarios. Specifically, we introduce: 1) A code matching discrimination task during pre-training, designed to model the complex relationships between diagnosis and procedure entities. This task employs a mask-replace contrastive learning strategy, which pulls similar samples closer while pushing dissimilar ones apart, thereby capturing robust feature representations; 2) A hierarchical aggregation mechanism that enhances drug information integration by first selecting relevant visits based on rarity discrimination and then retrieving similar patients’ drug insights via similarity matching during fine-tuning. Extensive experiments on two real-world datasets demonstrate the superiority of the proposed RPNet, notably improving the F1 metric by 1.32% and 1.19%. The code of our model is available at https://github.com/Lxb0102/RPNet.
{"title":"Relation-aware pre-trained network with hierarchical aggregation mechanism for cold-start drug recommendation","authors":"Xiaobo Li , Xiaodi Hou , Shilong Wang , Hongfei Lin , Yijia Zhang","doi":"10.1016/j.neunet.2026.108618","DOIUrl":"10.1016/j.neunet.2026.108618","url":null,"abstract":"<div><div>Drug recommendation systems have garnered considerable interest in healthcare, striving to offer precise and customized drug prescriptions that align with patients’ specific health needs. However, existing methods primarily focus on modeling temporal dependencies between visits for patients with multiple encounters, often neglecting the challenge of data sparsity in single-visit patients. To address the above limitation, we propose a novel Relation-aware Pre-trained Network with hierarchical aggregation mechanism for drug recommendation (RPNet), which employs a pre-training and fine-tuning framework to enhance drug recommendation in cold-start scenarios. Specifically, we introduce: 1) A code matching discrimination task during pre-training, designed to model the complex relationships between diagnosis and procedure entities. This task employs a mask-replace contrastive learning strategy, which pulls similar samples closer while pushing dissimilar ones apart, thereby capturing robust feature representations; 2) A hierarchical aggregation mechanism that enhances drug information integration by first selecting relevant visits based on rarity discrimination and then retrieving similar patients’ drug insights via similarity matching during fine-tuning. Extensive experiments on two real-world datasets demonstrate the superiority of the proposed RPNet, notably improving the F1 metric by 1.32% and 1.19%. 
The code of our model is available at <span><span>https://github.com/Lxb0102/RPNet</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"198 ","pages":"Article 108618"},"PeriodicalIF":6.3,"publicationDate":"2026-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146079254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
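The mask-replace contrastive strategy described in the RPNet abstract pulls similar samples closer while pushing dissimilar ones apart; an InfoNCE-style objective is one standard way to realize such a discrimination task. The sketch below is a generic assumption of that pattern, not the paper's actual loss: `anchor` stands for a visit embedding, `positive` for its masked view, and `negatives` for replaced (mismatched) views.

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE-style contrastive loss: the positive (e.g. a masked view of
    the same record) is pulled toward the anchor, while replaced negatives
    are pushed away. Embeddings are L2-normalized, tau is a temperature."""
    def norm(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)
    a, p, n = norm(anchor), norm(positive), norm(negatives)
    pos_sim = (a * p).sum() / tau            # scalar cosine similarity
    neg_sims = n @ a / tau                   # (k,) similarities to negatives
    logits = np.concatenate([[pos_sim], neg_sims])
    logits -= logits.max()                   # numerically stable softmax
    return -np.log(np.exp(logits[0]) / np.exp(logits).sum())
```

A well-aligned positive yields a lower loss than a mismatched one, which is exactly the "pull similar, push dissimilar" behavior the abstract attributes to the code matching discrimination task.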
Pub Date : 2026-01-21DOI: 10.1016/j.neunet.2026.108628
Daixun Li , Weiying Xie , Leyuan Fang , Yunke Wang , Zirui Li , Mingxiang Cao , Jitao Ma , Yunsong Li , Chang Xu
Significant progress has been made in the application of transformer architectures for multimodal tasks. However, current methods such as the self-attention mechanism rarely consider the benefits that feature complementarity and consistency between different modalities bring to fusion, leading to obstacles such as redundant fusion or incomplete representation. Inspired by topological homology groups, we introduce MMFormer, a novel semi-supervised algorithm for high-dimensional multimodal fusion. This method is engineered to capture comprehensive representations by enhancing the interactivity between modal mappings. Specifically, we enforce representational consistency between these heterogeneous representations through a complete dictionary lookup and homology space in the encoder, and establish an exclusivity-aware mapping of the two modalities to emphasize their complementary information, serving as a powerful supplement for multimodal feature interpretation. Moreover, the model attempts to alleviate the challenge of sparse annotations in high-dimensional multimodal data by introducing a joint consistency regularization term. We formulate these objectives into a unified end-to-end optimization framework and are the first to explore and derive the application of semi-supervised vision transformers in high-dimensional multimodal data fusion. Extensive experiments across three benchmarks demonstrate the superiority of MMFormer. Specifically, the model improves overall accuracy by 3.12% on Houston2013, 1.86% on Augsburg, and 1.66% on MUUFL compared with the strongest existing methods, confirming its robustness and effectiveness under sparse annotation conditions. The code is available at https://github.com/LDXDU/MMFormer.
{"title":"MMFormer: Multi-Modality semi-Supervised vision transformer in remote sensing imagery classification","authors":"Daixun Li , Weiying Xie , Leyuan Fang , Yunke Wang , Zirui Li , Mingxiang Cao , Jitao Ma , Yunsong Li , Chang Xu","doi":"10.1016/j.neunet.2026.108628","DOIUrl":"10.1016/j.neunet.2026.108628","url":null,"abstract":"<div><div>Significant progress has been made in the application of transformer architectures for multimodal tasks. However, current methods such as the self-attention mechanism rarely consider the benefits that feature complementarity and consistency between different modalities bring to fusion, leading to obstacles such as redundant fusion or incomplete representation. Inspired by topological homology groups, we introduce MMFormer, a novel semi-supervised algorithm for high-dimensional multimodal fusion. This method is engineered to capture comprehensive representations by enhancing the interactivity between modal mappings. Specifically, we enforce representational consistency between these heterogeneous representations through a complete dictionary lookup and homology space in the encoder, and establish an exclusivity-aware mapping of the two modalities to emphasize their complementary information, serving as a powerful supplement for multimodal feature interpretation. Moreover, the model attempts to alleviate the challenge of sparse annotations in high-dimensional multimodal data by introducing a joint consistency regularization term. We formulate these objectives into a unified end-to-end optimization framework and are the first to explore and derive the application of semi-supervised vision transformers in high-dimensional multimodal data fusion. Extensive experiments across three benchmarks demonstrate the superiority of MMFormer. 
Specifically, the model improves overall accuracy by 3.12% on Houston2013, 1.86% on Augsburg, and 1.66% on MUUFL compared with the strongest existing methods, confirming its robustness and effectiveness under sparse annotation conditions. The code is available at <span><span>https://github.com/LDXDU/MMFormer</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"198 ","pages":"Article 108628"},"PeriodicalIF":6.3,"publicationDate":"2026-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146079261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
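A consistency regularizer for sparse annotations, as invoked in the MMFormer abstract, is commonly realized by penalizing disagreement between the class distributions that two modality branches predict for the same unlabeled pixels. The sketch below assumes that generic formulation (mean squared error between softmax outputs); the actual term in the paper may differ.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with max-subtraction for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def consistency_loss(logits_a, logits_b):
    """Hypothetical joint consistency regularizer: mean squared error
    between the class distributions predicted from two modality branches
    (e.g. hyperspectral vs. LiDAR) on unlabeled samples."""
    p, q = softmax(logits_a), softmax(logits_b)
    return np.mean((p - q) ** 2)
```

The term is zero when the branches agree and grows with their disagreement, so minimizing it on unlabeled data lets each modality supervise the other without extra annotations.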