Multi-label zero-shot learning (ML-ZSL) strives to recognize all objects in an image, including those absent from the training data. Recent methods incorporate an attention mechanism to locate labels in the image and generate class-specific semantic information. However, an attention mechanism built on visual features treats label embeddings equally in the prediction score, leading to severe semantic ambiguity. This study focuses on efficiently utilizing semantic information in the attention mechanism. We propose a contrastive label-based attention method (CLA) to associate each label with its most relevant image regions. Specifically, our label-based attention, guided by the latent label embedding, captures discriminative image details. To distinguish region-wise correlations, we implement a region-level contrastive loss. In addition, we utilize a global feature alignment module to identify labels carrying general information. Extensive experiments on two benchmarks, NUS-WIDE and Open Images, demonstrate that our CLA outperforms state-of-the-art methods. In particular, under the ZSL setting, our method improves mean Average Precision (mAP) by 2.0% on NUS-WIDE and 4.0% on Open Images compared with recent methods.
{"title":"Multi-Label Zero-Shot Learning Via Contrastive Label-Based Attention.","authors":"Shixuan Meng, Rongxin Jiang, Xiang Tian, Fan Zhou, Yaowu Chen, Junjie Liu, Chen Shen","doi":"10.1142/S0129065725500108","DOIUrl":"https://doi.org/10.1142/S0129065725500108","url":null,"abstract":"<p><p>Multi-label zero-shot learning (ML-ZSL) strives to recognize all objects in an image, regardless of whether they are present in the training data. Recent methods incorporate an attention mechanism to locate labels in the image and generate class-specific semantic information. However, the attention mechanism built on visual features treats label embeddings equally in the prediction score, leading to severe semantic ambiguity. This study focuses on efficiently utilizing semantic information in the attention mechanism. We propose a contrastive label-based attention method (CLA) to associate each label with the most relevant image regions. Specifically, our label-based attention, guided by the latent label embedding, captures discriminative image details. To distinguish region-wise correlations, we implement a region-level contrastive loss. In addition, we utilize a global feature alignment module to identify labels with general information. Extensive experiments on two benchmarks, NUS-WIDE and Open Images, demonstrate that our CLA outperforms the state-of-the-art methods. Especially under the ZSL setting, our method achieves 2.0% improvements in mean Average Precision (mAP) for NUS-WIDE and 4.0% for Open Images compared with recent methods.</p>","PeriodicalId":94052,"journal":{"name":"International journal of neural systems","volume":" ","pages":"2550010"},"PeriodicalIF":0.0,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143030548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Visual semantic decoding aims to extract perceived semantic information from the visual responses of the human brain and convert it into interpretable semantic labels. Although significant progress has been made in semantic decoding within individual visual cortices, studies on semantic decoding along the ventral and dorsal cortical visual pathways remain limited. This study proposed a graph neural network (GNN)-based semantic decoding model on a natural scene dataset (NSD) to investigate the decoding differences between the dorsal and ventral pathways in processing various parts of speech, including verbs, nouns, and adjectives. Our results indicate that the decoding accuracies for verbs and nouns with motion attributes were significantly higher for the dorsal pathway than for the ventral pathway. Comparative analyses further show that this superiority largely stemmed from higher-level visual cortices rather than lower-level ones. Furthermore, the two pathways appear to converge in their heightened sensitivity toward semantic content related to actions. These findings reveal distinct visual neural mechanisms through which the dorsal and ventral cortical pathways segregate and converge when processing stimuli of different semantic categories.
{"title":"Unraveling the Differential Efficiency of Dorsal and Ventral Pathways in Visual Semantic Decoding.","authors":"Wei Huang, Ying Tang, Sizhuo Wang, Jingpeng Li, Kaiwen Cheng, Hongmei Yan","doi":"10.1142/S0129065725500091","DOIUrl":"https://doi.org/10.1142/S0129065725500091","url":null,"abstract":"<p><p>Visual semantic decoding aims to extract perceived semantic information from the visual responses of the human brain and convert it into interpretable semantic labels. Although significant progress has been made in semantic decoding across individual visual cortices, studies on the semantic decoding of the ventral and dorsal cortical visual pathways remain limited. This study proposed a graph neural network (GNN)-based semantic decoding model on a natural scene dataset (NSD) to investigate the decoding differences between the dorsal and ventral pathways in process various parts of speech, including verbs, nouns, and adjectives. Our results indicate that the decoding accuracies for verbs and nouns with motion attributes were significantly higher for the dorsal pathway as compared to those for the ventral pathway. Comparative analyses reveal that the dorsal pathway significantly outperformed the ventral pathway in terms of decoding performance for verbs and nouns with motion attributes, with evidence showing that this superiority largely stemmed from higher-level visual cortices rather than lower-level ones. Furthermore, these two pathways appear to converge in their heightened sensitivity toward semantic content related to actions. These findings reveal unique visual neural mechanisms through which the dorsal and ventral cortical pathways segregate and converge when processing stimuli with different semantic categories.</p>","PeriodicalId":94052,"journal":{"name":"International journal of neural systems","volume":" ","pages":"2550009"},"PeriodicalIF":0.0,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142960753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stroke, an abrupt cerebrovascular ailment resulting in brain tissue damage, has prompted the adoption of motor imagery (MI)-based brain-computer interface (BCI) systems in stroke rehabilitation. However, analyzing electroencephalogram (EEG) signals from stroke patients poses challenges. To address the issues of low accuracy and efficiency in EEG classification, particularly for MI, this study proposes a residual graph convolutional network (M-ResGCN) framework based on the modified S-transform (MST) and introduces a self-attention mechanism into the residual graph convolutional network (ResGCN). The study uses MST to extract EEG time-frequency domain features, derives spatial EEG features by calculating the absolute Pearson correlation coefficient (aPcc) between channels, and devises a method to construct the adjacency matrix of the brain network using aPcc to measure the strength of the connection between channels. Experimental results involving 16 stroke patients and 16 healthy subjects demonstrate significant improvements in classification quality and robustness across tests and subjects. The highest classification accuracy reached 94.91%, with a Kappa coefficient of 0.8918. The average accuracy and F1 score over 10 repetitions of 10-fold cross-validation are 94.38% and 94.36%, respectively. By validating the feasibility and applicability of brain networks constructed with the aPcc for EEG signal analysis and feature encoding, it was established that the aPcc effectively reflects overall brain activity. The proposed method presents a novel approach to exploring channel relationships in MI-EEG and improving classification performance, and it holds promise for real-time applications in MI-based BCI systems.
{"title":"Enhancing Motor Imagery Classification with Residual Graph Convolutional Networks and Multi-Feature Fusion.","authors":"Fangzhou Xu, Weiyou Shi, Chengyan Lv, Yuan Sun, Shuai Guo, Chao Feng, Yang Zhang, Tzyy-Ping Jung, Jiancai Leng","doi":"10.1142/S0129065724500692","DOIUrl":"10.1142/S0129065724500692","url":null,"abstract":"<p><p>Stroke, an abrupt cerebrovascular ailment resulting in brain tissue damage, has prompted the adoption of motor imagery (MI)-based brain-computer interface (BCI) systems in stroke rehabilitation. However, analyzing electroencephalogram (EEG) signals from stroke patients poses challenges. To address the issues of low accuracy and efficiency in EEG classification, particularly involving MI, the study proposes a residual graph convolutional network (M-ResGCN) framework based on the modified <i>S</i>-transform (MST), and introduces the self-attention mechanism into residual graph convolutional network (ResGCN). This study uses MST to extract EEG time-frequency domain features, derives spatial EEG features by calculating the absolute Pearson correlation coefficient (aPcc) between channels, and devises a method to construct the adjacency matrix of the brain network using aPcc to measure the strength of the connection between channels. Experimental results involving 16 stroke patients and 16 healthy subjects demonstrate significant improvements in classification quality and robustness across tests and subjects. The highest classification accuracy reached 94.91% and a Kappa coefficient of 0.8918. The average accuracy and <i>F</i>1 scores from 10 times 10-fold cross-validation are 94.38% and 94.36%, respectively. By validating the feasibility and applicability of brain networks constructed using the aPcc in EEG signal analysis and feature encoding, it was established that the aPcc effectively reflects overall brain activity. The proposed method presents a novel approach to exploring channel relationships in MI-EEG and improving classification performance. It holds promise for real-time applications in MI-based BCI systems.</p>","PeriodicalId":94052,"journal":{"name":"International journal of neural systems","volume":" ","pages":"2450069"},"PeriodicalIF":0.0,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142670044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Conventional retinal implants involve complex surgical procedures and require invasive implantation. Temporal Interference Stimulation (TIS) has achieved noninvasive and focused stimulation of deep brain regions by delivering high-frequency currents with small frequency differences on multiple electrodes. In this study, we conducted in silico investigations to evaluate extraocular TIS's potential as a novel visual restoration approach. Different from the previously published retinal TIS model, the new model of extraocular TIS incorporated a biophysically detailed retinal ganglion cell (RGC) population, enabling a more accurate simulation of retinal outputs under electrical stimulation. Using this improved model, we made the following major discoveries: (1) the maximum value of the TIS envelope electric potential ([Formula: see text]) showed a strong correlation with TIS-induced RGC activation; (2) the preferred stimulating/return electrode (SE/RE) locations to achieve focalized TIS were predicted; (3) the performance of extraocular TIS was better than same-frequency sinusoidal stimulation (SSS) in terms of lower RGC threshold and more focused RGC activation; (4) the optimal stimulation parameters to achieve lower threshold and focused activation were identified; and (5) spatial selectivity of TIS could be improved by integrating a current steering strategy and reducing electrode size. This study provides insights into the feasibility and effectiveness of a low-invasive stimulation approach in enhancing vision restoration.
{"title":"Spatially Selective Retinal Ganglion Cell Activation Using Low Invasive Extraocular Temporal Interference Stimulation.","authors":"Xiaoyu Song, Tianruo Guo, Saidong Ma, Feng Zhou, Jiaxin Tian, Zhengyang Liu, Jiao Liu, Heng Li, Yao Chen, Xinyu Chai, Liming Li","doi":"10.1142/S0129065724500667","DOIUrl":"10.1142/S0129065724500667","url":null,"abstract":"<p><p>Conventional retinal implants involve complex surgical procedures and require invasive implantation. Temporal Interference Stimulation (TIS) has achieved noninvasive and focused stimulation of deep brain regions by delivering high-frequency currents with small frequency differences on multiple electrodes. In this study, we conducted <i>in silico</i> investigations to evaluate extraocular TIS's potential as a novel visual restoration approach. Different from the previously published retinal TIS model, the new model of extraocular TIS incorporated a biophysically detailed retinal ganglion cell (RGC) population, enabling a more accurate simulation of retinal outputs under electrical stimulation. Using this improved model, we made the following major discoveries: (1) the maximum value of TIS envelope electric potential ([Formula: see text] showed a strong correlation with TIS-induced RGC activation; (2) the preferred stimulating/return electrode (SE/RE) locations to achieve focalized TIS were predicted; (3) the performance of extraocular TIS was better than same-frequency sinusoidal stimulation (SSS) in terms of lower RGC threshold and more focused RGC activation; (4) the optimal stimulation parameters to achieve lower threshold and focused activation were identified; and (5) spatial selectivity of TIS could be improved by integrating current steering strategy and reducing electrode size. This study provides insights into the feasibility and effectiveness of a low-invasive stimulation approach in enhancing vision restoration.</p>","PeriodicalId":94052,"journal":{"name":"International journal of neural systems","volume":" ","pages":"2450066"},"PeriodicalIF":0.0,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142335215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neural Memory State Space Models for Medical Image Segmentation. International Journal of Neural Systems, 2450068. Pub Date: 2025-01-01. Epub Date: 2024-09-30. DOI: 10.1142/S0129065724500680
Zhihua Wang, Jingjun Gu, Wang Zhou, Quansong He, Tianli Zhao, Jialong Guo, Li Lu, Tao He, Jiajun Bu
With the rapid advancement of deep learning, computer-aided diagnosis and treatment have become crucial in medicine. UNet is a widely used architecture for medical image segmentation, and various methods for improving UNet have been extensively explored. One popular approach is incorporating transformers, though their quadratic computational complexity poses challenges. Recently, State-Space Models (SSMs), exemplified by Mamba, have gained significant attention as a promising alternative due to their linear computational complexity. Another approach, neural memory Ordinary Differential Equations (nmODEs), exhibits similar principles and achieves good results. In this paper, we explore the respective strengths and weaknesses of nmODEs and SSMs and propose a novel architecture, the nmSSM decoder, which combines the advantages of both approaches. This architecture possesses powerful nonlinear representation capabilities while retaining the ability to preserve input and process global information. We construct nmSSM-UNet using the nmSSM decoder and conduct comprehensive experiments on the PH2, ISIC2018, and BU-COCO datasets to validate its effectiveness in medical image segmentation. The results demonstrate the promising application value of nmSSM-UNet. Additionally, we conducted ablation experiments to verify the effectiveness of our proposed improvements on SSMs and nmODEs.
{"title":"Neural Memory State Space Models for Medical Image Segmentation.","authors":"Zhihua Wang, Jingjun Gu, Wang Zhou, Quansong He, Tianli Zhao, Jialong Guo, Li Lu, Tao He, Jiajun Bu","doi":"10.1142/S0129065724500680","DOIUrl":"10.1142/S0129065724500680","url":null,"abstract":"<p><p>With the rapid advancement of deep learning, computer-aided diagnosis and treatment have become crucial in medicine. UNet is a widely used architecture for medical image segmentation, and various methods for improving UNet have been extensively explored. One popular approach is incorporating transformers, though their quadratic computational complexity poses challenges. Recently, State-Space Models (SSMs), exemplified by Mamba, have gained significant attention as a promising alternative due to their linear computational complexity. Another approach, neural memory Ordinary Differential Equations (nmODEs), exhibits similar principles and achieves good results. In this paper, we explore the respective strengths and weaknesses of nmODEs and SSMs and propose a novel architecture, the nmSSM decoder, which combines the advantages of both approaches. This architecture possesses powerful nonlinear representation capabilities while retaining the ability to preserve input and process global information. We construct nmSSM-UNet using the nmSSM decoder and conduct comprehensive experiments on the PH2, ISIC2018, and BU-COCO datasets to validate its effectiveness in medical image segmentation. The results demonstrate the promising application value of nmSSM-UNet. Additionally, we conducted ablation experiments to verify the effectiveness of our proposed improvements on SSMs and nmODEs.</p>","PeriodicalId":94052,"journal":{"name":"International journal of neural systems","volume":" ","pages":"2450068"},"PeriodicalIF":0.0,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142335214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Decoding Continuous Tracking Eye Movements from Cortical Spiking Activity. International Journal of Neural Systems, 2450070. Pub Date: 2025-01-01. Epub Date: 2024-11-15. DOI: 10.1142/S0129065724500709
Kendra K Noneman, J Patrick Mayo
Eye movements are the primary way primates interact with the world. Understanding how the brain controls the eyes is therefore crucial for improving human health and designing visual rehabilitation devices. However, brain activity is challenging to decipher. Here, we leveraged machine learning algorithms to reconstruct tracking eye movements from high-resolution neuronal recordings. We found that continuous eye position could be decoded with high accuracy using spiking data from only a few dozen cortical neurons. We tested eight decoders and found that neural network models yielded the highest decoding accuracy. Simpler models performed well above chance with a substantial reduction in training time. We measured the impact of data quantity (e.g. number of neurons) and data format (e.g. bin width) on training time, inference time, and generalizability. Training models with more input data improved performance, as expected, but the format of the behavioral output was critical for emphasizing or omitting specific oculomotor events. Our results provide the first demonstration, to our knowledge, of continuously decoded eye movements across a large field of view. Our comprehensive investigation of predictive power and computational efficiency for common decoder architectures provides a much-needed foundation for future work on real-time gaze-tracking devices.
{"title":"Decoding Continuous Tracking Eye Movements from Cortical Spiking Activity.","authors":"Kendra K Noneman, J Patrick Mayo","doi":"10.1142/S0129065724500709","DOIUrl":"10.1142/S0129065724500709","url":null,"abstract":"<p><p>Eye movements are the primary way primates interact with the world. Understanding how the brain controls the eyes is therefore crucial for improving human health and designing visual rehabilitation devices. However, brain activity is challenging to decipher. Here, we leveraged machine learning algorithms to reconstruct tracking eye movements from high-resolution neuronal recordings. We found that continuous eye position could be decoded with high accuracy using spiking data from only a few dozen cortical neurons. We tested eight decoders and found that neural network models yielded the highest decoding accuracy. Simpler models performed well above chance with a substantial reduction in training time. We measured the impact of data quantity (e.g. number of neurons) and data format (e.g. bin width) on training time, inference time, and generalizability. Training models with more input data improved performance, as expected, but the format of the behavioral output was critical for emphasizing or omitting specific oculomotor events. Our results provide the first demonstration, to our knowledge, of continuously decoded eye movements across a large field of view. Our comprehensive investigation of predictive power and computational efficiency for common decoder architectures provides a much-needed foundation for future work on real-time gaze-tracking devices.</p>","PeriodicalId":94052,"journal":{"name":"International journal of neural systems","volume":" ","pages":"2450070"},"PeriodicalIF":0.0,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142640247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep Learning Recognition of Paroxysmal Kinesigenic Dyskinesia Based on EEG Functional Connectivity. International Journal of Neural Systems, 2550001. Pub Date: 2025-01-01. Epub Date: 2024-11-19. DOI: 10.1142/S0129065725500017
Liang Zhao, Renling Zou, Linpeng Jin
Paroxysmal kinesigenic dyskinesia (PKD) is a rare neurological disorder marked by transient involuntary movements triggered by sudden actions. Current diagnostic approaches, including genetic screening, face challenges in identifying secondary cases due to symptom overlap with other disorders. This study introduces a novel PKD recognition method utilizing a resting-state electroencephalogram (EEG) functional connectivity matrix and a deep learning architecture (AT-1CBL). Resting-state EEG data from 44 PKD patients and 44 healthy controls (HCs) were collected using a 128-channel EEG system. Functional connectivity matrices were computed and transformed into graph data to examine brain network property differences between PKD patients and controls through graph theory. Source localization was conducted to explore neural circuit differences in patients. The AT-1CBL model, integrating 1D-CNN and Bi-LSTM with attentional mechanisms, achieved a classification accuracy of 93.77% on phase lag index (PLI) features in the Theta band. Graph theoretic analysis revealed significant phase synchronization impairments in the Theta band of the functional brain network in PKD patients, particularly in the distribution of weak connections compared to HCs. Source localization analyses indicated greater differences in functional connectivity in sensorimotor regions and the frontal-limbic system in PKD patients, suggesting abnormalities in motor integration related to clinical symptoms. This study highlights the potential of deep learning models based on EEG functional connectivity for accurate and cost-effective PKD diagnosis, supporting the development of portable EEG devices for clinical monitoring and diagnosis. However, the limited dataset size may affect generalizability, and further exploration of multimodal data integration and advanced deep learning architectures is necessary to enhance the robustness of PKD diagnostic models.
{"title":"Deep Learning Recognition of Paroxysmal Kinesigenic Dyskinesia Based on EEG Functional Connectivity.","authors":"Liang Zhao, Renling Zou, Linpeng Jin","doi":"10.1142/S0129065725500017","DOIUrl":"10.1142/S0129065725500017","url":null,"abstract":"<p><p>Paroxysmal kinesigenic dyskinesia (PKD) is a rare neurological disorder marked by transient involuntary movements triggered by sudden actions. Current diagnostic approaches, including genetic screening, face challenges in identifying secondary cases due to symptom overlap with other disorders. This study introduces a novel PKD recognition method utilizing a resting-state electroencephalogram (EEG) functional connectivity matrix and a deep learning architecture (AT-1CBL). Resting-state EEG data from 44 PKD patients and 44 healthy controls (HCs) were collected using a 128-channel EEG system. Functional connectivity matrices were computed and transformed into graph data to examine brain network property differences between PKD patients and controls through graph theory. Source localization was conducted to explore neural circuit differences in patients. The AT-1CBL model, integrating 1D-CNN and Bi-LSTM with attentional mechanisms, achieved a classification accuracy of 93.77% on phase lag index (PLI) features in the Theta band. Graph theoretic analysis revealed significant phase synchronization impairments in the Theta band of the functional brain network in PKD patients, particularly in the distribution of weak connections compared to HCs. Source localization analyses indicated greater differences in functional connectivity in sensorimotor regions and the frontal-limbic system in PKD patients, suggesting abnormalities in motor integration related to clinical symptoms. This study highlights the potential of deep learning models based on EEG functional connectivity for accurate and cost-effective PKD diagnosis, supporting the development of portable EEG devices for clinical monitoring and diagnosis. However, the limited dataset size may affect generalizability, and further exploration of multimodal data integration and advanced deep learning architectures is necessary to enhance the robustness of PKD diagnostic models.</p>","PeriodicalId":94052,"journal":{"name":"International journal of neural systems","volume":" ","pages":"2550001"},"PeriodicalIF":0.0,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142670040","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Exploring the Versatility of Spiking Neural Networks: Applications Across Diverse Scenarios. International Journal of Neural Systems, 2550007. Pub Date: 2024-12-23. DOI: 10.1142/S0129065725500078
Matteo Cavaleri, Claudio Zandron
In the last few decades, Artificial Neural Networks have become more and more important, evolving into a powerful tool for implementing learning algorithms. Spiking Neural Networks represent the third generation of Artificial Neural Networks; they have earned growing significance due to their remarkable achievements in pattern recognition, finding extensive utility across diverse domains such as diagnostic medicine. Usually, Spiking Neural Networks are slightly less accurate than other Artificial Neural Networks, but they require much less energy to perform calculations; this energy consumption drops further, and very significantly, if they are implemented on hardware specifically designed for them, such as neuromorphic hardware. In this work, we focus on exploring the versatility of Spiking Neural Networks and their potential applications across a range of scenarios by exploiting their adaptability and dynamic processing capabilities, which make them suitable for various tasks. A first rough network is designed based on the dataset's general attributes; the network is then refined through an extensive grid search algorithm to identify the optimal values of the hyperparameters. This dual-step process ensures that the Spiking Neural Network can be tailored to diverse and potentially very different situations in a direct and intuitive manner. We test this in three different scenarios: epileptic seizure detection, treated both as a binary and as a multi-class classification task, and wine classification. The proposed methodology turned out to be highly effective in binary-class scenarios: the Spiking Neural Network models achieved significantly lower energy consumption than Artificial Neural Networks while approaching nearly 100% accuracy. In the case of multi-class classification, the model achieved an accuracy of approximately 90%, indicating that it can still be improved further.
{"title":"Exploring the Versatility of Spiking Neural Networks: Applications Across Diverse Scenarios.","authors":"Matteo Cavaleri, Claudio Zandron","doi":"10.1142/S0129065725500078","DOIUrl":"https://doi.org/10.1142/S0129065725500078","url":null,"abstract":"<p><p>In the last few decades, Artificial Neural Networks have become more and more important, evolving into a powerful tool to implement learning algorithms. Spiking neural networks represent the third generation of Artificial Neural Networks; they have earned growing significance due to their remarkable achievements in pattern recognition, finding extensive utility across diverse domains such as e.g. diagnostic medicine. Usually, Spiking Neural Networks are slightly less accurate than other Artificial Neural Networks, but they require a reduced amount of energy to perform calculations; this amount of energy further reduces in a very significant manner if they are implemented on hardware specifically designed for them, like neuromorphic hardware. In this work, we focus on exploring the versatility of Spiking Neural Networks and their potential applications across a range of scenarios by exploiting their adaptability and dynamic processing capabilities, which make them suitable for various tasks. A first rough network is designed based on the dataset's general attributes; the network is then refined through an extensive grid search algorithm to identify the optimal values for hyperparameters. This dual-step process ensures that the Spiking Neural Network can be tailored to diverse and potentially very different situations in a direct and intuitive manner. We test this by considering three different scenarios: epileptic seizure detection, both considering binary and multi-classification tasks, as well as wine classification. The proposed methodology turned out to be highly effective in binary class scenarios: the Spiking Neural Networks models achieved significantly lower energy consumption compared to Artificial Neural Networks while approaching nearly 100% accuracy. In the case of multi-class classification, the model achieved an accuracy of approximately 90%, thus indicating that it can still be further improved.</p>","PeriodicalId":94052,"journal":{"name":"International journal of neural systems","volume":" ","pages":"2550007"},"PeriodicalIF":0.0,"publicationDate":"2024-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142879122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cloud cover experiences rapid fluctuations, significantly impacting the irradiance reaching the ground and causing frequent variations in photovoltaic power output. Accurate detection of thin and fragmented clouds is crucial for reliable photovoltaic power generation forecasting. In this paper, we introduce a novel cloud detection method, termed Adaptive Laplacian Coordination Enhanced Cross-Feature U-Net (ALCU-Net). This method augments the traditional U-Net architecture with three innovative components: an Adaptive Feature Coordination (AFC) module, an Adaptive Laplacian Cross-Feature U-Net with a Multi-Grained Laplacian-Enhanced (MLE) feature module, and a Criss-Cross Feature Fused Detection (CCFE) module. The AFC module enhances spatial coherence and bridges semantic gaps across multi-channel images. The Adaptive Laplacian Cross-Feature U-Net integrates features from adjacent hierarchical levels, using the MLE module to refine cloud characteristics and edge details over time. The CCFE module, embedded in the U-Net decoder, leverages criss-cross features to improve detection accuracy. Experimental evaluations show that ALCU-Net consistently outperforms existing cloud detection methods, demonstrating superior accuracy in identifying both thick and thin clouds and in mapping fragmented cloud patches across various environments, including oceans, polar regions, and complex ocean-land mixtures.
{"title":"A Cloud Detection Network Based on Adaptive Laplacian Coordination Enhanced Cross-Feature U-Net.","authors":"Kaizheng Wang, Ruohan Zhou, Jian Wang, Ferrante Neri, Yitong Fu, Shunzhen Zhou","doi":"10.1142/S0129065725500054","DOIUrl":"https://doi.org/10.1142/S0129065725500054","url":null,"abstract":"<p><p>Cloud cover experiences rapid fluctuations, significantly impacting the irradiance reaching the ground and causing frequent variations in photovoltaic power output. Accurate detection of thin and fragmented clouds is crucial for reliable photovoltaic power generation forecasting. In this paper, we introduce a novel cloud detection method, termed Adaptive Laplacian Coordination Enhanced Cross-Feature U-Net (ALCU-Net). This method augments the traditional U-Net architecture with three innovative components: an Adaptive Feature Coordination (AFC) module, an Adaptive Laplacian Cross-Feature U-Net with a Multi-Grained Laplacian-Enhanced (MLE) feature module, and a Criss-Cross Feature Fused Detection (CCFE) module. The AFC module enhances spatial coherence and bridges semantic gaps across multi-channel images. The Adaptive Laplacian Cross-Feature U-Net integrates features from adjacent hierarchical levels, using the MLE module to refine cloud characteristics and edge details over time. The CCFE module, embedded in the U-Net decoder, leverages criss-cross features to improve detection accuracy. Experimental evaluations show that ALCU-Net consistently outperforms existing cloud detection methods, demonstrating superior accuracy in identifying both thick and thin clouds and in mapping fragmented cloud patches across various environments, including oceans, polar regions, and complex ocean-land mixtures.</p>","PeriodicalId":94052,"journal":{"name":"International journal of neural systems","volume":" ","pages":"2550005"},"PeriodicalIF":0.0,"publicationDate":"2024-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142824883","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Precise Localization for Anatomo-Physiological Hallmarks of the Cervical Spine by Using Neural Memory Ordinary Differential Equation. International Journal of Neural Systems, 2450056. Pub Date: 2024-12-01. Epub Date: 2024-07-25. DOI: 10.1142/S0129065724500564
Xi Zheng, Yi Yang, Dehan Li, Yi Deng, Yuexiong Xie, Zhang Yi, Litai Ma, Lei Xu
In the evaluation of cervical spine disorders, precise positioning of anatomo-physiological hallmarks is fundamental for calculating diverse measurement metrics. Although deep learning has achieved impressive results in keypoint localization, many limitations remain when it is applied to medical images. First, these methods often struggle with the inherent variability of cervical spine datasets arising from imaging factors. Second, the keypoints to be predicted occupy only about 4% of the entire X-ray image surface area, which poses a significant challenge. To tackle these issues, we propose a deep neural network architecture, NF-DEKR, specifically tailored for predicting keypoints in cervical spine physiological anatomy. Leveraging the neural memory ordinary differential equation, with its distinctive separation of memory and learning and its convergence to a single global attractor, our design effectively mitigates the inherent data variability. Simultaneously, we introduce a Multi-Resolution Focus module to preprocess feature maps before they enter the disentangled regression branch and the heatmap branch. By employing a differentiated strategy for feature maps of varying scales, this approach yields more accurate predictions of densely localized keypoints. We construct a medical dataset, SCUSpineXray, comprising X-ray images annotated by orthopedic specialists, and conduct analogous experiments on the publicly available UWSpineCT dataset. Experimental results demonstrate that, compared to the baseline DEKR network, our proposed method enhances average precision by 2% to 3%, accompanied by a marginal increase in model parameters and floating-point operations (FLOPs). The code (https://github.com/Zhxyi/NF-DEKR) is available.
{"title":"Precise Localization for Anatomo-Physiological Hallmarks of the Cervical Spine by Using Neural Memory Ordinary Differential Equation.","authors":"Xi Zheng, Yi Yang, Dehan Li, Yi Deng, Yuexiong Xie, Zhang Yi, Litai Ma, Lei Xu","doi":"10.1142/S0129065724500564","DOIUrl":"10.1142/S0129065724500564","url":null,"abstract":"<p><p>In the evaluation of cervical spine disorders, precise positioning of anatomo-physiological hallmarks is fundamental for calculating diverse measurement metrics. Despite the fact that deep learning has achieved impressive results in the field of keypoint localization, there are still many limitations when facing medical image. First, these methods often encounter limitations when faced with the inherent variability in cervical spine datasets, arising from imaging factors. Second, predicting keypoints for only 4% of the entire X-ray image surface area poses a significant challenge. To tackle these issues, we propose a deep neural network architecture, NF-DEKR, specifically tailored for predicting keypoints in cervical spine physiological anatomy. Leveraging neural memory ordinary differential equation with its distinctive memory learning separation and convergence to a singular global attractor characteristic, our design effectively mitigates inherent data variability. Simultaneously, we introduce a Multi-Resolution Focus module to preprocess feature maps before entering the disentangled regression branch and the heatmap branch. Employing a differentiated strategy for feature maps of varying scales, this approach yields more accurate predictions of densely localized keypoints. We construct a medical dataset, SCUSpineXray, comprising X-ray images annotated by orthopedic specialists and conduct similar experiments on the publicly available UWSpineCT dataset. Experimental results demonstrate that compared to the baseline DEKR network, our proposed method enhances average precision by 2% to 3%, accompanied by a marginal increase in model parameters and the floating-point operations (FLOPs). The code (https://github.com/Zhxyi/NF-DEKR) is available.</p>","PeriodicalId":94052,"journal":{"name":"International journal of neural systems","volume":" ","pages":"2450056"},"PeriodicalIF":0.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141763528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}