Emotion recognition based on electroencephalography (EEG) has significant advantages in terms of reliability and accuracy. However, individual differences in EEG limit the ability of emotion classifiers to generalize across subjects. Furthermore, because EEG is nonstationary, a subject's signals can vary over time, which poses an important challenge for temporal emotion recognition. Several emotion recognition methods consider the alignment of conditional distributions but do not balance the weights of the conditional and marginal distributions. In this article, we propose a novel approach to generalize emotion recognition models across individuals and time: global and local associative domain adaptation (GLADA). The proposed method consists of three parts: 1) deep neural networks extract deep features from emotional EEG data; 2) because the marginal and conditional distributions between domains can contribute to adaptation differently, coarse-grained and fine-grained adversarial adaptation are combined to narrow the domain distance of the joint distribution of EEG data between subjects (i.e., to reduce intersubject variability), with the weights of the marginal and conditional distributions balanced automatically by dynamic balancing factors; and 3) domain adaptation is used to accelerate model convergence. GLADA improves subject-independent EEG emotion recognition by reducing the influence of a subject's personal information on EEG emotion. Experimental results demonstrate that GLADA effectively addresses the domain transfer problem, improving performance across multiple EEG emotion recognition tasks.
"GLADA: Global and Local Associative Domain Adaptation for EEG-Based Emotion Recognition," by Tianxu Pan, Nuo Su, Jun Shan, Yang Tang, Guoqiang Zhong, Tianzi Jiang, and Nianming Zuo. IEEE Transactions on Cognitive and Developmental Systems, vol. 17, no. 1, pp. 167–178, published 2024-07-23. DOI: 10.1109/TCDS.2024.3432752.
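The dynamic balancing in part 2) above can be sketched numerically. The following is a minimal, hypothetical formulation in the spirit of dynamic adversarial adaptation, not GLADA's exact rule: the balance factor is recomputed from proxy A-distances, each derived from the error rate of an adversarial domain classifier (the function names and error values are illustrative).

```python
import numpy as np

def proxy_a_distance(err):
    """Proxy A-distance from a domain classifier's error rate:
    d_A = 2 * (1 - 2 * err); err near 0.5 means the two domains
    are already hard to tell apart."""
    return 2.0 * (1.0 - 2.0 * err)

def dynamic_balance_factor(global_err, per_class_errs):
    """Weight given to the marginal (global) alignment term versus the
    conditional (per-class, local) terms, recomputed each epoch from
    adversarial-classifier error rates. Hypothetical formulation."""
    d_g = proxy_a_distance(global_err)
    d_l = float(np.mean([proxy_a_distance(e) for e in per_class_errs]))
    if d_g + d_l == 0.0:
        return 0.5
    return d_g / (d_g + d_l)

# Global domain classifier nearly fooled (err ~ 0.5) while per-class
# classifiers still separate the domains: weight shifts to the local term.
mu = dynamic_balance_factor(0.45, [0.10, 0.15, 0.20])
```

When the global classifier can no longer distinguish domains but the per-class classifiers still can, the factor shifts adaptation effort toward the conditional (local) alignment, which matches the intuition behind weighting the two distributions differently.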
Given that emotional content spreads more widely than rational content in social networks, and considering the complexity of user cognition and the interaction of derivative topics, this article proposes a derivative topic dissemination model that integrates multidimensional cognition and game theory. First, to address user emotional reactions in topic mining, we quantify the affective influence among users by treating user behaviors as continuous conversations, using conversation-level sentiment analysis and the proximity centrality of social networks. Second, because user behavior is influenced by multidimensional cognition, this article proposes a method based on S(Sensibility)R(Rationality)2vec to simulate the dialectical relationship between sensibility and rationality in the user decision-making process. Finally, considering the cooperative and competitive relationships among derived topics, this article uses evolutionary game theory to analyze the topic life cycle and quantifies its impact on user behavior via a time-discretization method. Accordingly, we propose a CG-back-propagation (BP) model that incorporates a BP neural network to efficiently model the nonlinear relationships in user behavior. Experiments show that the model can not only effectively capture the influence of multidimensional cognition on users' retweeting behavior but also effectively perceive the propagation dynamics of derived topics.
"A Derivative Topic Propagation Model Based on Multidimensional Cognition and Game Theory," by Qian Li, Long Gao, Wenyi Xi, Tun Li, Rong Wang, Junwei Ge, and Yunpeng Xiao. IEEE Transactions on Cognitive and Developmental Systems, vol. 17, no. 1, pp. 189–204, published 2024-07-22. DOI: 10.1109/TCDS.2024.3432337.
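The evolutionary-game view of cooperating and competing derived topics can be illustrated with time-discretized replicator dynamics. This is a generic sketch under assumed payoffs, not the article's model; the payoff matrix `A` and initial shares are invented for illustration.

```python
import numpy as np

def replicator_step(x, payoff, dt=0.01):
    """One time-discretized step of the replicator dynamics
    dx_i/dt = x_i * ((A x)_i - x^T A x): a topic's share grows when its
    fitness beats the population-average fitness."""
    f = payoff @ x        # fitness of each topic
    phi = x @ f           # population-average fitness
    return x + dt * x * (f - phi)

A = np.array([[1.0, 0.4],     # topic 1 gains little from topic 2's audience
              [1.5, 0.9]])    # topic 2 benefits strongly from topic 1's audience
x = np.array([0.5, 0.5])      # equal initial attention shares
for _ in range(1000):
    x = replicator_step(x, A)
# with these payoffs, topic 2's fitness always dominates, so its share grows
```

The attention shares stay on the probability simplex at every step, and the topic whose payoff row dominates absorbs nearly all attention, which is the kind of life-cycle competition the time-discretization method quantifies.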
Speech imagery (SI)-based brain–computer interface (BCI) using electroencephalogram (EEG) signals is a promising area of research for individuals with severe speech production disorders. Recent advances in deep learning (DL) have led to significant improvements in this domain. However, there is no comprehensive review covering the application of DL methods to decoding imagined speech from EEG. In this article, we survey the SI and DL literature to address critical questions regarding preferred paradigms, preprocessing necessity, optimal input formulations, and current trends in DL-based techniques. Specifically, we first search major databases across science and engineering disciplines for relevant studies. Then, we analyze the DL-based techniques applied in SI decoding from five main perspectives: dataset, preprocessing, input formulation, DL architecture, and performance evaluation. Moreover, we summarize the key findings of this work and propose a set of practical recommendations. Finally, we highlight the practical challenges of DL-based imagined speech decoding and suggest future research directions.
"Speech Imagery Decoding Using EEG Signals and Deep Learning: A Survey," by Liying Zhang, Yueying Zhou, Peiliang Gong, and Daoqiang Zhang. IEEE Transactions on Cognitive and Developmental Systems, vol. 17, no. 1, pp. 22–39, published 2024-07-19. DOI: 10.1109/TCDS.2024.3431224.
Pub Date: 2024-07-11. DOI: 10.1109/TCDS.2024.3425845
Mei Guo;Douyin Zhang;Wenhai Guo;Gang Dou;Junwei Sun
Emotion plays an important role in human life. In recent years, memristor-based emotion circuits have been proposed extensively, but few circuits simulate the neural circuitry that generates specific emotions in the limbic system. In this article, a memristor-based circuit of brain-like fear generalization is proposed. It is described from two dimensions, perception and higher cognition, both of which are realized by simulating the limbic system of the human brain. The main difference between the two dimensions lies in the circuit design of the hippocampus module. Moreover, the memory enhancement caused by fear is one of the reasons for the phenomenon of fear generalization; that is, high arousal of fear leads to enhanced memory. Herein, a memristor-based circuit associating different levels of emotional arousal with memory is designed. SPICE simulation results show that the circuit can implement brain-like fear generalization and emotional memory under different arousal levels. The circuit design of these neural networks may provide some references for the field of brain-like robots.

"Implementing Brain-Like Fear Generalization and Emotional Arousal Associated With Memory," by Mei Guo, Douyin Zhang, Wenhai Guo, Gang Dou, and Junwei Sun. IEEE Transactions on Cognitive and Developmental Systems, vol. 17, no. 1, pp. 155–166, published 2024-07-11. DOI: 10.1109/TCDS.2024.3425845.
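For readers unfamiliar with the underlying device, a linear-drift ("HP-style") memristor model of the kind such emotion circuits are typically built from can be simulated in a few lines. This is a hedged sketch with illustrative parameter values, not taken from the article; the article's circuits are designed and verified in SPICE.

```python
import math

# Illustrative device constants (not from the article)
R_ON, R_OFF = 100.0, 16000.0    # limiting resistances (ohm)
MU_V, D = 1e-14, 1e-8           # dopant mobility (m^2/(s*V)), thickness (m)

def simulate(voltage, dt, w0=0.5):
    """Integrate the normalized doped-region width w under a voltage drive
    and return the memristance trace M(t) = R_ON*w + R_OFF*(1 - w)."""
    w, trace = w0, []
    for v in voltage:
        m = R_ON * w + R_OFF * (1.0 - w)
        i = v / m                             # Ohm's law through the device
        w += dt * MU_V * R_ON / D**2 * i      # linear dopant drift
        w = min(max(w, 0.0), 1.0)             # clamp the state to [0, 1]
        trace.append(m)
    return trace

# one period of a 1 Hz, 2 V sine drive, simulated at 1 ms resolution
dt = 1e-3
drive = [2.0 * math.sin(2.0 * math.pi * k * dt) for k in range(1000)]
trace = simulate(drive, dt)
```

The key behavioral point is that the device's resistance depends on the history of the charge through it, which is what lets such circuits store emotional-memory state without a separate memory element.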
Pub Date: 2024-07-10. DOI: 10.1109/TCDS.2024.3406168
Xiaoshan Wu;Weihua He;Man Yao;Ziyang Zhang;Yaoyuan Wang;Bo Xu;Guoqi Li
Event cameras have gained popularity in depth estimation due to their superior features, such as high temporal resolution, low latency, and low power consumption. Spiking neural networks (SNNs) are a promising approach for processing event camera inputs due to their spike-based, event-driven nature. However, SNNs suffer performance degradation as the network becomes deeper, which hurts their performance on depth estimation tasks. To address this issue, we propose a deep spiking U-Net model. Our spiking U-Net architecture leverages refined shortcuts and residual blocks to avoid performance degradation and boost task performance. We also propose a new event representation method designed for multistep SNNs to effectively exploit depth information in the temporal dimension. Our experiments on the MVSEC dataset show that the proposed method improves accuracy by 18.50% and 25.18% over current state-of-the-art (SOTA) ANN and SNN models, respectively. Moreover, our SNN model improves energy efficiency by up to 58 times compared with an ANN of the same network structure.
"Event-Based Depth Prediction With Deep Spiking Neural Network," by Xiaoshan Wu, Weihua He, Man Yao, Ziyang Zhang, Yaoyuan Wang, Bo Xu, and Guoqi Li. IEEE Transactions on Cognitive and Developmental Systems, vol. 16, no. 6, pp. 2008–2018, published 2024-07-10. DOI: 10.1109/TCDS.2024.3406168.
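The basic unit such a spiking U-Net stacks is the leaky integrate-and-fire (LIF) neuron, run over multiple timesteps. The sketch below shows the standard multistep update; the time constant, threshold, and constant-current input are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def lif_forward(inputs, tau=2.0, v_th=1.0):
    """Run a layer of leaky integrate-and-fire neurons over T timesteps.
    Each step: decay the membrane potential, add the input current, emit a
    spike wherever the threshold is crossed, then hard-reset spiking units."""
    v = np.zeros_like(inputs[0])
    spikes = []
    for x in inputs:
        v = v * (1.0 - 1.0 / tau) + x   # leaky integration
        s = (v >= v_th).astype(float)   # spike where v crosses threshold
        v = v * (1.0 - s)               # hard reset on spiking units
        spikes.append(s)
    return np.stack(spikes)             # shape: (T, n_neurons)

# 4 neurons driven by a constant current of 0.6 for 8 timesteps:
# with tau = 2, each neuron charges for two steps and fires on the third.
out = lif_forward([np.full(4, 0.6)] * 8)
```

Because the hard threshold is non-differentiable, deep SNNs like this one are trained with surrogate gradients, and the refined shortcuts the paper describes route membrane state around blocks so that stacking many such layers does not degrade performance.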
Pub Date: 2024-07-08. DOI: 10.1109/TCDS.2024.3417534
Raveendrababu Vempati;Lakhan Dev Sharma;Rajesh Kumar Tripathy
Emotions are mental states that determine a person's behavior in society. Automated identification of a person's emotion is vital in applications such as brain–computer interfaces (BCIs), recommender systems (RSs), and cognitive neuroscience. This article proposes an automated approach based on multivariate fast iterative filtering (MvFIF) and an ensemble machine learning model to recognize cross-subject emotions from electroencephalogram (EEG) signals. The multichannel EEG signals are first decomposed into multichannel intrinsic mode functions (MIMFs) using MvFIF. Features such as differential entropy (DE), dispersion entropy (DispEn), permutation entropy (PE), spectral entropy (SE), and distribution entropy (DistEn) are extracted from the MIMFs. The binary atom search optimization (BASO) technique is employed to reduce the dimension of the feature space. The light gradient boosting machine (LGBM), extreme learning machine (ELM), and ensemble bagged tree (EBT) classifiers are used to recognize different human emotions from the EEG features. The results demonstrate that the LGBM classifier achieved the highest average accuracies of 99.50% and 98.79% on multichannel EEG signals from the GAMEEMO and DREAMER databases, respectively, for cross-subject emotion recognition (ER). Compared with other multivariate signal decomposition algorithms, the MvFIF-based method demonstrates higher accuracy in recognizing emotions from multichannel EEG signals. The proposed (MvFIF+DE+BASO+LGBM) technique outperforms existing state-of-the-art methods in ER using EEG signals.

"Cross-Subject Emotion Recognition From Multichannel EEG Signals Using Multivariate Decomposition and Ensemble Learning," by Raveendrababu Vempati, Lakhan Dev Sharma, and Rajesh Kumar Tripathy. IEEE Transactions on Cognitive and Developmental Systems, vol. 17, no. 1, pp. 77–88, published 2024-07-08. DOI: 10.1109/TCDS.2024.3417534.
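Among the features listed above, differential entropy (DE) has a convenient closed form under a Gaussian assumption, DE = (1/2) ln(2*pi*e*sigma^2), which is how it is commonly computed on band-filtered EEG. A minimal sketch (the decomposition/band-filtering step that would normally precede it is omitted, and the surrogate signal is synthetic):

```python
import numpy as np

def differential_entropy(x):
    """Differential entropy under a Gaussian assumption:
    DE = 0.5 * ln(2 * pi * e * var(x)). In EEG emotion pipelines this is
    typically computed per channel and per decomposed component."""
    return 0.5 * np.log(2.0 * np.pi * np.e * np.var(x))

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 10_000)   # surrogate unit-variance component
de = differential_entropy(x)       # theory: ~1.4189 for sigma = 1
```

For a unit-variance Gaussian the theoretical value is 0.5 * ln(2*pi*e) ≈ 1.4189, so the estimate above should land close to that.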
Pub Date: 2024-07-08. DOI: 10.1109/TCDS.2024.3424457
Stavros Ntalampiras;Alessandro Scalambrino
Noise correlates directly with human health, and its negative consequences range from sleep disruption and stress to hearing loss and reduced productivity. Despite its undeniable relevance, the underlying process governing the relationship between unpleasant sound events and the annoyance they may cause has not yet been systematically studied. In this context, this work focuses on the disturbance caused by interfloor sound events, i.e., the audio signals transmitted through the floors of a building. Activities such as walking, running, and using household appliances, among other daily actions, generate sounds that can be heard by those on an adjacent floor. To this end, we built a suitable dataset of diverse interfloor sound events annotated according to the perceived disturbance. Subsequently, we propose a framework that quantifies similarities between interfloor sound events starting from standardized time-frequency representations, which are processed by a Siamese neural network composed of a series of convolutional layers. These similarities are then fed to a $k$-medoids regression scheme that predicts disturbance from interfloor sound events with neighboring latent representations. After thorough experiments, we demonstrate the effectiveness of this framework and its superiority over popular regression algorithms. Last but not least, the proposed solution offers interpretable predictions, which can be meaningfully utilized by human experts.

"Automatic Prediction of Disturbance Caused by Interfloor Sound Events," by Stavros Ntalampiras and Alessandro Scalambrino. IEEE Transactions on Cognitive and Developmental Systems, vol. 17, no. 1, pp. 147–154, published 2024-07-08. DOI: 10.1109/TCDS.2024.3424457.
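The regression stage can be approximated as follows: given the distances a trained Siamese network assigns between a query event and annotated training events, predict disturbance from the nearest neighbors in latent space. This is a simplified stand-in for the paper's $k$-medoids scheme; the distances and ratings below are invented, and the Siamese network that would produce them is assumed, not implemented.

```python
import numpy as np

def predict_disturbance(dists, ratings, k=3):
    """Predict a query event's disturbance as the mean annotated rating of
    its k nearest events in the learned embedding space. The neighbors
    used for the prediction also make the output interpretable: one can
    listen to them to see why the model predicted what it did."""
    nearest = np.argsort(dists)[:k]
    return float(np.mean(ratings[nearest]))

# invented distances from one query to five annotated training events
dists = np.array([0.9, 0.1, 0.4, 0.2, 0.8])
ratings = np.array([5.0, 1.0, 2.0, 1.0, 4.0])   # annotated disturbance
score = predict_disturbance(dists, ratings, k=3)
```

Here the three nearest events carry ratings 1, 1, and 2, so the prediction is their mean, about 1.33: a low-disturbance query backed by concrete, inspectable neighbors.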
Event cameras have unique advantages in object detection, capturing asynchronous events rather than continuous frames. They excel in dynamic range, low latency, and high-speed motion scenarios, with lower power consumption. However, aggregating event data into image frames leads to information loss and reduced detection performance, and applying traditional neural networks to event camera outputs is challenging because of event data's distinct characteristics. In this study, we present a novel spiking neural network (SNN)-based object detection model, the spiking vision transformer (SpikingViT), to address these issues. First, we design a dedicated event data converting module that effectively captures the unique characteristics of event data, mitigating the risk of information loss while preserving its spatiotemporal features. Second, we introduce SpikingViT, a novel object detection model that leverages SNNs capable of extracting spatiotemporal information from event data. SpikingViT combines the advantages of SNNs and transformer models, incorporating mechanisms such as attention and residual voltage memory to further enhance detection performance. Extensive experiments substantiate the remarkable proficiency of SpikingViT in event-based object detection, positioning it as a formidable contender. Our approach adeptly retains the spatiotemporal information inherent in event data, leading to a substantial enhancement in detection performance.

"SpikingViT: A Multiscale Spiking Vision Transformer Model for Event-Based Object Detection," by Lixing Yu, Hanqi Chen, Ziming Wang, Shaojie Zhan, Jiankun Shao, Qingjie Liu, and Shu Xu. IEEE Transactions on Cognitive and Developmental Systems, vol. 17, no. 1, pp. 130–146, published 2024-07-04. DOI: 10.1109/TCDS.2024.3422873.
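A common way to retain temporal structure when converting events, in the spirit of the dedicated converting module described above, is to accumulate (t, x, y, polarity) tuples into a small voxel grid rather than one flat frame. This generic sketch is not the paper's module, and the toy events are invented.

```python
import numpy as np

def events_to_voxel_grid(events, n_bins, h, w):
    """Accumulate (t, x, y, polarity) events into an (n_bins, h, w) grid,
    keeping coarse temporal structure instead of flattening everything
    into a single frame. Positive polarity adds +1, negative adds -1."""
    grid = np.zeros((n_bins, h, w))
    t = events[:, 0]
    t_norm = (t - t.min()) / max(t.max() - t.min(), 1e-9)   # -> [0, 1]
    bins = np.minimum((t_norm * n_bins).astype(int), n_bins - 1)
    for b, (_, x, y, p) in zip(bins, events):
        grid[b, int(y), int(x)] += 1.0 if p > 0 else -1.0
    return grid

# three invented events: two opposite-polarity events at one pixel early
# on (which cancel), one positive event at another pixel near the end
events = np.array([[0.00, 1, 2,  1],
                   [0.40, 1, 2, -1],
                   [0.90, 3, 0,  1]])
grid = events_to_voxel_grid(events, n_bins=2, h=4, w=4)
```

Note how the two opposite-polarity events in the first temporal bin cancel at their pixel, while the later event survives in the second bin: exactly the within-window cancellation that motivates keeping more than one temporal bin.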
The NMDA receptor (NMDAR), as a ubiquitous type of synapse in neural systems of the brain, presents slow dynamics to modulate neural spiking activity. For the cerebellum, NMDARs have been suggested for contributing complex spikes in Purkinje cells (PCs) as a mechanism for cognitive activity, learning, and memory. Recent experimental studies are debating the role of NMDAR in PC dendritic input, yet it remains unclear how the distribution of NMDARs in PC dendrites can affect their neural spiking coding properties. In this work, a detailed multiple-compartment PC model was used to study how slow-scale NMDARs together with fast-scale AMPA, regulate neural coding. We find that NMDARs act as a band-pass filter, increasing the excitability of PC firing under low-frequency input while reducing it under high frequency. This effect is positively related to the strength of NMDARs. For a response sequence containing a large number of regular and irregular spiking patterns, NMDARs reduce the overall regularity under high-frequency input while increasing the local regularity under low-frequency. Moreover, the inhibitory effect of NMDA receptors during high-frequency stimulation is associated with a reduced conductance of large conductance calcium-activated potassium (BK) channel. Taken together, our results suggest that NMDAR plays an important role in the regulation of neural coding strategies by utilizing its complex dendritic structure.
{"title":"Regulating Temporal Neural Coding via Fast and Slow Synaptic Dynamics","authors":"Yuanhong Tang;Lingling An;Xingyu Zhang;Huiling Huang;Zhaofei Yu","doi":"10.1109/TCDS.2024.3417477","DOIUrl":"10.1109/TCDS.2024.3417477","url":null,"abstract":"The NMDA receptor (NMDAR), as a ubiquitous type of synapse in neural systems of the brain, presents slow dynamics to modulate neural spiking activity. For the cerebellum, NMDARs have been suggested for contributing complex spikes in Purkinje cells (PCs) as a mechanism for cognitive activity, learning, and memory. Recent experimental studies are debating the role of NMDAR in PC dendritic input, yet it remains unclear how the distribution of NMDARs in PC dendrites can affect their neural spiking coding properties. In this work, a detailed multiple-compartment PC model was used to study how slow-scale NMDARs together with fast-scale AMPA, regulate neural coding. We find that NMDARs act as a band-pass filter, increasing the excitability of PC firing under low-frequency input while reducing it under high frequency. This effect is positively related to the strength of NMDARs. For a response sequence containing a large number of regular and irregular spiking patterns, NMDARs reduce the overall regularity under high-frequency input while increasing the local regularity under low-frequency. Moreover, the inhibitory effect of NMDA receptors during high-frequency stimulation is associated with a reduced conductance of large conductance calcium-activated potassium (BK) channel. 
Taken together, our results suggest that NMDAR plays an important role in the regulation of neural coding strategies by utilizing its complex dendritic structure.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"17 1","pages":"102-114"},"PeriodicalIF":5.0,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141510259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-06-26 | DOI: 10.1109/TCDS.2024.3418841
Anastasios E. Giannopoulos;Ioanna Zioga;Vaios Ziogas;Panos Papageorgiou;Georgios N. Papageorgiou;Charalabos Papageorgiou
The acoustic startle reflex (ASR) relies on the sensorimotor system and is affected by aging, sex, and psychopathology. The ASR can be modulated by the prepulse inhibition (PPI) paradigm, in which a weak prepulse stimulus inhibits reactivity to a subsequent startling stimulus (pulse). Additionally, neurophysiological studies have found that brain activity is characterized by irregular, highly complex patterns, and that this complexity decreases with age. Our study investigated the relationship between prestartle nonlinear dynamics and PPI in healthy children versus adults. Fifty-six individuals took part in the experiment: 31 children and adolescents and 25 adults. Participants heard 51 pairs of tones (prepulse and startle) separated by intervals of 30–500 ms. We then assessed neural complexity by computing the largest Lyapunov exponent (LLE) during the prestartle period and assessed PPI by analyzing the poststartle event-related potentials (ERPs). Results showed higher neural complexity in children than in adults, in line with previous research showing reduced complexity of physiological signals in aging. As expected, PPI (as reflected in the P50 and P200 components) was enhanced in adults relative to children, potentially due to maturation of the ASR in adults. Interestingly, prestartle complexity was correlated with the P50 component in children only, not in adults, potentially reflecting the different stages of sensorimotor maturation of the two age groups. Overall, our study offers novel contributions to investigating brain dynamics by linking nonlinear and linear measures.
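For the complexity measure itself, an LLE can be estimated from a scalar series by delay embedding and tracking nearest-neighbor divergence, in the spirit of the widely used Rosenstein-style estimator. This is a rough sketch, not the authors' actual pipeline; the function name, embedding dimension, delay, Theiler window, and fitting length are illustrative assumptions:

```python
import numpy as np

def largest_lyapunov(x, dim=5, tau=2, fit_len=30, theiler=10):
    """Rosenstein-style estimate of the largest Lyapunov exponent:
    delay-embed the series, pair each state with its nearest neighbor
    (outside a Theiler window), and fit the slope of the mean
    log-divergence curve over fit_len steps."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * tau
    # delay embedding: row i is the state (x[i], x[i+tau], ..., x[i+(dim-1)*tau])
    emb = np.column_stack([x[k * tau: k * tau + n] for k in range(dim)])
    # all pairwise distances, with temporally close pairs excluded
    dist = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=2)
    idx = np.arange(n)
    dist[np.abs(idx[:, None] - idx[None, :]) <= theiler] = np.inf
    nn = dist.argmin(axis=1)
    # mean log separation of neighbor pairs as they evolve forward in time
    div = []
    for k in range(fit_len):
        ok = (idx + k < n) & (nn + k < n)
        sep = np.linalg.norm(emb[idx[ok] + k] - emb[nn[ok] + k], axis=1)
        div.append(np.mean(np.log(sep[sep > 0])))
    # slope of the divergence curve approximates the largest exponent
    slope, _ = np.polyfit(np.arange(fit_len), div, 1)
    return float(slope)
```

On a chaotic series the fitted slope comes out positive, while on a quasi-periodic signal it stays near zero; that contrast is what makes the LLE usable as a complexity index for prestartle EEG segments.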
{"title":"Prepulse Inhibition and Prestimulus Nonlinear Brain Dynamics in Childhood: A Lyapunov Exponent Approach","authors":"Anastasios E. Giannopoulos;Ioanna Zioga;Vaios Ziogas;Panos Papageorgiou;Georgios N. Papageorgiou;Charalabos Papageorgiou","doi":"10.1109/TCDS.2024.3418841","DOIUrl":"10.1109/TCDS.2024.3418841","url":null,"abstract":"The acoustic startle reflex (ASR) relies on the sensorimotor system and is affected by aging, sex, and psychopathology. ASR can be modulated by the prepulse inhibition (PPI) paradigm, which achieves the inhibition of reactivity to a startling stimulus (pulse) following a weak prepulse stimulus. Additionally, neurophysiological studies have found that brain activity is characterized by irregular patterns with high complexity, which however reduces with age. Our study investigated the relationship between prestartle nonlinear dynamics and PPI in healthy children versus adults. Fifty-six individuals took part in the experiment: 31 children and adolescents and 25 adults. Participants heard 51 pairs of tones (prepulse and startle) with a time difference of 30 to 500 ms. Subsequently, we assessed neural complexity by computing the largest Lyapunov exponent (LLE) during the prestartle period and assessed PPI by analyzing the poststartle event-related potentials (ERPs). Results showed higher neural complexity for children compared to adults, in line with previous research showing reduced complexity in the physiological signals in aging. As expected, PPI (as reflected in the P50 and P200 components) was enhanced in adults compared to children, potentially due to the maturation of the ASR for the former. Interestingly, prestartle complexity was correlated with the P50 component in children only, but not in adults, potentially due to the different stage of sensorimotor maturation between age groups. Overall, our study offers novel contributions for investigating brain dynamics, linking nonlinear with linear measures. 
Our findings are consistent with the loss of neural complexity in aging, and suggest differentiated links between nonlinear and linear metrics in children and adults.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"17 1","pages":"115-129"},"PeriodicalIF":5.0,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10572331","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141510401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}