IEEE Transactions on Cognitive and Developmental Systems: Latest Articles

Brain Network Reorganization in Response to Multilevel Mental Workload in Simulated Flight Tasks
IF 5; Zone 3 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-12-03; DOI: 10.1109/TCDS.2024.3511394
Kuijun Wu;Jingjia Yuan;Xianliang Ge;Ioannis Kakkos;Linze Qian;Sujie Wang;Yamei Yu;Chuantao Li;Yu Sun
In various real-world situations, inappropriate mental workload (MWL) can impair task performance and may cause operational safety risks. Growing efforts have been made to reveal the underlying neural mechanisms of MWL. However, most studies have been limited to well-controlled cognitive tasks, overlooking these mechanisms in close-to-real human–machine interaction tasks. Here, we investigated brain network reorganization in response to MWL in a close-to-real simulated flight task. Specifically, a dual-task flight simulation paradigm (primary flight simulation + secondary auditory choice reaction time task) that mimics real-flight cognitive challenges was introduced to induce varying levels of MWL. The perceived subjective task difficulty and secondary task performance validated the effectiveness of our experimental design. Moreover, multilevel MWL classification was performed to examine the changes of functional connectivity (FC) in response to different MWL levels and achieved satisfactory performance (three levels, accuracy = 71.85%). Further inspection of the discriminative FCs highlighted the importance of frontal and parietal-occipital brain regions in MWL modulation. Additional graph theoretical analysis revealed increased information transfer efficiency across distributed brain regions with the increase of MWL. Overall, our research offers valuable insights into the neural mechanisms underlying MWL, with potential implications for improving safety in aviation contexts.
Journal Article; Vol. 17, No. 3, pp. 698-709
Citations: 0
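The "information transfer efficiency" mentioned in the abstract above is often quantified in graph-theoretical EEG studies by global efficiency, the mean inverse shortest-path length over all node pairs. The following is a minimal sketch of that measure, not the authors' pipeline: it assumes a symmetric functional connectivity matrix that is binarized at a hypothetical threshold, and the function name, threshold value, and toy matrices are illustrative assumptions.

```python
from collections import deque


def global_efficiency(fc, threshold=0.5):
    """Mean inverse shortest-path length of the binarized FC graph.

    fc is a square list of lists; an edge i-j exists when
    |fc[i][j]| >= threshold (threshold is an illustrative assumption).
    Unreachable node pairs contribute 0 to the average.
    """
    n = len(fc)
    neighbors = [
        [j for j in range(n) if j != i and abs(fc[i][j]) >= threshold]
        for i in range(n)
    ]
    total = 0.0
    for src in range(n):
        # Breadth-first search gives shortest hop counts from src.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for node, d in dist.items() if node != src)
    return total / (n * (n - 1))


# Toy 4-node matrices (hypothetical values, not data from the paper).
chain = [
    [1.0, 0.8, 0.1, 0.0],
    [0.8, 1.0, 0.7, 0.1],
    [0.1, 0.7, 1.0, 0.9],
    [0.0, 0.1, 0.9, 1.0],
]  # thresholded graph is the path 0-1-2-3
full = [[1.0] * 4 for _ in range(4)]  # fully connected graph

print(global_efficiency(chain), global_efficiency(full))
```

A fully connected graph reaches the maximum value of 1, while sparser or more chain-like graphs score lower, which is the sense in which a higher value indicates more efficient information transfer.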
IEEE Transactions on Cognitive and Developmental Systems Publication Information
IF 5; Zone 3 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-12-03; DOI: 10.1109/TCDS.2024.3482591
Journal Article; Vol. 16, No. 6, p. C2; PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10774066
Citations: 0
SMART: Sequential Multiagent Reinforcement Learning With Role Assignment Using Transformer
IF 5; Zone 3 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-11-29; DOI: 10.1109/TCDS.2024.3504256
Yixing Lan;Hao Gao;Xin Xu;Qiang Fang;Yujun Zeng
Multiagent reinforcement learning (MARL) has received increasing attention and been used to solve cooperative multiagent decision-making and learning control tasks. However, the high complexity of the joint action space and the nonstationary learning process are two major problems that negatively impact the sample efficiency and solution quality of MARL. To this end, this article proposes a novel approach named sequential MARL with role assignment using transformer (SMART). By learning the effects of different actions on state transitions and rewards, SMART realizes the action abstraction of the original action space and the adaptive role cognitive modeling of multiple agents, which reduces the complexity of the multiagent exploration and learning process. Meanwhile, SMART uses causal transformer networks to update the role assignment policy and the action selection policy sequentially, alleviating the influence of nonstationary multiagent policy learning. The convergence characteristic of SMART is theoretically analyzed. Extensive experiments on the challenging Google football and StarCraft multiagent challenge benchmarks are conducted, demonstrating that compared with mainstream MARL algorithms such as MAT and HAPPO, SMART achieves a new state-of-the-art performance. Meanwhile, the policies learned through SMART have good generalization ability when the number of agents changes.
Journal Article; Vol. 17, No. 3, pp. 615-630
Citations: 0
The Effect of Audio Trigger’s Frequency on Autonomous Sensory Meridian Response
IF 5; Zone 3 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-11-27; DOI: 10.1109/TCDS.2024.3506039
Lili Li;Zhiqing Wu;Zhongliang Yu;Zhibin He;Zhizhong Wang;Liyu Lin;Shaolong Kuang
Autonomous sensory meridian response (ASMR) is an experience-dependent sensation in response to audio and audio–visual triggers. The acoustic characteristics of audio triggers have been speculated to be connected with ASMR. To explore the effect of the audio trigger’s frequency on ASMR and then to discover ASMR’s mechanism, the ASMR phenomenon under random-frequency audio, high-frequency audio, low-frequency audio, original audio, white noise, and rest was analyzed using EEG. The differential entropy and power spectral density were applied for quantitative analysis. The results suggest the audio’s frequency can modulate the brain activities on θ, α, β, γ, and high γ frequencies. Moreover, ASMR responders and nonresponders may be more sensitive to low-frequency audio and white noise by suppressing brain activities of central areas in γ and high γ frequencies. Further, for ASMR responders, ASMR evoked by a low-frequency audio trigger may involve more attentional selection or semantic processing and may not alter the brain functions in information processing and execution.
Journal Article; Vol. 17, No. 3, pp. 672-681
Citations: 0
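The abstract above names differential entropy as one of its two quantitative EEG features but does not give the estimator. A common choice in EEG work is to assume the band-filtered segment is Gaussian, in which case differential entropy reduces to 0.5 ln(2πeσ²). The sketch below follows that assumption only; the function name and toy samples are illustrative, and band-pass filtering into θ, α, β, γ bands is assumed to have happened beforehand.

```python
import math


def differential_entropy(samples):
    """Differential entropy (in nats) under a Gaussian assumption.

    For a Gaussian signal with variance var, the closed form is
    0.5 * ln(2 * pi * e * var). The samples are assumed to be one
    band-filtered EEG segment (an assumption of this sketch).
    """
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return 0.5 * math.log(2 * math.pi * math.e * var)


# Toy segment with unit variance (hypothetical values).
segment = [-1.0, 1.0, -1.0, 1.0]
louder = [2.0 * x for x in segment]  # same shape, doubled amplitude

print(differential_entropy(segment), differential_entropy(louder))
```

Under this closed form, doubling the amplitude of a segment raises its differential entropy by exactly ln 2, so the feature behaves like a log-scaled band power.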
Location-Guided Head Pose Estimation for Fisheye Image
IF 5; Zone 3 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-11-27; DOI: 10.1109/TCDS.2024.3506060
Bing Li;Dong Zhang;Cheng Huang;Yun Xian;Ming Li;Dah-Jye Lee
A camera with a fisheye or ultra-wide lens covers a wide field of view that cannot be modeled by the perspective projection. Serious fisheye lens distortion in the peripheral region of the image leads to degraded performance of the existing head pose estimation models trained on undistorted images. This article presents a new approach for head pose estimation that uses the knowledge of head location in the image to reduce the negative effect of fisheye distortion. We develop an end-to-end convolutional neural network to estimate the head pose with the multitask learning of head pose and head location. Our proposed network estimates the head pose directly from the fisheye image without the operation of rectification or calibration. We also created fisheye-distorted versions of three popular head pose estimation datasets (BIWI, 300W-LP, and AFLW2000) for our experiments. Experimental results show that our network remarkably improves the accuracy of head pose estimation compared with other state-of-the-art one-stage and two-stage methods.
Journal Article; Vol. 17, No. 3, pp. 682-697
Citations: 0
A Biomathematical Model for Classifying Sleep Stages Using Deep Learning Techniques
IF 5; Zone 3 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-11-21; DOI: 10.1109/TCDS.2024.3503767
Ruijie He;Wei Tong;Miaomiao Zhang;Guangyu Zhu;Edmond Q. Wu
A biomathematical model is a framework that calculates corresponding indices based on biological and physiological parameters, and can be used to study the fatigue states of submarine crew members during long-duration operations. Submarine personnel are prone to fatigue and decreased vigilance, leading to unnecessary risks. Sleep quality plays a crucial role in assessing human vigilance; however, traditional biomathematical models generally categorize human sleep into two different pressure stages based on circadian rhythms. To accurately classify sleep stages based on physiological signals, this article proposes a novel deep learning architecture using single-channel EEG signals. This architecture comprises four modules: beginning with a feature preliminary extraction module employing a multiscale convolutional neural network (MSCNN), followed by a feature aggregation module combining reparameterizable large kernel network with temporal convolutions network (RepLKnet), then utilizing a multivariate weighted recurrent network as the tensor encoder (MWRN), and finally, decoding with a dynamic graph convolutional neural network (DGCNN). The output is provided by a final classifier. We assessed the effectiveness of the proposed model using two publicly available datasets. The results demonstrate that our model surpasses current leading benchmarks.
Journal Article; Vol. 17, No. 3, pp. 659-671
Citations: 0
Spatial–Temporal Spiking Feature Pruning in Spiking Transformer
IF 5; Zone 3 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-11-19; DOI: 10.1109/TCDS.2024.3500018
Zhaokun Zhou;Kaiwei Che;Jun Niu;Man Yao;Guoqi Li;Li Yuan;Guibo Luo;Yuesheng Zhu
Spiking neural networks (SNNs) are known for brain-inspired architecture and low power consumption. Leveraging biocompatibility and the self-attention mechanism, Spiking Transformers have become the most promising SNN architecture with high accuracy. However, Spiking Transformers still face the challenge of high training costs, such as a 51M network requiring 181 training hours on ImageNet. In this work, we explore feature pruning to reduce training costs and overcome two challenges: high pruning ratio and lightweight pruning methods. We first analyze the spiking features and find the potential for a high pruning ratio. The majority of information is concentrated on a part of the spiking features in the spiking transformer, which suggests that we can keep this part of the tokens and prune the others. To keep the method lightweight, a parameter-free spatial–temporal spiking feature pruning method is proposed, which uses only a simple addition-sorting operation. The spiking features/tokens with high spike accumulation values are selected for training. The others are pruned and merged through a compensation module called Softmatch. Experimental results demonstrate that our method reduces training costs without compromising image classification accuracy. On ImageNet, our approach reduces the training time from 181 to 128 h while achieving comparable accuracy (83.13% versus 83.07%).
Journal Article; Vol. 17, No. 3, pp. 644-658
Citations: 0
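The "addition-sorting operation" described in the abstract above can be illustrated roughly as follows: accumulate each token's spikes over time steps and feature channels, sort the tokens by that count, and keep the top ones. This is a minimal sketch under assumptions, not the authors' implementation: the `spikes[t][n][d]` layout, the function name `select_tokens`, and the toy data are hypothetical, and the Softmatch merging of pruned tokens is not shown because the abstract does not specify it.

```python
def select_tokens(spikes, keep):
    """Parameter-free token selection by spike accumulation.

    spikes[t][n][d] is a 0/1 spike for time step t, token n, channel d
    (this layout is an assumption of the sketch). Returns the indices
    of kept tokens, the indices of pruned tokens, and per-token scores.
    """
    num_tokens = len(spikes[0])
    # Addition: sum spikes over all time steps and channels per token.
    scores = [
        sum(sum(step[n]) for step in spikes)
        for n in range(num_tokens)
    ]
    # Sorting: highest accumulation first; ties keep original order.
    order = sorted(range(num_tokens), key=lambda n: scores[n], reverse=True)
    kept = sorted(order[:keep])
    pruned = sorted(order[keep:])
    return kept, pruned, scores


# Toy input: 2 time steps, 4 tokens, 3 channels (hypothetical values).
spikes = [
    [[1, 1, 0], [0, 0, 0], [1, 0, 1], [0, 1, 0]],
    [[1, 0, 1], [0, 0, 1], [1, 1, 1], [0, 0, 0]],
]
kept, pruned, scores = select_tokens(spikes, keep=2)
print(kept, pruned, scores)
```

Because the selection needs only integer sums and one sort, it adds no trainable parameters, which matches the lightweight goal stated in the abstract.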
Interaction Is Worth More Explanations: Improving Human–Object Interaction Representation With Propositional Knowledge
IF 5; Zone 3 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-11-11; DOI: 10.1109/TCDS.2024.3496566
Feng Yang;Yichao Cao;Xuanpeng Li;Weigong Zhang
Detecting human–object interactions (HOI) presents a formidable challenge, necessitating the discernment of intricate, high-level relationships between humans and objects. Recent studies have explored HOI vision-and-language modeling (HOI-VLM), which leverages linguistic information inspired by cross-modal technology. Despite its promise, current methodologies face challenges due to the constraints of limited annotation vocabularies and suboptimal word embeddings, which hinder effective alignment with visual features and, consequently, the efficient transfer of linguistic knowledge. In this work, we propose a novel cross-modal framework that leverages external propositional knowledge, which harmonizes annotation text with a broader spectrum of world knowledge, enabling a more explicit and unambiguous representation of complex semantic relationships. Additionally, multiple complexities arise from the symbiotic or distinctive relationships inherent in one HO pair, along with identical interactions occurring with diverse HO pairs (e.g., “human ride bicycle” versus “human ride horse”); the challenge lies in understanding the subtle differences and similarities between interactions involving different objects or occurring in varied contexts. To this end, we propose the Jaccard contrast strategy to simultaneously optimize cross-modal representation consistency across HO pairs (especially for cases where multiple interactions occur), which encompasses both vision-to-vision and vision-to-knowledge alignment objectives. The effectiveness of our proposed method is comprehensively validated through extensive experiments, showcasing its superiority in the field of HOI analysis.
Journal Article; Vol. 17, No. 3, pp. 631-643
Citations: 0
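The abstract above does not define the Jaccard contrast strategy, but its name points to the Jaccard index, the ratio of shared to total labels between two sets. The sketch below shows only that index applied to the interaction-label sets of two HO pairs, as one plausible way to grade how similar two multi-interaction pairs are. The function name, the label sets, and the idea of using the score as a soft similarity target are assumptions of this sketch, not details from the paper.

```python
def jaccard(labels_a, labels_b):
    """Jaccard index |A ∩ B| / |A ∪ B| between two label sets.

    Returns 0.0 when both sets are empty (a convention chosen for
    this sketch; other conventions exist).
    """
    a, b = set(labels_a), set(labels_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical interaction labels for two human-object pairs.
human_bicycle = {"ride", "sit_on"}
human_horse = {"ride", "hold"}

print(jaccard(human_bicycle, human_horse))
```

Pairs that share every interaction score 1 and pairs that share none score 0, so partially overlapping cases such as the two above fall in between rather than being treated as pure positives or negatives.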
Modeling Task Engagement to Regulate Reinforcement Learning-Based Decoding for Online Brain Control
IF 5; Zone 3 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-11-05; DOI: 10.1109/TCDS.2024.3492199
Xiang Zhang;Xiang Shen;Yiwen Wang
Brain–machine interfaces (BMIs) offer significant promise for enabling paralyzed individuals to control external devices using their brain signals. One challenge is that during the online brain control (BC) process, subjects may not be completely immersed in the task, particularly when multiple steps are needed to achieve a goal. The decoder indiscriminately takes the less engaged trials as training data, which might decrease the decoding accuracy. In this article, we propose an alternative kernel RL-based decoder that trains online with continuous parameter update. We model neural activity from the medial prefrontal cortex (mPFC), a reward-related brain region, to represent task engagement. This information is incorporated into a stochastic learning rate using an exponential model, which measures the relevancy of neural data. The proposed algorithm was evaluated in the experiment where rats performed a cursor-reaching BC task. We found the neural activities from mPFC contained the engagement information which was negatively correlated with trial response time. Moreover, compared to the RL method without task engagement modeling, our proposed method enhanced the training efficiency. It used half of the training data to achieve the same reconstruction accuracy of the cursor trajectory. The results demonstrate the potential of our RL framework for improving online BC tasks.
IEEE Transactions on Cognitive and Developmental Systems, vol. 17, no. 3, pp. 606-614. Journal Article. https://doi.org/10.1109/TCDS.2024.3492199
Citations: 0
Developmental Networks With Foveation
IF 5 | Zone 3, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-11-05 | DOI: 10.1109/TCDS.2024.3492181
Xiang Wu;Juyang Weng
The foveated nature of the human vision system (HVS) means that acuity on the retina peaks at the center of the fovea and gradually declines toward the periphery with increasing eccentricity. Foveation is general-purpose, meaning the fovea is more often used than the periphery. Self-generated saccades dynamically project the fovea onto different parts of the visual world so that the high-acuity fovea can process parts of interest at different times. It is still unclear why biological vision uses foveation. This work is the first foveated neural network as far as we are aware, but it has a limited scope. We study two subjects here. 1) We design a biological density of cones (BDOC) foveation method for image warping that simulates a biologically plausible foveated retina using a commonly available uniform-pixel camera. 2) The subject of this article is not task-specific, but we choose a challenging task, visual navigation, as an example of quantitative and spatiotemporal tasks, and compare the result with deep learning. Our experimental results showed that 1) the BDOC foveation is logically and visually correct; and 2) the developmental network (DN) performs better than deep learning in a surprising way, and foveation helps both network types.
IEEE Transactions on Cognitive and Developmental Systems, vol. 17, no. 3, pp. 592-605. Journal Article. https://doi.org/10.1109/TCDS.2024.3492181
Citations: 0
Journal
IEEE Transactions on Cognitive and Developmental Systems