IEEE Transactions on Cognitive and Developmental Systems: Latest Articles
The Methodology of Quantitative Social Intention Evaluation and Robot Gaze Behavior Control in Multiobjects Scenario
IF 5, Zone 3 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2024-09-16. DOI: 10.1109/TCDS.2024.3461335
Haoyu Zhu;Xiaorui Liu;Hang Su;Wei Wang;Jinpeng Yu
This article addresses the multiple-object selection problem for robots in social scenarios and proposes a novel methodology composed of quantitative social intention evaluation and gaze behavior control. For social scenarios containing multiple persons and multimodal social cues, a combination of the entropy weight method (EWM) and a gray correlation technique for order preference by similarity to the ideal solution (GC-TOPSIS) model is proposed to fuse the multimodal social cues and evaluate the social intention of candidates. Based on this quantitative evaluation of social intention, the robot can generate an interaction priority among multiple social candidates. To realize this interaction selection mechanism at the behavior level, an optimal control framework composed of a model predictive controller (MPC) and an online Gaussian process (GP) observer is employed to drive the robot's eye-head coordinated gaze behavior. Experiments conducted on the Xiaopang robot illustrate the effectiveness of the proposed methodology. This work enables robots to generate social behavior based on quantitative intention perception, which could help explore the sensory principles and biomechanical mechanisms underlying human-robot interaction and broaden the application of robots in social scenarios.
Volume 17, Issue 2, pp. 400-409
Citations: 0
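The cue-fusion step described in the abstract above can be illustrated with a minimal sketch. It assumes classical TOPSIS rather than the paper's gray-correlation variant (GC-TOPSIS), and the cue names, scores, and function names are hypothetical, chosen only to show how entropy weights and a closeness score could yield an interaction priority.

```python
import math

def entropy_weights(matrix):
    """Entropy weight method: cues whose values differ more across candidates get larger weights.

    matrix: rows = candidates (at least 2), columns = non-negative benefit-type cue scores.
    """
    n, m = len(matrix), len(matrix[0])
    divergences = []
    for j in range(m):
        col = [row[j] for row in matrix]
        total = sum(col)
        # Share of each candidate within cue j (uniform if the column is all zeros).
        p = [v / total if total > 0 else 1.0 / n for v in col]
        # Normalised Shannon entropy in [0, 1]; 0 * log(0) is treated as 0.
        e = -sum(pi * math.log(pi) for pi in p if pi > 0) / math.log(n)
        divergences.append(1.0 - e)
    s = sum(divergences)
    return [d / s if s > 0 else 1.0 / m for d in divergences]

def topsis_scores(matrix, weights):
    """Classical TOPSIS closeness to the ideal solution (higher is better)."""
    m = len(matrix[0])
    # Vector-normalise each column, then apply the cue weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0 for j in range(m)]
    v = [[weights[j] * row[j] / norms[j] for j in range(m)] for row in matrix]
    best = [max(r[j] for r in v) for j in range(m)]
    worst = [min(r[j] for r in v) for j in range(m)]
    scores = []
    for r in v:
        d_best = math.dist(r, best)
        d_worst = math.dist(r, worst)
        denom = d_best + d_worst
        scores.append(d_worst / denom if denom > 0 else 0.5)
    return scores

# Hypothetical cue scores per candidate: [gaze toward robot, proximity, speech activity]
cues = [
    [0.9, 0.8, 0.7],  # candidate 0
    [0.2, 0.5, 0.1],  # candidate 1
    [0.6, 0.3, 0.4],  # candidate 2
]
weights = entropy_weights(cues)
scores = topsis_scores(cues, weights)
# Interaction priority: candidate indices sorted by descending closeness score.
priority = sorted(range(len(cues)), key=lambda i: scores[i], reverse=True)
```

In this toy input, candidate 0 has the highest value on every cue, so it coincides with the ideal solution and is ranked first.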
Mental Workload Assessment Using Deep Learning Models From EEG Signals: A Systematic Review
IF 5, Zone 3 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2024-09-16. DOI: 10.1109/TCDS.2024.3460750
Kunjira Kingphai;Yashar Moshfeghi
Mental workload (MWL) assessment is crucial in information systems (IS), impacting task performance, user experience, and system effectiveness. Deep learning offers promising techniques for MWL classification using electroencephalography (EEG), which monitors cognitive states dynamically and unobtrusively. Our research explores deep learning's potential and challenges in EEG-based MWL classification, focusing on training inputs, cross-validation methods, and classification problem types. We identify five types of EEG-based MWL classification: within-subject, cross-subject, cross-session, cross-task, and combined cross-task and cross-subject. Success depends on managing dataset uniqueness, session and task variability, and artifact removal. Despite this potential, real-world applications remain limited. Improvements are needed in self-reporting methods, universal preprocessing standards, and MWL assessment accuracy. In particular, performance estimates become unreliable when data are shuffled before being split into training and test sets, because shuffling disrupts the temporal sequence of EEG signals. In contrast, methods such as time-series cross-validation and the leave-session-out approach better preserve temporal integrity and give more accurate evaluations of model performance. Using deep learning for EEG-based MWL assessment could significantly improve IS functionality and real-time adaptability based on user cognitive states.
Volume 17, Issue 1, pp. 40-60
Citations: 0
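The contrast the abstract above draws between shuffled splits and leave-session-out evaluation can be sketched as follows. This is an illustration under assumed data, not the review's own protocol: the session labels, window counts, and function names are hypothetical.

```python
import random

def leave_session_out(session_ids):
    """Yield (held_out_session, train_idx, test_idx), holding out one whole session per fold."""
    for held_out in sorted(set(session_ids)):
        train = [i for i, s in enumerate(session_ids) if s != held_out]
        test = [i for i, s in enumerate(session_ids) if s == held_out]
        yield held_out, train, test

def shuffled_split(n_samples, test_fraction, seed=0):
    """Random split that ignores recording order (shown only for contrast)."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_test = int(n_samples * test_fraction)
    return idx[n_test:], idx[:n_test]

# Hypothetical recording: 3 sessions of 4 consecutive EEG windows each.
session_ids = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]

folds = list(leave_session_out(session_ids))
# In every fold, no session contributes windows to both train and test.
no_leak = all(
    {session_ids[i] for i in train}.isdisjoint({session_ids[i] for i in test})
    for _, train, test in folds
)

train_idx, test_idx = shuffled_split(len(session_ids), test_fraction=0.25)
# Sessions whose windows land on both sides of the shuffled split.
shared = {session_ids[i] for i in train_idx} & {session_ids[i] for i in test_idx}
```

With the shuffled split, neighbouring windows from the same session end up in both sets (`shared` is non-empty), which is the kind of temporal leakage the review warns about; the leave-session-out folds avoid it by construction.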
Fatigue State Recognition System for Miners Based on a Multimodal Feature Extraction and Fusion Framework
IF 5, Zone 3 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2024-09-16. DOI: 10.1109/TCDS.2024.3461713
Hongguang Pan;Shiyu Tong;Xuqiang Wei;Bingyang Teng
Fatigue is widely recognized as a primary contributor to accidents in the mining industry. Proactively recognizing fatigue states in miners before they start work can effectively establish a safety boundary for both miners' safety and coal mine production. Therefore, this study designs a fatigue state recognition system for miners based on a multimodal feature extraction and fusion framework. First, the system is equipped with various sensors, a core processor, and a display, which respectively collect physiological and facial data, process them, and present the fatigue state; the physiological data include electrocardiogram (ECG), electrodermal activity (EDA), blood pressure (BP), blood oxygen saturation (SpO$_2$), and skin temperature (SKT). Second, based on the multimodal feature extraction and fusion framework, after the necessary preprocessing steps, the system extracts physiological features by time- and frequency-domain analysis, extracts facial features with ResNeXt-50 and a gated recurrent unit (GRU), and fuses the multiple features with Transformer+. Finally, we test the system and build a multimodal dataset in the comprehensive laboratory for coal-related programs of Xi’an University of Science and Technology, and the results demonstrate an average accuracy of 93.15%.
Volume 17, Issue 2, pp. 410-420
Citations: 0
Neighborhood-Curiosity-Based Exploration in Multiagent Reinforcement Learning
IF 5, Zone 3 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2024-09-13. DOI: 10.1109/TCDS.2024.3460368
Shike Yang;Ziming He;Jingchen Li;Haobin Shi;Qingbing Ji;Kao-Shing Hwang;Xianshan Li
Efficient exploration in cooperative multiagent reinforcement learning remains difficult in complex tasks. In this article, we propose a novel multiagent collaborative exploration method called neighborhood-curiosity-based exploration (NCE), by which agents can explore not only novel states but also new partnerships. Concretely, we use the attention mechanism in graph convolutional networks to perform a weighted summation of features from neighbors. The calculated attention weights can be regarded as an embodiment of the relationships among agents. Then, we use the prediction errors of the aggregated features as intrinsic rewards to facilitate exploration. When agents encounter novel states or new partnerships, NCE produces large prediction errors and therefore large intrinsic rewards. In addition, in multiagent systems agents are influenced mainly by their neighbors and interact directly only with them. Exploring partnerships between agents and their neighbors therefore enables agents to capture their most important cooperative relations with other agents. As a result, NCE can effectively promote collaborative exploration even in environments with a large number of agents. Our experimental results show that NCE achieves significant performance improvements on the challenging StarCraft II micromanagement (SMAC) benchmark.
Volume 17, Issue 2, pp. 379-389
Citations: 0
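The two ingredients named in the abstract above, an attention-weighted sum of neighbor features and a prediction error used as an intrinsic reward, can be sketched in a toy form. This is not the authors' network: the fixed dot-product attention scores, the feature vectors, and the stand-in predictor outputs are assumptions made purely for illustration, whereas the paper uses learned attention inside graph convolutional networks.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def aggregate_neighbors(own, neighbors):
    """Attention-weighted sum of neighbor features (dot-product scores are an assumption)."""
    weights = softmax([dot(own, nb) for nb in neighbors])
    dim = len(own)
    agg = [sum(w * nb[k] for w, nb in zip(weights, neighbors)) for k in range(dim)]
    return agg, weights

def intrinsic_reward(predicted, actual):
    """Squared prediction error of the aggregated feature."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))

# Hypothetical 2-D features for one agent and its two neighbors.
own = [1.0, 0.0]
neighbors = [[1.0, 0.0], [0.0, 1.0]]
agg, attn = aggregate_neighbors(own, neighbors)

# Stand-in predictor outputs: a familiar situation is predicted exactly, a novel one poorly.
familiar_reward = intrinsic_reward(agg, agg)
novel_reward = intrinsic_reward([0.0, 0.0], agg)
```

A poorly predicted aggregated feature yields the larger reward, which is the mechanism by which novel states or new neighbor relationships would be favored during exploration.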
Progressive-Learning-Based Assist-as-Needed Control for Ankle Rehabilitation
IF 5, Zone 3 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2024-09-06. DOI: 10.1109/TCDS.2024.3455795
Kun Qian;Zhenhong Li;Yihui Zhao;Jie Zhang;Xianwen Kong;Samit Chakrabarty;Zhiqiang Zhang;Sheng Quan Xie
This article proposes a progressive-learning-based assist-as-needed (AAN) control scheme for ankle rehabilitation. To quantify training performance, a fuzzy logic (FL) system is established to generate a holistic metric based on multiple kinematic and dynamic indicators. Subsequently, a cost function that contains both the tracking error and the robot stiffness is constructed. A novel learning scheme is then proposed to enhance subjects' engagement, leveraging the FL metric to maintain a declining trend in the robot's stiffness. The system stability is analyzed using Lyapunov theory, the ultimate bounds of the control are specified, and the effects of parameter tuning are discussed. Experiments are conducted on an ankle robot, and the minimal assist-as-needed (MAAN) scheme is adopted for comparison. In a training session consisting of 11 trials, the quantitative performance, individual error convergence, progressive stiffness learning, and human-robot interaction are evaluated. Within the first eight trials, under the progressive AAN and the MAAN, the robot assistive torques decrease on average by 13.45% and 20.25%, while subjects' active torques increase by 56.53% and 58.39%, respectively. During the late stage of training, the progressive AAN further improves these two criteria by 9.44% and 6.29%, whereas the MAAN partially loses subjects' participation (active torques are reduced by 36.38%) because motion adaptation occurs.
Volume 17, Issue 2, pp. 328-339
Citations: 0
CCANet: Cross-Modality Comprehensive Feature Aggregation Network for Indoor Scene Semantic Segmentation
IF 5, Zone 3 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2024-09-06. DOI: 10.1109/TCDS.2024.3455356
Zhang Zihao;Yang Yale;Hou Huifang;Meng Fanman;Zhang Fan;Xie Kangzhan;Zhuang Chunsheng
The semantic segmentation of indoor scenes based on RGB and depth information has been an enduring research topic. However, how to fully exploit the complementarity of multimodal features and fuse them efficiently remains a challenge. To address this challenge, we propose an innovative cross-modal comprehensive feature aggregation network (CCANet) to achieve high-precision semantic segmentation of indoor scenes. In this method, we first propose a bidirectional cross-modality feature rectification (BCFR) module, in which the two modalities complement each other and noise is removed in both channel and spatial correlations. Next, an adaptive criss-cross attention fusion (CAF) module is designed to realize multistage deep multimodal feature fusion. Finally, a multisupervision strategy is applied to accurately learn additional details of the target, guiding the gradual refinement of the segmentation maps. Thorough experiments on two openly accessible datasets of indoor scenes demonstrate that CCANet exhibits outstanding performance and robustness in aggregating RGB and depth features.
Volume 17, Issue 2, pp. 366-378
Citations: 0
A Behavioral Decision-Making Model of Learning and Memory for Mobile Robot Triggered by Curiosity
IF 5, Zone 3 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2024-09-05. DOI: 10.1109/TCDS.2024.3454779
Dongshu Wang;Qi Liu;Xulin Gao;Lei Liu
Learning and memorizing behavioral decisions during environmental cognition in order to guide future decisions is an important aspect of research and application in mobile robotics. Traditional rule-based behavioral decision approaches have difficulty adapting to complex and changing environments. Offline decision-making approaches adapt poorly to dynamic environments, while behavioral decision-making based on reinforcement learning relies on data acquisition, and the learned knowledge cannot guide mobile robots to adapt quickly to new environments. To address this issue, this article proposes a brain-inspired behavioral decision model that performs incremental learning by simulating the logical structure of memory classification in the brain, as well as the memory conversion mechanisms of the hippocampus, prefrontal cortex, and anterior cingulate cortex. The model interacts with the environment through semisupervised learning and learns the current decision online, simulating human memory function so that mobile robots can adapt to changing environments. In addition, an internal reward mechanism driven by curiosity is designed. It simulates the reinforcing role of curiosity in human memory by encoding memories of unfamiliar behavioral decisions and consolidating memories of frequently made ones, which improves the learning and memory capacity of mobile robots in environmental cognition. The feasibility of the proposed model is verified by physical experiments in different environments.
Volume 17, Issue 2, pp. 352-365
Citations: 0
Brain Compensatory Mechanisms During the Prolonged Cognitive Task: fNIRS and Eye-Tracking Study
IF 5, Zone 3 (Computer Science), Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE. Pub Date: 2024-09-04. DOI: 10.1109/TCDS.2024.3453590
A. A. Badarin;V. M. Antipov;V. V. Grubov;A. V. Andreev;E. N. Pitsik;S. A. Kurkin;A. E. Hramov
The problem of maintaining cognitive performance under fatigue is crucial in fields requiring high concentration and efficiency to successfully complete critical tasks. In this context, the study of compensatory mechanisms that help the brain overcome fatigue is particularly important. This research investigates the correlations between physiological, behavioral, and subjective measures while considering the impact of fatigue on the performance of working memory tasks. A combined approach of functional near-infrared spectroscopy (fNIRS) and eye-tracking was used to reconstruct brain functional networks based on fNIRS data and analyze them in terms of network characteristics such as global clustering coefficient and global efficiency. Results showed a significant increase in subjective fatigue but no significant change in performance during the experiment. The study confirmed that despite fatigue, subjects can maintain performance through compensatory mechanisms, increasing mental effort, with the level of compensation depending on the task's complexity. Furthermore, the study showed that compensatory effort maintains the efficiency of the frontoparietal network, and the degree of compensatory effort is related to the difference in response times between high- and low-complexity tasks.
在需要高度集中和高效率才能成功完成关键任务的领域中,在疲劳状态下保持认知表现的问题至关重要。在这种情况下,研究帮助大脑克服疲劳的补偿机制尤为重要。本研究在考虑疲劳对工作记忆任务表现的影响的同时,探讨了生理、行为和主观测量之间的相关性。采用功能近红外光谱(fNIRS)与眼动追踪相结合的方法,基于fNIRS数据重构脑功能网络,并对其全局聚类系数、全局效率等网络特征进行分析。实验结果显示,受试者主观疲劳程度明显增加,但表现无明显变化。该研究证实,尽管疲劳,受试者可以通过补偿机制保持表现,增加精神努力,补偿水平取决于任务的复杂性。此外,研究表明,代偿努力维持了额顶叶网络的效率,代偿努力的程度与高、低复杂性任务的反应时间差异有关。
IEEE Transactions on Cognitive and Developmental Systems, vol. 17, no. 2, pp. 303-314.
Citations: 0
Pretrained Dynamics Learning of Numerous Heterogeneous Robots and Gen2Real Transfer
IF 5 | Zone 3, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-04 | DOI: 10.1109/TCDS.2024.3454240
Dengpeng Xing;Yiming Yang;Jiale Li
Acquiring dynamics is vital for robotic learning and serves as the foundation for planning and control. This article addresses two essential inquiries: How can one develop a model that encompasses a vast array of diverse robotic dynamics? Is it possible to establish a model that alleviates the burdens of data collection and domain expertise necessary for constructing specific robot models? We explore the dynamics present in a dataset containing numerous serial articulated robots and introduce a novel concept, “Gen2Real,” to transfer simulated, generalized models to physical, specialized robots. By randomizing dynamics parameters, topological configurations, and model dimensions, we generate an extensive dataset that corresponds to varying properties, connections, and quantities of robotic links. A structure adapted from the generative pretrained transformer is employed to approximate the dynamics of a multitude of heterogeneous robots. Within Gen2Real, we transfer the pretrained model to a target robot using distillation to enable real-time computation. The results corroborate the superiority of the proposed method in terms of accurately learning an immense scope of robotic dynamics, managing commonly encountered disturbances, and exhibiting versatility in transferring to distinct robots.
IEEE Transactions on Cognitive and Developmental Systems, vol. 17, no. 2, pp. 315-327.
Citations: 0
Biomimetic Spiking Neural Network Based on Monolayer 2-D Synapse With Short-Term Plasticity for Auditory Brainstem Processing
IF 5 | Zone 3, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-29 | DOI: 10.1109/TCDS.2024.3450915
Jieun Kim;Peng Zhou;Unbok Wi;Bomin Joo;Donguk Choi;Myeong-Lok Seol;Sravya Pulavarthi;Linfeng Sun;Heejun Yang;Woo Jong Yu;Jin-Woo Han;Sung-Mo Kang;Bai-Sun Kong
In the sound localization of species, short-term depression (STD) plays an important role in maintaining interaural timing difference (ITD) sensitivity. In this article, a biomimetic spiking neural network (SNN) utilizing 2-D synaptic devices for mimicking biological sound localization is presented. A two-terminal monolayer device is used as the artificial synapse, whose temporal conductance change mimics the STD of a synapse. Alpha synaptic current and leaky integrate-and-fire (LIF) neuron models are used for realistic cortical operation. Lateral inhibition and superior olivary nucleus (SON) are adopted to increase the acuteness, to compensate for the interaural level difference (ILD)-induced disturbance, and to enlarge the sound intensity range. By combining solid-state STD synapses and bio-plausible cortical models with an ITD-based coincidence detection mechanism to mimic the auditory brainstem processing, our SNN achieved sound localization with a human-level resolution of 1°.
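To make the interplay of short-term depression and a leaky integrate-and-fire neuron more concrete, here is a rough sketch of one LIF neuron driven through a depressing synapse. It is not the authors' device or network model. The resource-depletion rule, the single-exponential synaptic current (used in place of the alpha current named in the abstract), and every parameter value and name are assumptions chosen for illustration.

```python
def simulate_lif_std(spike_times_ms, t_end_ms=200.0, dt=0.1,
                     tau_m=10.0, v_rest=0.0, v_th=1.0,
                     tau_syn=2.0, tau_rec=100.0, use=0.5, weight=3.0):
    """Simulate an LIF neuron behind a depressing synapse (forward Euler).

    Returns (output spike times in ms, efficacy of each input spike).
    All parameters are hypothetical, in arbitrary but consistent units.
    """
    steps = int(round(t_end_ms / dt))
    input_steps = {int(round(t / dt)) for t in spike_times_ms}
    v = v_rest      # membrane potential
    i_syn = 0.0     # synaptic current
    r = 1.0         # fraction of synaptic resources available
    out_spikes = []
    efficacies = []
    for k in range(steps):
        # Resources recover toward 1 with time constant tau_rec.
        r += dt * (1.0 - r) / tau_rec
        if k in input_steps:
            # Each input spike uses a fraction of what is available,
            # so closely spaced spikes have progressively less effect.
            released = use * r
            r -= released
            i_syn += weight * released
            efficacies.append(released)
        # Exponentially decaying synaptic current.
        i_syn -= dt * i_syn / tau_syn
        # Leaky integration of the membrane potential.
        v += dt * (-(v - v_rest) + i_syn) / tau_m
        if v >= v_th:
            out_spikes.append(k * dt)
            v = v_rest  # reset after firing
    return out_spikes, efficacies


if __name__ == "__main__":
    # Regular 100 Hz input train: efficacy should shrink spike by spike.
    train = [10.0 * i for i in range(1, 11)]
    spikes, eff = simulate_lif_std(train)
    print([round(e, 3) for e in eff])
    print(spikes)
```

In this toy model the depression makes the response depend more on the timing of recent input than on its sustained rate, which is one intuition for why STD can help preserve timing sensitivity.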
IEEE Transactions on Cognitive and Developmental Systems, vol. 17, no. 2, pp. 247-258.
Citations: 0