Pub Date: 2024-09-30. DOI: 10.1109/TCDS.2024.3470068
Wei Li;Boling Hu;Aiguo Song;Kaizhu Huang
In the field of adversarial games, existing decision-making algorithms primarily rely on reinforcement learning, which can theoretically adapt to diverse scenarios through trial and error. However, these algorithms often face the challenges of low effectiveness and slow convergence in complex wargame environments. Inspired by how human commanders make decisions, this article proposes a novel method named full integration of hierarchical decision-making and tactical knowledge (HDMTK). This method comprises an upper reinforcement learning module and a lower multiagent reinforcement learning (MARL) module. To enable agents to learn cooperative strategies efficiently, HDMTK separates the whole task into explainable subtasks and devises corresponding subgoals for shaping the online rewards based on tactical knowledge. Experimental results on the wargame simulation platform “MiaoSuan” show that, compared to advanced MARL methods, HDMTK exhibits superior performance and faster convergence in complex scenarios.
{"title":"HDMTK: Full Integration of Hierarchical Decision-Making and Tactical Knowledge in Multiagent Adversarial Games","authors":"Wei Li;Boling Hu;Aiguo Song;Kaizhu Huang","doi":"10.1109/TCDS.2024.3470068","DOIUrl":"https://doi.org/10.1109/TCDS.2024.3470068","url":null,"abstract":"In the field of adversarial games, existing decision-making algorithms primarily rely on reinforcement learning, which can theoretically adapt to diverse scenarios through trial and error. However, these algorithms often face the challenges of low effectiveness and slow convergence in complex wargame environments. Inspired by how human commanders make decisions, this article proposes a novel method named full integration of hierarchical decision-making and tactical knowledge (HDMTK). This method comprises an upper reinforcement learning module and a lower multiagent reinforcement learning (MARL) module. To enable agents to efficiently learn the cooperative strategy, in HDMTK, we separate the whole task into explainable subtasks and devise their corresponding subgoals for shaping the online rewards based on tactical knowledge. Experimental results on the wargame simulation platform “MiaoSuan” show that, compared to the advanced MARL methods, HDMTK exhibits superior performance and faster convergence in the complex scenarios.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"17 3","pages":"465-479"},"PeriodicalIF":5.0,"publicationDate":"2024-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144213528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-26. DOI: 10.1109/TCDS.2024.3468712
Nikos Piperigkos;Christos Anagnostopoulos;Aris S. Lalos;Petros Kapsalas;Duong Van Nguyen
Simultaneous localization and mapping (SLAM) for positioning of robots and autonomous systems (RASs) and mapping of their surrounding environments is a task of major significance in various applications. However, the main disadvantage of traditional SLAM is that the deployed backend modules suffer from accumulated error caused by sharp viewpoint changes, diverse weather conditions, etc. As such, to improve the localization accuracy of the moving agents, we propose a cost-effective and loosely coupled relocalization backend, deployed on top of original SLAM algorithms, which exploits the topologies of poses and landmarks generated by camera, LiDAR, or mechanical sensors to couple and fuse them. This novel fusion scheme enhances the decision-making ability and adaptability of autonomous systems, akin to human cognition, by combining graph Laplacian processing with Kalman filters. Initially designed for cooperative localization of active road users, this approach optimally combines multisensor information through graph signal processing and Bayesian estimation for self-positioning. Experiments focused on evaluating how our approach improves the positioning of autonomous ground vehicles, as prominent examples of RASs equipped with sensing capabilities, in challenging outdoor environments. More specifically, experiments were carried out using the CARLA simulator to generate different types of driving trajectories and environmental conditions, as well as real automotive data captured by an operating vehicle in Langen, Germany. The evaluation study demonstrates that localization accuracy is greatly improved in terms of both overall trajectory error and loop-closing accuracy for each sensor fusion configuration.
{"title":"Graph-Laplacian-Processing-Based Multimodal Localization Backend for Robots and Autonomous Systems","authors":"Nikos Piperigkos;Christos Anagnostopoulos;Aris S. Lalos;Petros Kapsalas;Duong Van Nguyen","doi":"10.1109/TCDS.2024.3468712","DOIUrl":"https://doi.org/10.1109/TCDS.2024.3468712","url":null,"abstract":"Simultaneous localization and mapping (SLAM) for positioning of robots and autonomous systems (RASs) and mapping of their surrounding environments is a task of major significance in various applications. However, the main disadvantage of traditional SLAM is that the deployed backend modules suffer from accumulative error caused by sharp viewpoint changes, diverse weather conditions, etc. As such, to improve the localization accuracy of the moving agents, we propose a cost-effective and loosely coupled relocalization backend, deployed on top of original SLAM algorithms, which exploits the topologies of poses and landmarks generated either by camera, LiDAR, or mechanical sensors, to couple and fuse them. This novel fusion scheme enhances the decision-making ability and adaptability of autonomous systems, akin to human cognition, by elaborating graph Laplacian processing concept with Kalman filters. Initially designed for cooperative localization of active road users, this approach optimally combines multisensor information through graph signal processing and Bayesian estimation for self-positioning. Conducted experiments were focused on evaluating how our approach can improve the positioning of autonomous ground vehicles, as prominent examples of RASs equipped with sensing capabilities, in challenging outdoor environments. More specifically, experiments were carried out using the CARLA simulator to generate different types of driving trajectories and environmental conditions, as well as real automotive data captured by an operating vehicle in Langen, Germany. Evaluation study demonstrates that localization accuracy is greatly improved both in terms of overall trajectory error as well as loop closing accuracy for each sensor fusion configuration.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"17 2","pages":"436-453"},"PeriodicalIF":5.0,"publicationDate":"2024-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143761405","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-24. DOI: 10.1109/TCDS.2024.3466553
Dawoon Jung;Chengyan Gu;Junmin Park;Joono Cheong
Human–robot collaboration (HRC) has recently attracted increasing attention as a vital component of next-generation automated manufacturing and assembly tasks, yet physical human–robot interaction (pHRI)—which is an inevitable component of collaboration—is often limited to rudimentary touches. This article therefore proposes a deep-learning-based pHRI method that utilizes predefined types of human touch gestures as intuitive communicative signs for collaborative tasks. To this end, a touch gesture network model is first designed upon the framework of the gated recurrent unit (GRU) network, which accepts a set of ground-truth dynamic responses (energy change, generalized momentum, and external joint torque) of robot manipulators under the action of known types of touch gestures and learns to predict the five representative touch gesture types and the corresponding link for an arbitrary touch gesture input. After training the GRU-based touch gesture model using a collected dataset of dynamic responses of a robot manipulator, a total of 35 output classes (five gesture types across seven links) are recognized with 96.94% accuracy. Experimental results on recognition accuracy, analyzed by touch gesture type and gesture strength, validate the performance and reveal the characteristics of the proposed touch gesture model. An example of an IKEA chair assembly task is also presented to demonstrate a collaborative task using the proposed touch gestures. By developing the proposed pHRI method and demonstrating its applicability, we expect that this method can help position physical interaction as one of the key modalities for communication in real-world HRC applications.
{"title":"Touch Gesture Recognition-Based Physical Human–Robot Interaction for Collaborative Tasks","authors":"Dawoon Jung;Chengyan Gu;Junmin Park;Joono Cheong","doi":"10.1109/TCDS.2024.3466553","DOIUrl":"https://doi.org/10.1109/TCDS.2024.3466553","url":null,"abstract":"Human–robot collaboration (HRC) has recently attracted increasing attention as a vital component of next-generation automated manufacturing and assembly tasks, yet physical human–robot interaction (pHRI)—which is an inevitable component of collaboration—is often limited to rudimentary touches. This article therefore proposes a deep-learning-based pHRI method that utilizes predefined types of human touch gestures as intuitive communicative signs for collaborative tasks. To this end, a touch gesture network model is first designed upon the framework of the gated recurrent unit (GRU) network, which accepts a set of ground-truth dynamic responses (energy change, generalized momentum, and external joint torque) of robot manipulators under the action of known types of touch gestures and learns to predict the five representative touch gesture types and the corresponding link toward a random touch gesture input. After training the GRU-based touch gesture model using a collected dataset of dynamic responses of a robot manipulator, a total of 35 outputs (five gesture types with seven links each) is recognized with 96.94% accuracy. The experimental results of recognition accuracy correlated with the touch gesture types, and their strength results are shown to validate the performance and disclose the characteristics of the proposed touch gesture model. An example of an IKEA chair assembly task is also presented to demonstrate a collaborative task using the proposed touch gestures. By developing the proposed pHRI method and demonstrating its applicability, we expect that this method can help position physical interaction as one of the key modalities for communication in real-world HRC applications.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"17 2","pages":"421-435"},"PeriodicalIF":5.0,"publicationDate":"2024-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143761496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-23. DOI: 10.1109/TCDS.2024.3465602
SangEun Lee;Seoyun Kim;Yubeen Lee;Jufeng Yang;Eunil Park
Image emotion analysis has gained notable attention owing to the growing importance of computationally modeling human emotions. Most previous studies have focused on classifying the feelings evoked by an image into predefined emotion categories. Because such categorical approaches cannot capture the ambiguity and complexity of human emotions, recent studies have instead taken dimensional approaches. However, far fewer dimensional datasets are available for model training than categorical datasets. We propose four types of frameworks that use categorical datasets to predict emotion values for a given image in the valence–arousal (VA) space. Specifically, our proposed framework is trained to predict continuous emotion values under the supervision of categorical labels. Extensive experiments demonstrate that our approach's predictions correlate positively with the actual VA values of the dimensional dataset. In addition, our framework improves further when a small amount of dimensional data is available for fine-tuning.
{"title":"Enhancing Dimensional Image Emotion Detection With a Low-Resource Dataset via Two-Stage Training","authors":"SangEun Lee;Seoyun Kim;Yubeen Lee;Jufeng Yang;Eunil Park","doi":"10.1109/TCDS.2024.3465602","DOIUrl":"https://doi.org/10.1109/TCDS.2024.3465602","url":null,"abstract":"Image emotion analysis has gained notable attention owing to the growing importance of computationally modeling human emotions. Most previous studies have focused on classifying the feelings evoked by an image into predefined emotion categories. Compared with these categorical approaches which cannot address the ambiguity and complexity of human emotions, recent studies have taken dimensional approaches to address these problems. However, there is still a limitation in that the number of dimensional datasets is significantly smaller for model training, compared with many available categorical datasets. We propose four types of frameworks that use categorical datasets to predict emotion values for a given image in the valence–arousal (VA) space. Specifically, our proposed framework is trained to predict continuous emotion values under the supervision of categorical labels. Extensive experiments demonstrate that our approach showed a positive correlation with the actual VA values of the dimensional dataset. In addition, our framework improves further when a small number of dimensional datasets are available for the fine-tuning process.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"17 3","pages":"455-464"},"PeriodicalIF":5.0,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144213666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-18. DOI: 10.1109/tcds.2024.3463194
Xiaoyu Wu, Jiale Liang, Yiang Yu, Guoxin Li, Gary G. Yen, Haoyong Yu
{"title":"Embodied Perception Interaction, and Cognition for Wearable Robotics: A Survey","authors":"Xiaoyu Wu, Jiale Liang, Yiang Yu, Guoxin Li, Gary G. Yen, Haoyong Yu","doi":"10.1109/tcds.2024.3463194","DOIUrl":"https://doi.org/10.1109/tcds.2024.3463194","url":null,"abstract":"","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"48 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142269922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-17. DOI: 10.1109/tcds.2024.3462651
Zhendong Guo, Na Dong, Zehui Zhang, Xiaoming Mai, Donghui Li
{"title":"CS-SLAM: A lightweight semantic SLAM method for dynamic scenarios","authors":"Zhendong Guo, Na Dong, Zehui Zhang, Xiaoming Mai, Donghui Li","doi":"10.1109/tcds.2024.3462651","DOIUrl":"https://doi.org/10.1109/tcds.2024.3462651","url":null,"abstract":"","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"49 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142266493","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-17. DOI: 10.1109/TCDS.2024.3462709
Qinrui Ling;Aiping Liu;Taomian Mi;Piu Chan;Xun Chen
Subcortical regions can be functionally organized into connectivity networks and communicate extensively with the cortex via reciprocal connections. However, most current research on subcortical networks ignores these interconnections, and networks of the whole brain are of high dimensionality and computational complexity. In this article, we propose a novel cofluctuation-guided subcortical connectivity network construction model based on edge-centric functional connectivity (FC). It is capable of extracting the cofluctuations between the cortex and subcortex and constructing dynamic subcortical networks based on these interconnections. Blind source separation approaches with domain knowledge are designed for dimensionality reduction and feature extraction. Great reproducibility and reliability were achieved when applying our model to two sessions of functional magnetic resonance imaging (fMRI) data. Cortical areas having synchronous communications with the subcortex were detected, which traditional node-centric FC was unable to reveal. Significant alterations in connectivity patterns were observed in fMRI of subjects with and without Parkinson's disease, and these alterations were further correlated with clinical scores. These validations demonstrate that our model provides a promising strategy for brain network construction, exhibiting great potential in clinical practice.
"Edge-Centric Functional-Connectivity-Based Cofluctuation-Guided Subcortical Connectivity Network Construction," IEEE Transactions on Cognitive and Developmental Systems, vol. 17, no. 2, pp. 390-399.
Pub Date: 2024-09-17. DOI: 10.1109/TCDS.2024.3462452
Saydul Akbar Murad;Nick Rahimi
The conversion of brain activity into text using electroencephalography (EEG) has gained significant traction in recent years. Many researchers are working to develop new models to decode EEG signals into text form. Although this area has shown promising developments, it still faces numerous challenges that necessitate further improvement. It is important to outline this area's recent developments and future research directions to provide a comprehensive understanding of the current state of technology, guide future research efforts, and enhance the effectiveness and accessibility of EEG-to-text systems. In this review article, we thoroughly summarize the progress in EEG-to-text conversion. First, we discuss how EEG-to-text technology has evolved and what challenges the field still faces. Second, we review existing techniques used in this field. This includes methods for collecting EEG data, the steps to process these signals, and the development of systems capable of translating these signals into coherent text. We conclude with potential future research directions, emphasizing the need for enhanced accuracy, reduced system constraints, and the exploration of novel applications across varied sectors. By addressing these aspects, this review aims to contribute to developing more accessible and effective brain–computer interface (BCI) technology for a broader user base.
{"title":"Unveiling Thoughts: A Review of Advancements in EEG Brain Signal Decoding Into Text","authors":"Saydul Akbar Murad;Nick Rahimi","doi":"10.1109/TCDS.2024.3462452","DOIUrl":"10.1109/TCDS.2024.3462452","url":null,"abstract":"The conversion of brain activity into text using electroencephalography (EEG) has gained significant traction in recent years. Many researchers are working to develop new models to decode EEG signals into text form. Although this area has shown promising developments, it still faces numerous challenges that necessitate further improvement. It is important to outline this area's recent developments and future research directions to provide a comprehensive understanding of the current state of technology, guide future research efforts, and enhance the effectiveness and accessibility of EEG-to-text systems. In this review article, we thoroughly summarize the progress in EEG-to-text conversion. First, we talk about how EEG-to-text technology has grown and what problems the field still faces. Second, we discuss existing techniques used in this field. This includes methods for collecting EEG data, the steps to process these signals, and the development of systems capable of translating these signals into coherent text. We conclude with potential future research directions, emphasizing the need for enhanced accuracy, reduced system constraints, and the exploration of novel applications across varied sectors. By addressing these aspects, this review aims to contribute to developing more accessible and effective brain–computer interface (BCI) technology for a broader user base.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"17 1","pages":"61-76"},"PeriodicalIF":5.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142266494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-16. DOI: 10.1109/TCDS.2024.3461335
Haoyu Zhu;Xiaorui Liu;Hang Su;Wei Wang;Jinpeng Yu
This article focuses on the multiple-object selection problem for robots in social scenarios and proposes a novel methodology composed of quantitative social intention evaluation and gaze behavior control. For social scenarios containing various persons and multimodal social cues, a combination of the entropy weight method (EWM) and the gray correlation-order preference by similarity to the ideal solution (GC-TOPSIS) model is proposed to fuse the multimodal social cues and evaluate the social intention of candidates. According to the quantitative evaluation of social intention, a robot can generate an interaction priority among multiple social candidates. To realize this interaction selection mechanism at the behavior level, an optimal control framework composed of a model predictive controller (MPC) and an online Gaussian process (GP) observer is employed to drive the eye–head coordinated gaze behavior of the robot. Experiments conducted on the Xiaopang robot illustrate the effectiveness of the proposed methodology. This work enables robots to generate social behavior based on quantitative intention perception, which offers the potential to explore the sensory principles and biomechanical mechanisms underlying human–robot interaction and to broaden the application of robots in social scenarios.
{"title":"The Methodology of Quantitative Social Intention Evaluation and Robot Gaze Behavior Control in Multiobjects Scenario","authors":"Haoyu Zhu;Xiaorui Liu;Hang Su;Wei Wang;Jinpeng Yu","doi":"10.1109/TCDS.2024.3461335","DOIUrl":"https://doi.org/10.1109/TCDS.2024.3461335","url":null,"abstract":"This article focuses on the multiple objects selection problem for the robot in social scenarios, and proposes a novel methodology composed of quantitative social intention evaluation and gaze behavior control. For the social scenarios containing various persons and multimodal social cues, a combination of the entropy weight method (EWM) and gray correlation-order preference by similarity to the ideal solution (GC-TOPSIS) model is proposed to fuse the multimodal social cues, and evaluate the social intention of candidates. According to the quantitative evaluation of social intention, a robot can generate the interaction priority among multiple social candidates. To ensure this interaction selection mechanism in behavior level, an optimal control framework composed of model predictive controller (MPC) and online Gaussian process (GP) observer is employed to drive the eye-head coordinated gaze behavior of robot. Through the experiments conducted on the Xiaopang robot, the availability of the proposed methodology can be illustrated. This work enables robots to generate social behavior based on quantitative intention perception, which could bring the potential to explore the sensory principles and biomechanical mechanism underlying the human-robot interaction, and broaden the application of robot in the social scenario.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"17 2","pages":"400-409"},"PeriodicalIF":5.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143761402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-16. DOI: 10.1109/TCDS.2024.3460750
Kunjira Kingphai;Yashar Moshfeghi
Mental workload (MWL) assessment is crucial in information systems (IS), impacting task performance, user experience, and system effectiveness. Deep learning offers promising techniques for MWL classification using electroencephalography (EEG), which monitors cognitive states dynamically and unobtrusively. Our research explores deep learning's potential and challenges in EEG-based MWL classification, focusing on training inputs, cross-validation methods, and classification problem types. We identify five types of EEG-based MWL classification: within-subject, cross-subject, cross-session, cross-task, and combined cross-task and cross-subject. Success depends on managing dataset uniqueness, session and task variability, and artifact removal. Despite the potential, real-world applications are limited. Enhancements are necessary for self-reporting methods, universal preprocessing standards, and MWL assessment accuracy. Specifically, reported accuracies are inflated when data are shuffled before splitting into training and test sets, which disrupts the temporal sequence of EEG signals. In contrast, methods such as time-series cross-validation and the leave-session-out approach better preserve temporal integrity, offering more accurate model performance evaluations. Utilizing deep learning for EEG-based MWL assessment could significantly improve IS functionality and adaptability in real time based on user cognitive states.
{"title":"Mental Workload Assessment Using Deep Learning Models From EEG Signals: A Systematic Review","authors":"Kunjira Kingphai;Yashar Moshfeghi","doi":"10.1109/TCDS.2024.3460750","DOIUrl":"https://doi.org/10.1109/TCDS.2024.3460750","url":null,"abstract":"Mental workload (MWL) assessment is crucial in information systems (IS), impacting task performance, user experience, and system effectiveness. Deep learning offers promising techniques for MWL classification using electroencephalography (EEG), which monitors cognitive states dynamically and unobtrusively. Our research explores deep learning's potential and challenges in EEG-based MWL classification, focusing on training inputs, cross-validation methods, and classification problem types. We identify five types of EEG-based MWL classification: within-subject, cross subject, cross session, cross task, and combined cross task and cross subject. Success depends on managing dataset uniqueness, session and task variability, and artifact removal. Despite the potential, real-world applications are limited. Enhancements are necessary for self-reporting methods, universal preprocessing standards, and MWL assessment accuracy. Specifically, inaccuracies are inflated when data are shuffled before splitting to train and test sets, disrupting EEG signals’ temporal sequence. In contrast, methods such as the time-series cross validation and leave-session-out approach better preserve temporal integrity, offering more accurate model performance evaluations. Utilizing deep learning for EEG-based MWL assessment could significantly improve IS functionality and adaptability in real time based on user cognitive states.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"17 1","pages":"40-60"},"PeriodicalIF":5.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143361081","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}