Pub Date : 2024-04-05 DOI: 10.3389/fnbot.2024.1385778
Dekang Zhu, Qianyi Bu, Zhongpan Zhu, Yujie Zhang, Zhipeng Wang
The combination of lifelong learning algorithms with autonomous intelligent systems (AIS) is gaining popularity because it can enhance AIS performance, but existing surveys of the area are insufficient. A systematic analysis of research on lifelong learning for AIS is therefore needed to better understand current progress in the field. This paper presents a thorough review and analysis of work on integrating lifelong learning algorithms with AIS. Specifically, we investigate the diverse applications of lifelong learning algorithms in AIS domains such as autonomous driving, anomaly detection, robots, and emergency management, and assess their impact on AIS performance and reliability. Based on the literature review, we summarize the challenging problems encountered in lifelong learning for AIS. We also discuss advanced and innovative developments in lifelong learning algorithms for AIS, offering insights and guidance to researchers in this rapidly evolving field.
{"title":"Advancing autonomy through lifelong learning: a survey of autonomous intelligent systems","authors":"Dekang Zhu, Qianyi Bu, Zhongpan Zhu, Yujie Zhang, Zhipeng Wang","doi":"10.3389/fnbot.2024.1385778","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1385778","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"18 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-04-05","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"140583646","RegionNum":4,"RegionCategory":"Computer Science"}
Pub Date : 2024-04-04 DOI: 10.3389/fnbot.2024.1351700
Raphael Rätz, Alexandre L. Ratschat, Nerea Cividanes-Garcia, Gerard M. Ribbers, Laura Marchal-Crespo
In stroke rehabilitation, simple robotic devices hold the potential to increase the training dosage in group therapies and to enable continued therapy at home after hospital discharge. However, we identified a lack of portable and cost-effective devices that not only focus on improving motor functions but also address sensory deficits. Thus, we designed a minimally-actuated hand training device that incorporates active grasping movements and passive pronosupination, complemented by a rehabilitative game with meaningful haptic feedback. Following a human-centered design approach, we conducted a usability study with 13 healthy participants, including three therapists. In a simulated unsupervised environment, the naive participants had to set up and use the device based on written instructions. Our mixed-methods approach included quantitative data from performance metrics, standardized questionnaires, and eye tracking, alongside qualitative feedback from semi-structured interviews. The study results highlighted the device's overall ease of setup and use, as well as its realistic haptic feedback. The eye-tracking analysis further suggested that participants felt safe during usage. Moreover, the study provided crucial insights for future improvements such as a more intuitive and comfortable wrist fixation, more natural pronosupination movements, and easier-to-follow instructions. Our research underscores the importance of continuous testing in the development process and offers significant contributions to the design of user-friendly, unsupervised neurorehabilitation technologies to improve sensorimotor stroke rehabilitation.
{"title":"Designing for usability: development and evaluation of a portable minimally-actuated haptic hand and forearm trainer for unsupervised stroke rehabilitation","authors":"Raphael Rätz, Alexandre L. Ratschat, Nerea Cividanes-Garcia, Gerard M. Ribbers, Laura Marchal-Crespo","doi":"10.3389/fnbot.2024.1351700","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1351700","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"47 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"140602043","RegionNum":4,"RegionCategory":"Computer Science"}
Pub Date : 2024-04-04 DOI: 10.3389/fnbot.2024.1348029
Jan C. L. Lau, Katja Mombaur
With the global geriatric population expected to reach 1.5 billion by 2050, different assistive technologies have been developed to tackle age-associated movement impairments. Lower-limb robotic exoskeletons have the potential to support frail older adults while promoting activities of daily living, but the need for crutches may be challenging for this population. Crutches aid safety and stability, but moving in an exoskeleton with them can feel unnatural, and coordination can be difficult. Frail older adults may not have sufficient arm strength to use them, and prolonged usage can lead to upper limb joint deterioration. The research presented in this paper contributes to a more detailed study of crutch-less exoskeleton use, analyzing in particular the most challenging motion, sit-to-stand (STS). It combines motion capture and optimal control approaches to evaluate and compare the STS dynamics with the TWIN exoskeleton with and without crutches. The results show trajectories that are significantly faster than the exoskeleton's default trajectory, and identify the motor torques needed for full and partial STS assistance. With the TWIN exoskeleton's existing motors being able to support 112 Nm (hips) and 88 Nm (knees) total, assuming an ideal contribution from the device and user, the older adult would need to contribute a total of 8 Nm (hips) and 50 Nm (knees). For TWIN to provide full STS assistance, it would require new motors that can exert at least 121 Nm (hips) and 140 Nm (knees) total. The presented optimal control approaches can be replicated on other exoskeletons to determine the torques required with their mass distributions. Future improvements are discussed, and the results presented lay the groundwork for eliminating crutches when moving with an exoskeleton.
{"title":"Can lower-limb exoskeletons support sit-to-stand motions in frail elderly without crutches? A study combining optimal control and motion capture","authors":"Jan C. L. Lau, Katja Mombaur","doi":"10.3389/fnbot.2024.1348029","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1348029","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"2 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"140583319","RegionNum":4,"RegionCategory":"Computer Science"}
Pub Date : 2024-03-27 DOI: 10.3389/fnbot.2024.1379906
Qiang Fu, Tianhong Luo, TingQiong Cui, Xiangyu Ma, Shuang Liang, Yi Huang, Shengxue Wang
Introduction
Periodicity, self-excitation, and time ratio asymmetry are the fundamental characteristics of the human gait. To imitate these characteristics, a pattern generator with four degrees of freedom is proposed based on cardioid oscillators developed by the authors.
Method
The proposed pattern generator is composed of four coupled cardioid oscillators, which are self-excited and have asymmetric time ratios. The oscillators are connected to one another through coupling factors. The dynamic behaviors of the proposed oscillators, such as phase locking, time ratio, and self-excitation, are analyzed via simulations by employing the harmonic balance method. Moreover, the simulated trajectories are compared with natural joint trajectories measured in experiments.
Results and discussion
Simulation and experimental results show that the behaviors of the proposed pattern generator are similar to those of the natural lower limb: the simulated trajectories are self-excited without any additional inputs, have asymmetric time ratios, and are phase-locked with one another. Moreover, the proposed pattern generator can be applied as the reference model for the lower limb exoskeleton controlling algorithm to produce self-adjusted reference trajectories.
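The phase locking mentioned above can be illustrated with a much simpler stand-in. The sketch below does not implement the authors' cardioid oscillator, whose equations are not given in this abstract. It uses four Kuramoto-style phase oscillators with hypothetical target phase offsets, coupling strength, and initial phases, and it does not model time ratio asymmetry.

```python
import math

def simulate_phase_locking(offsets, initial_phases, omega=2.0 * math.pi,
                           coupling=1.0, dt=0.01, steps=5000):
    """Euler-integrate coupled phase oscillators and return the final phases.

    Model (a generic Kuramoto-style stand-in, not the cardioid oscillator):
    d(theta_i)/dt = omega + coupling * sum_j sin(theta_j - theta_i - (offsets[j] - offsets[i]))
    """
    theta = list(initial_phases)
    n = len(theta)
    for _ in range(steps):
        rates = []
        for i in range(n):
            # Each neighbor pulls oscillator i toward the desired relative phase.
            pull = sum(
                math.sin(theta[j] - theta[i] - (offsets[j] - offsets[i]))
                for j in range(n)
            )
            rates.append(omega + coupling * pull)
        theta = [t + dt * r for t, r in zip(theta, rates)]
    return theta

def wrap(angle):
    """Wrap an angle to the interval (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

# Hypothetical target offsets for four joints (values chosen for illustration only).
offsets = [0.0, math.pi, 0.5 * math.pi, 1.5 * math.pi]
final = simulate_phase_locking(offsets, initial_phases=[0.0, 1.0, 2.5, 4.0])

# Deviation of each realized phase difference from its target difference.
errors = [wrap((final[i] - final[0]) - (offsets[i] - offsets[0])) for i in range(4)]
```

After the transient, every entry of `errors` is close to zero: the oscillators keep rotating at the common frequency while holding fixed relative phases, which is the sense of "phase-locked" used here.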
{"title":"Cardioid oscillator-based pattern generator for imitating the time-ratio-asymmetrical behavior of the lower limb exoskeleton","authors":"Qiang Fu, Tianhong Luo, TingQiong Cui, Xiangyu Ma, Shuang Liang, Yi Huang, Shengxue Wang","doi":"10.3389/fnbot.2024.1379906","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1379906","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"33 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-03-27","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"140313497","RegionNum":4,"RegionCategory":"Computer Science"}
Pub Date : 2024-03-21 DOI: 10.3389/fnbot.2024.1341750
Johan Engström, Ran Wei, Anthony D. McDonald, Alfredo Garcia, Matthew O'Kelly, Leif Johnson
Understanding adaptive human driving behavior, in particular how drivers manage uncertainty, is of key importance for developing simulated human driver models that can be used in the evaluation and development of autonomous vehicles. However, existing traffic psychology models of adaptive driving behavior either lack computational rigor or only address specific scenarios and/or behavioral phenomena. While models developed in the fields of machine learning and robotics can effectively learn adaptive driving behavior from data, due to their black box nature, they offer little or no explanation of the mechanisms underlying the adaptive behavior. Thus, generalizable, interpretable, computational models of adaptive human driving behavior are still rare. This paper proposes such a model based on active inference, a behavioral modeling framework originating in computational neuroscience. The model offers a principled solution to how humans trade progress against caution through policy selection based on the single mandate to minimize expected free energy. This casts goal-seeking and information-seeking (uncertainty-resolving) behavior under a single objective function, allowing the model to seamlessly resolve uncertainty as a means to obtain its goals. We apply the model in two apparently disparate driving scenarios that require managing uncertainty, (1) driving past an occluding object and (2) visual time-sharing between driving and a secondary task, and show how human-like adaptive driving behavior emerges from the single principle of expected free energy minimization.
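For orientation, the expected free energy objective referred to above is commonly written in the active inference literature as a sum of an epistemic and a pragmatic term. The form below is that textbook decomposition, not necessarily the exact formulation used in the paper. The symbols (policy \pi, hidden states s_\tau, observations o_\tau, prior preferences P(o_\tau), precision \gamma) are assumptions introduced here and do not appear in the abstract.

```latex
G(\pi) = \sum_{\tau} G(\pi, \tau), \qquad
G(\pi, \tau) \approx
  - \underbrace{\mathbb{E}_{Q(o_\tau \mid \pi)}\!\left[
      D_{\mathrm{KL}}\!\left[ Q(s_\tau \mid o_\tau, \pi) \,\|\, Q(s_\tau \mid \pi) \right]
    \right]}_{\text{epistemic value (information-seeking)}}
  - \underbrace{\mathbb{E}_{Q(o_\tau \mid \pi)}\!\left[ \ln P(o_\tau) \right]}_{\text{pragmatic value (goal-seeking)}},
\qquad
Q(\pi) = \sigma\!\left( -\gamma \, G(\pi) \right)
```

Under this reading, a policy that passes an occluding object cautiously can score well because it is expected to reduce uncertainty about hidden states, even if it makes slightly less progress toward preferred observations.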
{"title":"Resolving uncertainty on the fly: modeling adaptive driving behavior as active inference","authors":"Johan Engström, Ran Wei, Anthony D. McDonald, Alfredo Garcia, Matthew O'Kelly, Leif Johnson","doi":"10.3389/fnbot.2024.1341750","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1341750","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"153 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"140199662","RegionNum":4,"RegionCategory":"Computer Science"}
Pub Date : 2024-03-19 DOI: 10.3389/fnbot.2024.1338189
Lun Ge, Xiaoguang Zhou, Yongqiang Li, Yongcong Wang
In real-world scenarios, making navigation decisions for autonomous driving involves a sequential set of steps. These judgments are made based on partial observations of the environment, while the underlying model of the environment remains unknown. A prevalent method for resolving such issues is reinforcement learning, in which the agent acquires knowledge through a succession of rewards in addition to fragmentary and noisy observations. This study introduces an algorithm named deep reinforcement learning navigation via decision transformer (DRLNDT) to address the challenge of enhancing the decision-making capabilities of autonomous vehicles operating in partially observable urban environments. The DRLNDT framework is built around the Soft Actor-Critic (SAC) algorithm. DRLNDT utilizes Transformer neural networks to effectively model the temporal dependencies in observations and actions. This approach aids in mitigating judgment errors that may arise due to sensor noise or occlusion within a given state. The process of extracting latent vectors from high-quality images involves the utilization of a variational autoencoder (VAE). This technique effectively reduces the dimensionality of the state space, resulting in enhanced training efficiency. The multimodal state space consists of vector states, including velocity and position, which the vehicle's intrinsic sensors can readily obtain. Additionally, latent vectors derived from high-quality images are incorporated to facilitate the Agent's assessment of the present trajectory. Experiments demonstrate that DRLNDT may achieve a superior optimal policy without prior knowledge of the environment, detailed maps, or routing assistance, surpassing the baseline technique and other policy methods that lack historical data.
{"title":"Deep reinforcement learning navigation via decision transformer in autonomous driving","authors":"Lun Ge, Xiaoguang Zhou, Yongqiang Li, Yongcong Wang","doi":"10.3389/fnbot.2024.1338189","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1338189","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"117 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-03-19","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"140170744","RegionNum":4,"RegionCategory":"Computer Science"}
Traditional trajectory learning methods based on Imitation Learning (IL) only learn existing trajectory knowledge from human demonstration. As a result, they cannot adapt that knowledge to the task environment by interacting with the environment and fine-tuning the policy. To address this problem, a global trajectory learning method that combines IL with Reinforcement Learning (RL) to adapt the knowledge policy to the environment is proposed. In this paper, IL is used to acquire basic trajectory skills, and the agent then uses RL to explore and exploit policies better suited to the current environment. The basic trajectory skills include the knowledge policy and the time stage information in the whole task space, which help learn the time series of the trajectory and are used to guide the subsequent RL process. Notably, neural networks are not used to model the action policy and the Q value during the RL process. Instead, both are sampled and updated over the whole task space and then transferred to networks after the RL process through Behavior Cloning (BC) to obtain a continuous and smooth global trajectory policy. The feasibility and effectiveness of the method were validated in a custom Gym environment of a flower drawing task. We then executed the learned policy in a real-world robot drawing experiment.
{"title":"Human skill knowledge guided global trajectory policy reinforcement learning method","authors":"Yajing Zang, Pengfei Wang, Fusheng Zha, Wei Guo, Chuanfeng Li, Lining Sun","doi":"10.3389/fnbot.2024.1368243","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1368243","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"495 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-03-15","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"140151072","RegionNum":4,"RegionCategory":"Computer Science"}
Pub Date : 2024-03-11 DOI: 10.3389/fnbot.2024.1375309
Tinghe Hong, Weibing Li, Kai Huang
Introduction
Redundant robots offer greater flexibility compared to non-redundant ones but are susceptible to increased collision risks when the end-effector approaches the robot's own links. Redundant degrees of freedom (DoFs) present an opportunity for collision avoidance; however, selecting an appropriate inverse kinematics (IK) solution remains challenging due to the infinite possible solutions.
Methods
This study proposes a reinforcement learning (RL) enhanced pseudo-inverse approach to address self-collision avoidance in redundant robots. The RL agent is integrated into the redundancy resolution process of a pseudo-inverse method to determine a suitable IK solution for avoiding self-collisions during task execution. Additionally, an improved replay buffer is implemented to enhance the performance of the RL algorithm.
Results
Simulations and experiments validate the effectiveness of the proposed method in reducing the risk of self-collision in redundant robots.
Conclusion
The RL enhanced pseudo-inverse approach presented in this study demonstrates promising results in mitigating self-collision risks in redundant robots, highlighting its potential for enhancing safety and performance in robotic systems.
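The redundancy resolution that the RL agent plugs into can be sketched with the standard pseudo-inverse formula, q_dot = J^+ x_dot + (I - J^+ J) z. This is a minimal sketch, assuming a hypothetical planar 3-link arm with unit link lengths and a 2D position task. The vector z is a fixed placeholder for whatever the agent would select, since the abstract does not say how the agent parameterizes it.

```python
import math

def jacobian(q, lengths=(1.0, 1.0, 1.0)):
    """2x3 position Jacobian of a planar 3-link arm (hypothetical example robot)."""
    # Absolute orientation of each link is the cumulative sum of joint angles.
    a = [q[0], q[0] + q[1], q[0] + q[1] + q[2]]
    J = [[0.0] * 3, [0.0] * 3]
    for j in range(3):
        # Joint j moves every link from j to the tip.
        J[0][j] = -sum(lengths[k] * math.sin(a[k]) for k in range(j, 3))
        J[1][j] = sum(lengths[k] * math.cos(a[k]) for k in range(j, 3))
    return J

def pinv_2x3(J):
    """Right pseudo-inverse J^T (J J^T)^-1; valid only when J has full row rank."""
    A = [[sum(J[i][k] * J[j][k] for k in range(3)) for j in range(2)] for i in range(2)]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    A_inv = [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]
    return [[sum(J[i][r] * A_inv[i][c] for i in range(2)) for c in range(2)] for r in range(3)]

def resolve(J, x_dot, z):
    """Joint velocity q_dot = J^+ x_dot + (I - J^+ J) z."""
    Jp = pinv_2x3(J)
    primary = [sum(Jp[r][c] * x_dot[c] for c in range(2)) for r in range(3)]
    # Null-space projector: motion through N does not move the end-effector.
    N = [[(1.0 if r == c else 0.0) - sum(Jp[r][i] * J[i][c] for i in range(2))
          for c in range(3)] for r in range(3)]
    secondary = [sum(N[r][c] * z[c] for c in range(3)) for r in range(3)]
    return [p + s for p, s in zip(primary, secondary)]

J = jacobian([0.3, 0.5, -0.4])
x_dot = [0.1, -0.2]                              # desired end-effector velocity
q_dot = resolve(J, x_dot, z=[1.0, -2.0, 0.5])    # z: placeholder for the agent's choice
achieved = [sum(J[i][k] * q_dot[k] for k in range(3)) for i in range(2)]
```

Whatever z is chosen, `achieved` matches `x_dot` up to rounding, because the second term lies in the null space of J. That freedom is what leaves room to steer the arm away from self-collision without disturbing the task.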
{"title":"A reinforcement learning enhanced pseudo-inverse approach to self-collision avoidance of redundant robots","authors":"Tinghe Hong, Weibing Li, Kai Huang","doi":"10.3389/fnbot.2024.1375309","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1375309","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"17 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-03-11","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"140313516","RegionNum":4,"RegionCategory":"Computer Science"}
Pub Date : 2024-03-11 DOI: 10.3389/fnbot.2024.1343644
Gongjun Fan, Qing Wang, Gaochao Yang, Pengfei Liu
High-precision navigation and positioning is a fundamental capability that has become indispensable in many fields. However, a single sensor cannot meet navigation requirements across different scenarios. This paper proposes a “plug and play” Vision/IMU/UWB multi-sensor tightly coupled system based on a factor graph. Unlike traditional UWB-based tightly coupled models, the Vision/IMU/UWB tightly coupled model in this study treats the UWB base station coordinates as parameters to be estimated in real time, so the UWB base stations do not need to be pre-calibrated. To address the dynamic changes in sensor availability in a multi-sensor integrated navigation system, and the serious problem that traditional factor graphs have in distributing weights among observation information, this study proposes an adaptive robust factor graph model. Based on redundant measurement information, we propose a novel adaptive estimation model for the UWB ranging covariance, which does not rely on prior information about the system and can adaptively estimate real-time changes in the UWB ranging covariance. The proposed algorithm was extensively tested in real-world scenarios, and the results show that the proposed system outperforms the most advanced combination method in all cases. Compared with visual-inertial odometry based on the factor graph (FG-VIO), the RMSE is improved by 62.83 and 64.26% in scene 1 and by 82.15, 70.32, and 75.29% in scene 2 (non-line-of-sight environment).
{"title":"RFG-TVIU: robust factor graph for tightly coupled vision/IMU/UWB integration","authors":"Gongjun Fan, Qing Wang, Gaochao Yang, Pengfei Liu","doi":"10.3389/fnbot.2024.1343644","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1343644","url":null,"abstract":"High precision navigation and positioning technology, as a fundamental function, is gradually occupying an indispensable position in the various fields. However, a single sensor cannot meet the navigation requirements in different scenarios. This paper proposes a “plug and play” Vision/IMU/UWB multi-sensor tightly-coupled system based on factor graph. The difference from traditional UWB-based tightly-coupled models is that the Vision/IMU/UWB tightly-coupled model in this study uses UWB base station coordinates as parameters for real-time estimation without pre-calibrating UWB base stations. Aiming at the dynamic change of sensor availability in multi-sensor integrated navigation system and the serious problem of traditional factor graph in the weight distribution of observation information, this study proposes an adaptive robust factor graph model. Based on redundant measurement information, we propose a novel adaptive estimation model for UWB ranging covariance, which does not rely on prior information of the system and can adaptively estimate real-time covariance changes of UWB ranging. The algorithm proposed in this study was extensively tested in real-world scenarios, and the results show that the proposed system is superior to the most advanced combination method in all cases. 
Compared with the visual-inertial odometer based on the factor graph (FG-VIO), the RMSE is improved by 62.83 and 64.26% in scene 1 and 82.15, 70.32, and 75.29% in scene 2 (non-line-of-sight environment).","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"8 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140810336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}