
IEEE Robotics and Automation Letters: Latest Publications

Correction To: “Design Models and Performance Analysis for a Novel Shape Memory Alloy-Actuated Wearable Hand Exoskeleton for Rehabilitation”
IF 4.6 | CAS Zone 2 (Computer Science) | Q2 (Robotics) | Pub Date: 2024-11-20 | DOI: 10.1109/LRA.2024.3495353
Elio Matteo Curcio;Francesco Lago;Giuseppe Carbone
Presents corrections to the article “Design Models and Performance Analysis for a Novel Shape Memory Alloy-Actuated Wearable Hand Exoskeleton for Rehabilitation”.
Citations: 0
A Diffusion-Based Data Generator for Training Object Recognition Models in Ultra-Range Distance
IF 4.6 | CAS Zone 2 (Computer Science) | Q2 (Robotics) | Pub Date: 2024-11-14 | DOI: 10.1109/LRA.2024.3498774
Eran Bamani;Eden Nissinman;Lisa Koenigsberg;Inbar Meir;Avishai Sintov
Object recognition, commonly performed by a camera, is a fundamental requirement for robots to complete complex tasks. Some tasks require recognizing objects far from the robot's camera. A challenging example is Ultra-Range Gesture Recognition (URGR) in human-robot interaction, where the user exhibits directive gestures at a distance of up to 25 m from the robot. However, training a model to recognize barely visible objects located at ultra-range requires the exhaustive collection of a large number of labeled samples. Generating synthetic training datasets is a recent remedy for the lack of real-world data, but existing generators fail to properly replicate the realistic visual characteristics of distant objects in images. In this letter, we propose the Diffusion in Ultra-Range (DUR) framework, based on a diffusion model, to generate labeled images of distant objects in various scenes. The DUR generator receives a desired distance and class (e.g., gesture) and outputs a corresponding synthetic image. We apply DUR to train a URGR model on directive gestures, in which fine details of the gesturing hand are challenging to distinguish. Compared with other types of generative models, DUR shows superiority both in fidelity and in the recognition success rate of the resulting URGR model. More importantly, training a DUR model on a limited amount of real data and then using it to generate synthetic data for training a URGR model outperforms training the URGR model directly on real data. The synthetic-data-based URGR model is also demonstrated in gesture-based direction of a ground robot.
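The letter is summarized above without code; as a rough sketch of the interface it describes, the snippet below conditions a toy denoiser on a target distance and gesture class and runs a simplified reverse-diffusion loop. All module names, sizes, and the update rule are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only, not the authors' code: a denoiser conditioned on
# target distance and gesture class, in the spirit of the DUR generator.
import torch
import torch.nn as nn

class ConditionalDenoiser(nn.Module):
    def __init__(self, num_classes=6, embed_dim=64, img_channels=3):
        super().__init__()
        self.class_emb = nn.Embedding(num_classes, embed_dim)  # gesture class
        self.dist_proj = nn.Linear(1, embed_dim)               # distance in meters
        self.time_proj = nn.Linear(1, embed_dim)               # diffusion timestep
        self.net = nn.Sequential(                              # toy stand-in for a UNet
            nn.Conv2d(img_channels + embed_dim, 64, 3, padding=1),
            nn.SiLU(),
            nn.Conv2d(64, img_channels, 3, padding=1),
        )

    def forward(self, x_t, t, distance, cls):
        # Fuse the conditioning signals and broadcast them over the image plane.
        cond = self.class_emb(cls) + self.dist_proj(distance) + self.time_proj(t)
        cond_map = cond[:, :, None, None].expand(-1, -1, x_t.shape[2], x_t.shape[3])
        return self.net(torch.cat([x_t, cond_map], dim=1))     # predicted noise

@torch.no_grad()
def sample(model, distance_m, cls_id, steps=50, size=64):
    x = torch.randn(1, 3, size, size)
    d, c = torch.tensor([[distance_m]]), torch.tensor([cls_id])
    for i in reversed(range(steps)):
        t = torch.tensor([[i / steps]])
        eps = model(x, t, d, c)
        x = x - eps / steps  # crude Euler update standing in for a real DDPM scheduler
    return x

img = sample(ConditionalDenoiser(), distance_m=25.0, cls_id=2)  # a 25 m gesture image
```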
Citations: 0
Position Prediction for Space Teleoperation With SAO-CNN-BiGRU-Attention Algorithm
IF 4.6 | CAS Zone 2 (Computer Science) | Q2 (Robotics) | Pub Date: 2024-11-14 | DOI: 10.1109/LRA.2024.3498700
Keli Wu;Haifei Chen;Lijun Li;Zhengxiong Liu;Haitao Chang
Robot position is a crucial information flow for space teleoperation, and time delay makes its sending and reception effectively asynchronous, greatly degrading telepresence. To address this issue, this letter investigates position prediction for space teleoperation and proposes a Snow Ablation Optimization (SAO)-CNN-BiGRU-Attention prediction algorithm. Through prediction, spatiotemporal synchronization of position information is achieved, thereby improving telepresence. First, based on a bilateral active-estimation delay control framework, the CNN-BiGRU-Attention model is introduced into position prediction for space teleoperation: the CNN captures the spatial feature relationships of past position information, while the BiGRU perceives its dynamic changes and an attention mechanism focuses on key features, ultimately ensuring the accuracy of the prediction model. However, hyperparameter selection for the CNN-BiGRU-Attention model directly affects its prediction efficiency, and manual selection obviously cannot guarantee optimality. To solve this problem, the SAO algorithm is introduced into the hyperparameter selection, using its dual-population mechanism and flexible position-update equation to autonomously identify the optimal model hyperparameters and ensure optimal prediction efficiency. Finally, the effectiveness of the SAO-CNN-BiGRU-Attention algorithm is verified through comparative simulation experiments.
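As a sketch of the backbone this abstract describes (a CNN for spatial features, a BiGRU for dynamics, attention over key timesteps), the snippet below maps a window of past 3-D positions to the next position. All layer sizes are illustrative assumptions; they are also the kind of hyperparameters the letter tunes with SAO rather than by hand.

```python
# Illustrative sketch only: CNN for spatial features, BiGRU for dynamics, and
# additive attention over timesteps; layer sizes are assumptions.
import torch
import torch.nn as nn

class CnnBiGruAttention(nn.Module):
    def __init__(self, in_dim=3, conv_ch=32, hidden=64):
        super().__init__()
        self.cnn = nn.Conv1d(in_dim, conv_ch, kernel_size=3, padding=1)
        self.bigru = nn.GRU(conv_ch, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)      # scores each timestep
        self.head = nn.Linear(2 * hidden, in_dim)

    def forward(self, x):                          # x: (batch, window, 3) past positions
        h = torch.relu(self.cnn(x.transpose(1, 2))).transpose(1, 2)
        h, _ = self.bigru(h)                       # (batch, window, 2*hidden)
        w = torch.softmax(self.attn(h), dim=1)     # attention over the window
        ctx = (w * h).sum(dim=1)                   # weighted summary of the window
        return self.head(ctx)                      # predicted next position

pred = CnnBiGruAttention()(torch.randn(8, 20, 3))  # 8 windows of 20 past samples
```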
Citations: 0
A Rigid-Flexible-Soft Coupled Dexterous Hand With Sliding Tactile Perception and Feedback
IF 4.6 | CAS Zone 2 (Computer Science) | Q2 (Robotics) | Pub Date: 2024-11-13 | DOI: 10.1109/LRA.2024.3497721
Rui Chen;Haixiang Zhang;Shamiao Zhou;Xiangjian Xu;Zean Yuan;Huijiang Wang;Jun Luo
The human hand is capable of executing a wide range of complex movements due to its biomechanical structure and skin sensing system. Designing an anthropomorphic hand that mimics the biomechanical structure of the human hand and incorporates skin sensing presents a long-term challenge in the field of robotics. In this letter, we propose a structural design concept that combines rigid, flexible, and soft components, and we design a rigid-flexible-soft coupled dexterous hand based on this concept. To enhance the dexterous hand's adaptivity to its environment, we also develop a soft piezoresistive tactile module inspired by human skin and mount it on the fingertips to detect sliding states. We further design an integrated system for dexterous manipulation including sensing, actuation, and control, based on the concept of embodied intelligence, aiming to achieve closed-loop control of the dexterous hand. This letter provides a reliable structure and control strategy that enrich the perceptual abilities of dexterous hands and enable their application in unstructured environments.
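A minimal sketch of the sliding-perception feedback loop described above follows, assuming slip shows up as high-frequency variation in the piezoresistive fingertip signal; the signal model, threshold, and gain are hypothetical, not the authors' design.

```python
# Illustrative sketch only: a hypothetical slip detector and grip-force update.
# It assumes slip appears as high-frequency variation in the piezoresistive
# signal; the threshold and gain are invented for illustration.
import numpy as np

def detect_slip(readings, window=10, threshold=0.05):
    """Flag sliding when recent signal variation exceeds a threshold."""
    return np.asarray(readings[-window:]).std() > threshold

def grip_step(grip_force, readings, gain=0.1, max_force=5.0):
    if detect_slip(readings):
        grip_force = min(grip_force + gain, max_force)  # tighten until sliding stops
    return grip_force
```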
Citations: 0
Integrated Grasping Controller Leveraging Optical Proximity Sensors for Simultaneous Contact, Impact Reduction, and Force Control
IF 4.6 | CAS Zone 2 (Computer Science) | Q2 (Robotics) | Pub Date: 2024-11-13 | DOI: 10.1109/LRA.2024.3497726
Shunsuke Tokiwa;Hikaru Arita;Yosuke Suzuki;Kenji Tahara
Grasping an unknown object is difficult for robot hands. When the characteristics of the object are unknown, it is hard to plan the speed and width with which the fingers should close. In this letter, we propose a method that realizes three functions: simultaneous finger contact, impact reduction, and contact force control, which together enable effective grasping of an unknown object. We accomplish this using a control framework called multiple virtual dynamics-based control, proposed in a previous study. The advantage of this control is that multiple functions can be realized without switching control laws. The previous study achieved two of the functions, impact reduction and contact force control, with two layers of impedance control applied independently to individual fingers. In this letter, a new form of virtual dynamics that treats multiple fingers comprehensively is introduced, which enables simultaneous contact without compromising the other two functions. This research provides a method to achieve delicate grasping using proximity sensors.
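The building block referenced here is impedance control, which imposes virtual mass-damper-spring dynamics on a fingertip. The snippet below is a minimal single-finger sketch with toy parameters; the letter's coupled multi-finger virtual dynamics are not reproduced.

```python
# Illustrative sketch only: a single layer of fingertip impedance control with
# toy parameter values.
import numpy as np

M, D, K = 0.05, 2.0, 50.0   # virtual mass, damping, stiffness (illustrative)
dt = 1e-3                   # control period in seconds

def impedance_step(x, v, x_des, f_ext):
    # Impose M*a + D*v + K*(x - x_des) = f_ext and integrate one step.
    a = (f_ext - D * v - K * (x - x_des)) / M
    v = v + a * dt
    x = x + v * dt
    return x, v
```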
Citations: 0
Single-Motor-Driven (4 + 2)-Fingered Robotic Gripper Capable of Expanding the Workable Space in the Extremely Confined Environment
IF 4.6 | CAS Zone 2 (Computer Science) | Q2 (Robotics) | Pub Date: 2024-11-13 | DOI: 10.1109/LRA.2024.3497753
Toshihiro Nishimura;Keisuke Akasaka;Subaru Ishikawa;Tetsuyou Watanabe
This letter proposes a novel robotic gripper that can expand the workable space in a target environment to pick up objects from confined spaces. The proposed gripper is most effective for retrieving objects from deformable environments, such as taking an object out of a drawstring bag, or for extracting target objects located behind surrounding objects. It achieves both workspace expansion and grasping motion using only a single motor. The gripper is equipped with four outer fingers for expanding the environment and two inner fingers for grasping an object. The inner and outer fingers move in different directions for their respective functions of grasping and spatial expansion. To realize these two different finger movements, a novel self-motion switching mechanism is developed that switches between feed-screw and rack-and-pinion functions according to the magnitude of the force applied to the inner fingers. This letter presents the mechanism design of the developed gripper, including the self-motion switching mechanism and the actuation strategy for expanding the workable space. A mechanical analysis is also presented, and its results are validated experimentally. Moreover, an automatic object-picking system using the developed gripper is constructed to evaluate it.
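The switching behaviour can be pictured as a simple force-threshold rule, sketched below with a hypothetical threshold; the actual mechanism realizes this mechanically rather than in software.

```python
# Conceptual sketch only: the switching rule as a force threshold on the inner
# fingers. The 1.5 N value is hypothetical.
F_SWITCH = 1.5  # N

def select_mode(inner_finger_force):
    # Below the threshold the transmission acts as a feed screw (expansion);
    # above it, as a rack and pinion (grasping).
    return "grasp" if inner_finger_force >= F_SWITCH else "expand"
```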
Citations: 0
Differentiable-Optimization Based Neural Policy for Occlusion-Aware Target Tracking
IF 4.6 | CAS Zone 2 (Computer Science) | Q2 (Robotics) | Pub Date: 2024-11-13 | DOI: 10.1109/LRA.2024.3497717
Houman Masnavi;Arun Kumar Singh;Farrokh Janabi-Sharifi
We propose a learned probabilistic neural policy for safe, occlusion-free target tracking. The core novelty of our work stems from the structure of our policy network, which combines generative modeling based on a Conditional Variational Autoencoder (CVAE) with differentiable optimization layers. The weights of the CVAE network and the parameters of the differentiable optimization can be learned end-to-end from demonstration trajectories. We improve on the state-of-the-art (SOTA) in the following respects. First, we show that our learned policy outperforms existing SOTA in occlusion/collision avoidance capability and computation time. Second, we present an extensive ablation showing how the different components of our learning pipeline contribute to the overall tracking task. We also demonstrate the real-time performance of our approach on resource-constrained hardware such as the NVIDIA Jetson TX2. Finally, our learned policy can also be viewed as a reactive planner for navigation in highly cluttered environments.
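A minimal sketch of the CVAE half of such a policy follows; all dimensions are arbitrary, and the differentiable optimization layers the letter stacks on the decoder output are omitted.

```python
# Illustrative sketch only: the CVAE portion of such a policy. Dimensions are
# arbitrary; the differentiable optimization layers are omitted.
import torch
import torch.nn as nn

class CvaePolicy(nn.Module):
    def __init__(self, obs_dim=32, latent_dim=8, traj_dim=30):
        super().__init__()
        self.latent_dim = latent_dim
        self.enc = nn.Linear(obs_dim + traj_dim, 2 * latent_dim)  # q(z | obs, traj)
        self.dec = nn.Sequential(nn.Linear(obs_dim + latent_dim, 64), nn.ReLU(),
                                 nn.Linear(64, traj_dim))         # p(traj | obs, z)

    def forward(self, obs, traj):                 # training pass on demonstrations
        mu, logvar = self.enc(torch.cat([obs, traj], -1)).chunk(2, -1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()      # reparameterization
        return self.dec(torch.cat([obs, z], -1)), mu, logvar      # train with recon + KL

    @torch.no_grad()
    def act(self, obs):                           # test time: sample z from the prior
        z = torch.randn(obs.shape[0], self.latent_dim)
        return self.dec(torch.cat([obs, z], -1))

traj = CvaePolicy().act(torch.randn(1, 32))       # one sampled candidate trajectory
```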
Citations: 0
Visual-Inertial Localization Leveraging Skylight Polarization Pattern Constraints
IF 4.6 | CAS Zone 2 (Computer Science) | Q2 (Robotics) | Pub Date: 2024-11-11 | DOI: 10.1109/LRA.2024.3495375
Zhenhua Wan;Peng Fu;Kunfeng Wang;Kaichun Zhao
In this letter, we develop a tightly coupled polarization-visual-inertial localization system that utilizes naturally occurring polarized skylight to provide a global heading. We introduce a focal-plane polarization camera with negligible instantaneous field-of-view error to collect polarized skylight. We then design a robust method for determining heading from polarized skylight and construct a globally stable heading constraint. In particular, this constraint compensates for the heading unobservability present in standard VINS. In addition to the standard sparse visual feature measurements used in VINS, polarization heading residuals are constructed and co-optimized in a tightly coupled VINS update. An adaptive fusion strategy is designed to correct cumulative drift. Outdoor real-world experiments show that the proposed method outperforms the state-of-the-art VINS-Fusion in localization accuracy, improving on it by 22% in a wooded campus environment.
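A sketch of how a polarization-derived heading might enter the estimator as a residual follows, assuming a zenith-pointing polarization camera and a sun azimuth known from ephemeris; the letter's actual residual construction and fusion strategy may differ.

```python
# Illustrative sketch only: a polarization-heading residual of the kind that
# could sit alongside visual and inertial factors.
import numpy as np

def wrap(a):
    return (a + np.pi) % (2 * np.pi) - np.pi

def heading_from_aop(aop_meas, sun_azimuth):
    # Rayleigh sky model at the zenith: the angle of polarization is roughly
    # perpendicular to the solar meridian (the 180-degree ambiguity is ignored).
    return wrap(sun_azimuth + np.pi / 2 - aop_meas)

def heading_residual(yaw_state, aop_meas, sun_azimuth):
    # Added alongside visual and inertial residuals in the tightly coupled update.
    return wrap(yaw_state - heading_from_aop(aop_meas, sun_azimuth))
```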
Citations: 0
XBG: End-to-End Imitation Learning for Autonomous Behaviour in Human-Robot Interaction and Collaboration
IF 4.6 | CAS Zone 2 (Computer Science) | Q2 (Robotics) | Pub Date: 2024-11-11 | DOI: 10.1109/LRA.2024.3495577
Carlos Cardenas-Perez;Giulio Romualdi;Mohamed Elobaid;Stefano Dafarra;Giuseppe L'Erario;Silvio Traversaro;Pietro Morerio;Alessio Del Bue;Daniele Pucci
This letter presents XBG (eXteroceptive Behaviour Generation), a multimodal end-to-end Imitation Learning (IL) system for whole-body autonomous humanoid robots used in real-world Human-Robot Interaction (HRI) scenarios. The main contribution is an architecture for learning HRI behaviours with a data-driven approach. A diverse dataset is collected via teleoperation, covering multiple HRI scenarios such as handshaking, handwaving, payload reception, walking, and walking with a payload. After synchronizing, filtering, and transforming the data, we show how to train the presented Deep Neural Networks (DNN), integrating exteroceptive and proprioceptive information to help the robot understand both its environment and its actions. The robot takes in sequences of images (RGB and depth) and joint-state information to react accordingly. By fusing multimodal signals over time, the model enables autonomous capabilities in a robotic platform. The models are evaluated on their success rates in the mentioned HRI scenarios and are deployed on the ergoCub humanoid robot. XBG achieves success rates between 60% and 100%, even when tested in unseen environments.
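As a sketch of the multimodal fusion this abstract describes (RGB-D image sequences plus joint states, fused over time), the snippet below is an illustrative assumption about layer choices, not the XBG architecture.

```python
# Illustrative sketch only: a policy that fuses RGB-D frames with joint states
# over a short history.
import torch
import torch.nn as nn

class MultimodalPolicy(nn.Module):
    def __init__(self, joint_dim=32, action_dim=32, feat=128):
        super().__init__()
        self.vision = nn.Sequential(               # 4 channels: RGB + depth
            nn.Conv2d(4, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, feat))
        self.proprio = nn.Linear(joint_dim, feat)  # joint-state encoder
        self.temporal = nn.GRU(2 * feat, feat, batch_first=True)
        self.head = nn.Linear(feat, action_dim)

    def forward(self, frames, joints):
        # frames: (B, T, 4, H, W); joints: (B, T, joint_dim)
        B, T = frames.shape[:2]
        v = self.vision(frames.flatten(0, 1)).view(B, T, -1)
        fused = torch.cat([v, self.proprio(joints)], dim=-1)
        h, _ = self.temporal(fused)                # fuse the modalities over time
        return self.head(h[:, -1])                 # action for the latest step

act = MultimodalPolicy()(torch.randn(2, 8, 4, 64, 64), torch.randn(2, 8, 32))
```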
Citations: 0
Think Step by Step: Chain-of-Gesture Prompting for Error Detection in Robotic Surgical Videos
IF 4.6 | CAS Zone 2 (Computer Science) | Q2 (Robotics) | Pub Date: 2024-11-11 | DOI: 10.1109/LRA.2024.3495452
Zhimin Shao;Jialang Xu;Danail Stoyanov;Evangelos B. Mazomenos;Yueming Jin
Despite advancements in robotic systems and surgical data science, ensuring safe execution in robot-assisted minimally invasive surgery (RMIS) remains challenging. Current methods for surgical error detection typically involve two parts: identifying gestures and then detecting errors within each gesture clip. These methods often overlook the rich contextual and semantic information inherent in surgical videos, and their performance is limited by the reliance on accurate gesture identification. Inspired by chain-of-thought prompting in natural language processing, this letter presents a novel, real-time, end-to-end error detection framework, Chain-of-Gesture (COG) prompting, which integrates contextual information from surgical videos step by step. It comprises two reasoning modules that simulate expert surgeons' decision-making: a Gestural-Visual Reasoning module using transformer and attention architectures for gesture prompting, and a Multi-Scale Temporal Reasoning module employing a multi-stage temporal convolutional network with slow and fast paths for temporal information extraction. We validate our method on the JIGSAWS dataset and show improvements over the state-of-the-art: a 4.6% higher F1 score, 4.6% higher accuracy, and a 5.9% higher Jaccard index, with an average frame processing time of 6.69 milliseconds. This demonstrates our approach's potential to enhance RMIS safety and the efficacy of surgical education.
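A sketch of the slow-and-fast-path idea follows, using one full-rate and one dilated 1-D convolution over gesture-prompted features; the letter's module is a multi-stage temporal convolutional network, and all sizes here are assumptions.

```python
# Illustrative sketch only: a two-path temporal module with a full-rate "fast"
# convolution and a dilated "slow" convolution.
import torch
import torch.nn as nn

class TwoPathTemporal(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.fast = nn.Conv1d(dim, dim, 3, padding=1)              # fine motion cues
        self.slow = nn.Conv1d(dim, dim, 3, padding=2, dilation=2)  # longer context
        self.classify = nn.Conv1d(2 * dim, 2, 1)                   # error / no-error

    def forward(self, feats):                 # feats: (B, dim, T), gesture-prompted
        f = torch.relu(self.fast(feats))
        s = torch.relu(self.slow(feats))
        return self.classify(torch.cat([f, s], dim=1))  # per-frame logits (B, 2, T)

logits = TwoPathTemporal()(torch.randn(1, 64, 100))
```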
Citations: 0