
Latest Publications in IEEE Robotics and Automation Letters

Learning Based Estimation of Tool-Tissue Interaction Forces for Stationary and Moving Environments
IF 4.6 CAS Zone 2 (Computer Science) Q2 ROBOTICS Pub Date: 2024-10-30 DOI: 10.1109/LRA.2024.3488400
L. Nowakowski;R. V. Patel
Accurately estimating tool-tissue interaction forces during robotics-assisted minimally invasive surgery is an important aspect of enabling haptics-based teleoperation. By collecting data regarding the state of a robot in a variety of configurations, neural networks can be trained to predict this interaction force. This paper extends existing work in this domain based on collecting one of the largest known ground truth force datasets for stationary as well as moving phantoms that replicate tissue motions found in clinical procedures. Existing methods, and a new transformer-based architecture, are evaluated to demonstrate the domain gap between stationary and moving phantom tissue data and the impact that data scaling has on each architecture's ability to generalize the force estimation task. It was found that temporal networks were more sensitive to the moving domain than single-sample Feed Forward Networks (FFNs) that were trained on stationary tissue data. However, the transformer approach results in the lowest Root Mean Square Error (RMSE) when evaluating networks trained on examples of both stationary and moving phantom tissue samples. The results demonstrate the domain gap between stationary and moving surgical environments and the effectiveness of scaling datasets for increased accuracy of interaction force prediction.
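As a rough illustration of the kind of baseline the abstract mentions (not the authors' implementation), the Python sketch below builds a single-sample feed-forward force estimator and the RMSE metric used for evaluation; the 21-dimensional robot-state input and the layer widths are assumptions made for the example.

import torch
import torch.nn as nn

# Illustrative single-sample FFN force estimator (assumed sizes, not the paper's model).
class FFNForceEstimator(nn.Module):
    def __init__(self, state_dim=21, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),  # predicted Fx, Fy, Fz
        )

    def forward(self, state):
        return self.net(state)

def rmse(pred, target):
    # Root Mean Square Error, the metric reported in the abstract.
    return torch.sqrt(torch.mean((pred - target) ** 2))

model = FFNForceEstimator()
states = torch.randn(64, 21)   # stand-in for logged robot states
forces = torch.randn(64, 3)    # stand-in for ground-truth force readings
print(rmse(model(states), forces).item())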
{"title":"Learning Based Estimation of Tool-Tissue Interaction Forces for Stationary and Moving Environments","authors":"L. Nowakowski;R. V. Patel","doi":"10.1109/LRA.2024.3488400","DOIUrl":"https://doi.org/10.1109/LRA.2024.3488400","url":null,"abstract":"Accurately estimating tool-tissue interaction forces during robotics-assisted minimally invasive surgery is an important aspect of enabling haptics-based teleoperation. By collecting data regarding the state of a robot in a variety of configurations, neural networks can be trained to predict this interaction force. This paper extends existing work in this domain based on collecting one of the largest known ground truth force datasets for stationary as well as moving phantoms that replicate tissue motions found in clinical procedures. Existing methods, and a new transformer-based architecture, are evaluated to demonstrate the domain gap between stationary and moving phantom tissue data and the impact that data scaling has on each architecture's ability to generalize the force estimation task. It was found that temporal networks were more sensitive to the moving domain than single-sample Feed Forward Networks (FFNs) that were trained on stationary tissue data. However, the transformer approach results in the lowest Root Mean Square Error (RMSE) when evaluating networks trained on examples of both stationary and moving phantom tissue samples. The results demonstrate the domain gap between stationary and moving surgical environments and the effectiveness of scaling datasets for increased accuracy of interaction force prediction.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"9 12","pages":"11266-11273"},"PeriodicalIF":4.6,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142598640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Multimodal Variational DeepMDP: An Efficient Approach for Industrial Assembly in High-Mix, Low-Volume Production
IF 4.6 CAS Zone 2 (Computer Science) Q2 ROBOTICS Pub Date: 2024-10-29 DOI: 10.1109/LRA.2024.3487490
Grzegorz Bartyzel
Transferability, along with sample efficiency, is a critical factor for a reinforcement learning (RL) agent's successful application in real-world contact-rich manipulation tasks, such as product assembly. For instance, in the case of the industrial insertion task on high-mix, low-volume (HMLV) production lines, transferability could eliminate the need for machine retooling, thus reducing production line downtimes. In our work, we introduce a method called Multimodal Variational DeepMDP (MVDeepMDP) that demonstrates the ability to generalize to various environmental variations not encountered during training. The key feature of our approach involves learning a multimodal latent dynamic representation. We demonstrate the effectiveness of our method in the context of an electronic parts insertion task, which is challenging for RL agents due to the diverse physical properties of the non-standardized components, as well as simple 3D printed blocks insertion. Furthermore, we evaluate the transferability of MVDeepMDP and analyze the impact of the balancing mechanism of the generalized Product-of-Experts (gPoE), which is used to combine observable modalities. Finally, we explore the influence of separately processing state modalities of different physical quantities, such as pose and 6D force/torque (F/T) data.
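The fusion step named in the abstract can be made concrete with a small sketch. The following Python snippet shows one standard way a generalized Product-of-Experts combines diagonal-Gaussian latents from several modalities; it is an assumed formulation for illustration, not code from the paper, and the example modalities and weights are invented.

import numpy as np

def gpoe_fuse(means, variances, alphas):
    # Generalized Product-of-Experts for diagonal Gaussians:
    # fused precision = sum_i alpha_i / var_i,
    # fused mean = fused_var * sum_i alpha_i * mu_i / var_i.
    precision = sum(a / v for a, v in zip(alphas, variances))
    fused_var = 1.0 / precision
    fused_mean = fused_var * sum(a * m / v for a, m, v in zip(alphas, means, variances))
    return fused_mean, fused_var

# Two illustrative modalities, e.g. pose-derived and force/torque-derived latents.
mu_pose, var_pose = np.zeros(8), np.full(8, 0.5)
mu_ft, var_ft = np.ones(8), np.full(8, 2.0)
mean, var = gpoe_fuse([mu_pose, mu_ft], [var_pose, var_ft], [0.7, 0.3])
print(mean, var)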
{"title":"Multimodal Variational DeepMDP: An Efficient Approach for Industrial Assembly in High-Mix, Low-Volume Production","authors":"Grzegorz Bartyzel","doi":"10.1109/LRA.2024.3487490","DOIUrl":"https://doi.org/10.1109/LRA.2024.3487490","url":null,"abstract":"Transferability, along with sample efficiency, is a critical factor for a reinforcement learning (RL) agent's successful application in real-world contact-rich manipulation tasks, such as product assembly. For instance, in the case of the industrial insertion task on high-mix, low-volume (HMLV) production lines, transferability could eliminate the need for machine retooling, thus reducing production line downtimes. In our work, we introduce a method called Multimodal Variational DeepMDP (MVDeepMDP) that demonstrates the ability to generalize to various environmental variations not encountered during training. The key feature of our approach involves learning a multimodal latent dynamic representation. We demonstrate the effectiveness of our method in the context of an electronic parts insertion task, which is challenging for RL agents due to the diverse physical properties of the non-standardized components, as well as simple 3D printed blocks insertion. Furthermore, we evaluate the transferability of MVDeepMDP and analyze the impact of the balancing mechanism of the \u0000<italic>generalized Product-of-Experts</i>\u0000 (gPoE), which is used to combine observable modalities. Finally, we explore the influence of separately processing state modalities of different physical quantities, such as pose and 6D force/torque (F/T) data.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"9 12","pages":"11297-11304"},"PeriodicalIF":4.6,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142645555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Safe and Efficient Multi-Agent Collision Avoidance With Physics-Informed Reinforcement Learning
IF 4.6 CAS Zone 2 (Computer Science) Q2 ROBOTICS Pub Date: 2024-10-29 DOI: 10.1109/LRA.2024.3487491
Pu Feng;Rongye Shi;Size Wang;Junkang Liang;Xin Yu;Simin Li;Wenjun Wu
Reinforcement learning (RL) has shown great promise in addressing multi-agent collision avoidance challenges. However, existing RL-based methods often suffer from low training efficiency and poor action safety. To tackle these issues, we introduce a physics-informed reinforcement learning framework equipped with two modules: a Potential Field (PF) module and a Multi-Agent Multi-Level Safety (MAMLS) module. The PF module uses the Artificial Potential Field method to compute a regularization loss, adaptively integrating it into the critic's loss to enhance training efficiency. The MAMLS module formulates action safety as a constrained optimization problem, deriving safe actions by solving this optimization. Furthermore, to better address the characteristics of multi-agent collision avoidance tasks, multi-agent multi-level constraints are introduced. The results of simulations and real-world experiments showed that our physics-informed framework offers a significant improvement in terms of both the efficiency of training and safety-related metrics over advanced baseline methods.
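To make the physics-informed regularization concrete, the sketch below shows a textbook Artificial Potential Field value being added to a critic loss as a penalty term; the gains, influence radius, and regularization weight are illustrative assumptions rather than values from the paper.

import numpy as np

def apf_potential(pos, goal, obstacles, k_att=1.0, k_rep=0.5, d0=1.0):
    # Attractive term pulls toward the goal; repulsive terms push away from
    # obstacles that lie within the influence radius d0.
    u = 0.5 * k_att * np.sum((pos - goal) ** 2)
    for obs in obstacles:
        d = np.linalg.norm(pos - obs)
        if 1e-6 < d < d0:
            u += 0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2
    return u

pos = np.array([0.0, 0.0])
goal = np.array([2.0, 1.0])
obstacles = [np.array([1.0, 0.4])]
critic_loss = 0.12                     # stand-in TD loss for one agent
reg_weight = 0.01                      # assumed regularization coefficient
total_loss = critic_loss + reg_weight * apf_potential(pos, goal, obstacles)
print(total_loss)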
{"title":"Safe and Efficient Multi-Agent Collision Avoidance With Physics-Informed Reinforcement Learning","authors":"Pu Feng;Rongye Shi;Size Wang;Junkang Liang;Xin Yu;Simin Li;Wenjun Wu","doi":"10.1109/LRA.2024.3487491","DOIUrl":"https://doi.org/10.1109/LRA.2024.3487491","url":null,"abstract":"Reinforcement learning (RL) has shown great promise in addressing multi-agent collision avoidance challenges. However, existing RL-based methods often suffer from low training efficiency and poor action safety. To tackle these issues, we introduce a physics-informed reinforcement learning framework equipped with two modules: a Potential Field (PF) module and a Multi-Agent Multi-Level Safety (MAMLS) module. The PF module uses the Artificial Potential Field method to compute a regularization loss, adaptively integrating it into the critic's loss to enhance training efficiency. The MAMLS module formulates action safety as a constrained optimization problem, deriving safe actions by solving this optimization. Furthermore, to better address the characteristics of multi-agent collision avoidance tasks, multi-agent multi-level constraints are introduced. The results of simulations and real-world experiments showed that our physics-informed framework offers a significant improvement in terms of both the efficiency of training and safety-related metrics over advanced baseline methods.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"9 12","pages":"11138-11145"},"PeriodicalIF":4.6,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142598643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
D2S: Representing Sparse Descriptors and 3D Coordinates for Camera Relocalization
IF 4.6 CAS Zone 2 (Computer Science) Q2 ROBOTICS Pub Date: 2024-10-29 DOI: 10.1109/LRA.2024.3487503
Bach-Thuan Bui;Huy-Hoang Bui;Dinh-Tuan Tran;Joo-Ho Lee
State-of-the-art visual localization methods mostly rely on complex procedures to match local descriptors and 3D point clouds. However, these procedures can incur significant costs in terms of inference, storage, and updates over time. In this study, we propose a direct learning-based approach that utilizes a simple network named D2S to represent complex local descriptors and their scene coordinates. Our method is characterized by its simplicity and cost-effectiveness. It solely leverages a single RGB image for localization during the testing phase and only requires a lightweight model to encode a complex sparse scene. The proposed D2S employs a combination of a simple loss function and graph attention to selectively focus on robust descriptors while disregarding areas such as clouds, trees, and several dynamic objects. This selective attention enables D2S to effectively perform a binary-semantic classification for sparse descriptors. Additionally, we propose a simple outdoor dataset to evaluate the capabilities of visual localization methods in scene-specific generalization and self-updating from unlabeled observations. Our approach outperforms the previous regression-based methods in both indoor and outdoor environments. It demonstrates the ability to generalize beyond training data, including scenarios involving transitions from day to night and adapting to domain shifts.
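A stripped-down version of the idea, regressing a scene coordinate and a robustness score directly from each sparse descriptor, can be sketched as follows. This omits the graph attention described in the abstract, and the descriptor dimension and layer sizes are assumptions, so it should not be read as the released D2S model.

import torch
import torch.nn as nn

class DescriptorToScene(nn.Module):
    # Illustrative head: per-descriptor 3D scene coordinate plus a logit for the
    # binary robust / unreliable classification mentioned in the abstract.
    def __init__(self, desc_dim=256, hidden=512):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(desc_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.coord_head = nn.Linear(hidden, 3)
        self.robust_head = nn.Linear(hidden, 1)

    def forward(self, descriptors):
        h = self.backbone(descriptors)
        return self.coord_head(h), self.robust_head(h)

net = DescriptorToScene()
descs = torch.randn(500, 256)          # stand-in sparse local descriptors
coords, robustness = net(descs)
print(coords.shape, robustness.shape)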
{"title":"D2S: Representing Sparse Descriptors and 3D Coordinates for Camera Relocalization","authors":"Bach-Thuan Bui;Huy-Hoang Bui;Dinh-Tuan Tran;Joo-Ho Lee","doi":"10.1109/LRA.2024.3487503","DOIUrl":"https://doi.org/10.1109/LRA.2024.3487503","url":null,"abstract":"State-of-the-art visual localization methods mostly rely on complex procedures to match local descriptors and 3D point clouds. However, these procedures can incur significant costs in terms of inference, storage, and updates over time. In this study, we propose a direct learning-based approach that utilizes a simple network named D2S to represent complex local descriptors and their scene coordinates. Our method is characterized by its simplicity and cost-effectiveness. It solely leverages a single RGB image for localization during the testing phase and only requires a lightweight model to encode a complex sparse scene. The proposed D2S employs a combination of a simple loss function and graph attention to selectively focus on robust descriptors while disregarding areas such as clouds, trees, and several dynamic objects. This selective attention enables D2S to effectively perform a binary-semantic classification for sparse descriptors. Additionally, we propose a simple outdoor dataset to evaluate the capabilities of visual localization methods in scene-specific generalization and self-updating from unlabeled observations. Our approach outperforms the previous regression-based methods in both indoor and outdoor environments. It demonstrates the ability to generalize beyond training data, including scenarios involving transitions from day to night and adapting to domain shifts.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"9 12","pages":"11449-11456"},"PeriodicalIF":4.6,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142636536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
The Art of Imitation: Learning Long-Horizon Manipulation Tasks From Few Demonstrations
IF 4.6 CAS Zone 2 (Computer Science) Q2 ROBOTICS Pub Date: 2024-10-29 DOI: 10.1109/LRA.2024.3487506
Jan Ole von Hartz;Tim Welschehold;Abhinav Valada;Joschka Boedecker
Task Parametrized Gaussian Mixture Models (TP-GMM) are a sample-efficient method for learning object-centric robot manipulation tasks. However, there are several open challenges to applying TP-GMMs in the wild. In this work, we tackle three crucial challenges synergistically. First, end-effector velocities are non-Euclidean and thus hard to model using standard GMMs. We thus propose to factorize the robot's end-effector velocity into its direction and magnitude, and model them using Riemannian GMMs. Second, we leverage the factorized velocities to segment and sequence skills from complex demonstration trajectories. Through the segmentation, we further align skill trajectories and hence leverage time as a powerful inductive bias. Third, we present a method to automatically detect relevant task parameters per skill from visual observations. Our approach enables learning complex manipulation tasks from just five demonstrations while using only RGB-D observations. Extensive experimental evaluations on RLBench demonstrate that our approach achieves state-of-the-art performance with 20-fold improved sample efficiency. Our policies generalize across different environments, object instances, and object positions, while the learned skills are reusable.
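The velocity factorization is simple enough to state directly: the Cartesian end-effector velocity is split into a unit direction (a point on the sphere, which suits a Riemannian GMM) and a scalar magnitude. The snippet below is a minimal illustration of that split, not the authors' code.

import numpy as np

def factorize_velocity(v, eps=1e-8):
    magnitude = np.linalg.norm(v)
    direction = v / (magnitude + eps)   # unit vector on the sphere S^2
    return direction, magnitude

v = np.array([0.02, -0.01, 0.03])       # stand-in end-effector velocity [m/s]
direction, magnitude = factorize_velocity(v)
print(direction, magnitude)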
{"title":"The Art of Imitation: Learning Long-Horizon Manipulation Tasks From Few Demonstrations","authors":"Jan Ole von Hartz;Tim Welschehold;Abhinav Valada;Joschka Boedecker","doi":"10.1109/LRA.2024.3487506","DOIUrl":"https://doi.org/10.1109/LRA.2024.3487506","url":null,"abstract":"Task Parametrized Gaussian Mixture Models (TP-GMM) are a sample-efficient method for learning object-centric robot manipulation tasks. However, there are several open challenges to applying TP-GMMs in the wild. In this work, we tackle three crucial challenges synergistically. First, end-effector velocities are non-Euclidean and thus hard to model using standard GMMs. We thus propose to factorize the robot's end-effector velocity into its direction and magnitude, and model them using Riemannian GMMs. Second, we leverage the factorized velocities to segment and sequence skills from complex demonstration trajectories. Through the segmentation, we further align skill trajectories and hence leverage time as a powerful inductive bias. Third, we present a method to automatically detect relevant task parameters \u0000<italic>per</i>\u0000 skill from visual observations. Our approach enables learning complex manipulation tasks from just five demonstrations while using only RGB-D observations. Extensive experimental evaluations on RLBench demonstrate that our approach achieves state-of-the-art performance with 20-fold improved sample efficiency. Our policies generalize across different environments, object instances, and object positions, while the learned skills are reusable.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"9 12","pages":"11369-11376"},"PeriodicalIF":4.6,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142636412","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
MiniTac: An Ultra-Compact 8 mm Vision-Based Tactile Sensor for Enhanced Palpation in Robot-Assisted Minimally Invasive Surgery
IF 4.6 CAS Zone 2 (Computer Science) Q2 ROBOTICS Pub Date: 2024-10-29 DOI: 10.1109/LRA.2024.3487516
Wanlin Li;Zihang Zhao;Leiyao Cui;Weiyi Zhang;Hangxin Liu;Li-An Li;Yixin Zhu
Robot-assisted minimally invasive surgery (RAMIS) provides substantial benefits over traditional open and laparoscopic methods. However, a significant limitation of robot-assisted minimally invasive surgery (RAMIS) is the surgeon's inability to palpate tissues, a crucial technique for examining tissue properties and detecting abnormalities, restricting the widespread adoption of RAMIS. To overcome this obstacle, we introduce MiniTac, a novel vision-based tactile sensor with an ultra-compact cross-sectional diameter of 8mm, designed for seamless integration into mainstream RAMIS devices, particularly the Da Vinci surgical systems. MiniTac features a novel mechanoresponsive photonic elastomer membrane that changes color distribution under varying contact pressures. This color change is captured by an embedded miniature camera, allowing MiniTac to detect tumors both on the tissue surface and in deeper layers typically obscured from endoscopic view. MiniTac's efficacy has been rigorously tested on both phantoms and ex-vivo tissues. By leveraging advanced mechanoresponsive photonic materials, MiniTac represents a significant advancement in integrating tactile sensing into RAMIS, potentially expanding its applicability to a wider array of clinical scenarios that currently rely on traditional surgical approaches.
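As a loose illustration of how a colour-changing membrane can be read out (generic image processing, not the MiniTac pipeline), the sketch below estimates a contact map from the per-pixel colour difference between an unloaded reference frame and the current frame of the internal camera; the frame size and threshold are assumptions.

import numpy as np

def contact_map(reference_rgb, current_rgb, threshold=20.0):
    # Per-pixel colour change between reference and current frames.
    diff = np.linalg.norm(current_rgb.astype(float) - reference_rgb.astype(float), axis=-1)
    return diff > threshold             # boolean mask of likely contact pixels

ref = np.random.randint(0, 256, (120, 120, 3), dtype=np.uint8)   # stand-in frames
cur = ref.astype(int)
cur[40:60, 40:60] += 40                 # simulated local colour shift under pressure
cur = np.clip(cur, 0, 255).astype(np.uint8)
print(contact_map(ref, cur).sum())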
{"title":"MiniTac: An Ultra-Compact $text{8 mm}$ Vision-Based Tactile Sensor for Enhanced Palpation in Robot-Assisted Minimally Invasive Surgery","authors":"Wanlin Li;Zihang Zhao;Leiyao Cui;Weiyi Zhang;Hangxin Liu;Li-An Li;Yixin Zhu","doi":"10.1109/LRA.2024.3487516","DOIUrl":"https://doi.org/10.1109/LRA.2024.3487516","url":null,"abstract":"Robot-assisted minimally invasive surgery (RAMIS) provides substantial benefits over traditional open and laparoscopic methods. However, a significant limitation of robot-assisted minimally invasive surgery (RAMIS) is the surgeon's inability to palpate tissues, a crucial technique for examining tissue properties and detecting abnormalities, restricting the widespread adoption of RAMIS. To overcome this obstacle, we introduce MiniTac, a novel vision-based tactile sensor with an ultra-compact cross-sectional diameter of 8mm, designed for seamless integration into mainstream RAMIS devices, particularly the Da Vinci surgical systems. MiniTac features a novel mechanoresponsive photonic elastomer membrane that changes color distribution under varying contact pressures. This color change is captured by an embedded miniature camera, allowing MiniTac to detect tumors both on the tissue surface and in deeper layers typically obscured from endoscopic view. MiniTac's efficacy has been rigorously tested on both phantoms and ex-vivo tissues. By leveraging advanced mechanoresponsive photonic materials, MiniTac represents a significant advancement in integrating tactile sensing into RAMIS, potentially expanding its applicability to a wider array of clinical scenarios that currently rely on traditional surgical approaches.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"9 12","pages":"11170-11177"},"PeriodicalIF":4.6,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142598672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Learning Based Exteroception of Soft Underwater Manipulator With Soft Actuator Network
IF 4.6 CAS Zone 2 (Computer Science) Q2 ROBOTICS Pub Date: 2024-10-29 DOI: 10.1109/LRA.2024.3487512
Kailuan Tang;Shaowu Tang;Chenghua Lu;Shijian Wu;Sicong Liu;Juan Yi;Jian S. Dai;Zheng Wang
Interactions with environmental objects can induce substantial alterations in both exteroceptive and proprioceptive signals. However, the deployment of exteroceptive sensors within underwater soft manipulators encounters numerous challenges and constraints, thereby imposing limitations on their perception capabilities. In this article, we present a novel learning-based exteroceptive approach that utilizes internal proprioceptive signals and harnesses the principles of soft actuator network (SAN). Deformation and vibration resulting from external collisions tend to propagate through the SANs in underwater soft manipulators and can be detected by proprioceptive sensors. We extract features from the sensor signals and develop a fully-connected neural network (FCNN)-based classifier to determine collision positions. We have constructed a training dataset and an independent validation dataset for the purpose of training and validating the classifier. The experimental results affirm that the proposed method can identify collision locations with an accuracy level of 97.11% using the independent validation dataset, which exhibits potential applications within the domain of underwater soft robotics perception and control.
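A minimal sketch of the signal-to-classifier path described above might look like the following; the choice of window statistics, the six sensor channels, and the eight collision-location classes are illustrative assumptions, not the paper's actual feature set.

import numpy as np
import torch
import torch.nn as nn

def window_features(signal):
    # signal: (T, C) window of proprioceptive channels; simple per-channel
    # statistics plus the peak FFT magnitude serve as stand-in features.
    return np.concatenate([
        signal.mean(axis=0),
        signal.std(axis=0),
        np.abs(np.fft.rfft(signal, axis=0)).max(axis=0),
    ])

classifier = nn.Sequential(
    nn.Linear(3 * 6, 64), nn.ReLU(),     # 6 assumed channels, 3 features each
    nn.Linear(64, 8),                    # 8 assumed collision-location classes
)

window = np.random.randn(200, 6)          # stand-in sensor window
feats = torch.tensor(window_features(window), dtype=torch.float32)
logits = classifier(feats)
print(int(logits.argmax()))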
{"title":"Learning Based Exteroception of Soft Underwater Manipulator With Soft Actuator Network","authors":"Kailuan Tang;Shaowu Tang;Chenghua Lu;Shijian Wu;Sicong Liu;Juan Yi;Jian S. Dai;Zheng Wang","doi":"10.1109/LRA.2024.3487512","DOIUrl":"https://doi.org/10.1109/LRA.2024.3487512","url":null,"abstract":"Interactions with environmental objects can induce substantial alterations in both exteroceptive and proprioceptive signals. However, the deployment of exteroceptive sensors within underwater soft manipulators encounters numerous challenges and constraints, thereby imposing limitations on their perception capabilities. In this article, we present a novel learning-based exteroceptive approach that utilizes internal proprioceptive signals and harnesses the principles of soft actuator network (SAN). Deformation and vibration resulting from external collisions tend to propagate through the SANs in underwater soft manipulators and can be detected by proprioceptive sensors. We extract features from the sensor signals and develop a fully-connected neural network (FCNN)-based classifier to determine collision positions. We have constructed a training dataset and an independent validation dataset for the purpose of training and validating the classifier. The experimental results affirm that the proposed method can identify collision locations with an accuracy level of 97.11% using the independent validation dataset, which exhibits potential applications within the domain of underwater soft robotics perception and control.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"9 12","pages":"11082-11089"},"PeriodicalIF":4.6,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Open-Structure: Structural Benchmark Dataset for SLAM Algorithms
IF 4.6 CAS Zone 2 (Computer Science) Q2 ROBOTICS Pub Date: 2024-10-29 DOI: 10.1109/LRA.2024.3487071
Yanyan Li;Zhao Guo;Ze Yang;Yanbiao Sun;Liang Zhao;Federico Tombari
This letter presents Open-Structure, a novel benchmark dataset for evaluating visual odometry and SLAM methods. Compared to existing public datasets that primarily offer raw images, Open-Structure provides direct access to point and line measurements, correspondences, structural associations, and co-visibility factor graphs, which can be fed to various stages of SLAM pipelines to mitigate the impact of data preprocessing modules in ablation experiments. The dataset comprises two distinct types of sequences from the perspective of scenarios. The first type maintains reasonable observation and occlusion relationships, as these critical elements are extracted from public image-based sequences using our dataset generator. In contrast, the second type consists of carefully designed simulation sequences that enhance dataset diversity by introducing a wide range of trajectories and observations. Furthermore, a baseline is proposed using our dataset to evaluate widely used modules, including camera pose tracking, parametrization, and factor graph optimization, within SLAM systems. By evaluating these state-of-the-art algorithms across different scenarios, we discern each module's strengths and weaknesses in the context of camera tracking and optimization processes.
{"title":"Open-Structure: Structural Benchmark Dataset for SLAM Algorithms","authors":"Yanyan Li;Zhao Guo;Ze Yang;Yanbiao Sun;Liang Zhao;Federico Tombari","doi":"10.1109/LRA.2024.3487071","DOIUrl":"https://doi.org/10.1109/LRA.2024.3487071","url":null,"abstract":"This letter presents Open-Structure, a novel benchmark dataset for evaluating visual odometry and SLAM methods. Compared to existing public datasets that primarily offer raw images, Open-Structure provides direct access to point and line measurements, correspondences, structural associations, and co-visibility factor graphs, which can be fed to various stages of SLAM pipelines to mitigate the impact of data preprocessing modules in ablation experiments. The dataset comprises two distinct types of sequences from the perspective of scenarios. The first type maintains reasonable observation and occlusion relationships, as these critical elements are extracted from public image-based sequences using our dataset generator. In contrast, the second type consists of carefully designed simulation sequences that enhance dataset diversity by introducing a wide range of trajectories and observations. Furthermore, a baseline is proposed using our dataset to evaluate widely used modules, including camera pose tracking, parametrization, and factor graph optimization, within SLAM systems. By evaluating these state-of-the-art algorithms across different scenarios, we discern each module's strengths and weaknesses in the context of camera tracking and optimization processes.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"9 12","pages":"11457-11464"},"PeriodicalIF":4.6,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142636538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Mitigating Catastrophic Forgetting in Robot Continual Learning: A Guided Policy Search Approach Enhanced With Memory-Aware Synapses
IF 4.6 CAS Zone 2 (Computer Science) Q2 ROBOTICS Pub Date: 2024-10-29 DOI: 10.1109/LRA.2024.3487484
Qingwei Dong;Peng Zeng;Yunpeng He;Guangxi Wan;Xiaoting Dong
Complex operational scenarios increasingly demand that industrial robots sequentially resolve multiple interrelated problems to accomplish complex operational tasks, necessitating robots to have the capacity for not only learning through interaction with the environment but also for continual learning. Current deep reinforcement learning methods have demonstrated substantial prowess in enabling robots to learn individual simple operational skills. However, catastrophic forgetting regarding the continual learning of various distinct tasks under a unified control policy remains a challenge. The lengthy sequential decision-making trajectory in reinforcement learning scenarios results in a massive state-action search space for the agent. Moreover, low-value state-action samples exacerbate the difficulty of continuous learning in reinforcement learning problems. In this letter, we propose a Continual Reinforcement Learning (CRL) method that accommodates the incremental multiskill learning demands of robots. We transform the tightly coupled structure in Guided Policy Search (GPS) algorithms, which closely intertwine local and global policies, into a loosely coupled structure. This revised structure updates the global policy only after the local policy for a specific task has converged, enabling online learning. In incrementally learning new tasks, the global policy is updated using hard parameter sharing and Memory Aware Synapses (MAS), creating task-specific layers while penalizing significant parameter changes in shared layers linked to prior tasks. This method reduces overfitting and mitigates catastrophic forgetting in robotic CRL. We validate our method on PR2, UR5 and Sawyer robots in simulators as well as on a real UR5 robot.
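The Memory Aware Synapses component can be sketched independently of the GPS machinery: parameter importance is accumulated from gradients of the squared output norm on earlier-task data, and a quadratic penalty discourages moving important shared parameters. The snippet below is an assumed, generic MAS formulation for illustration, not the authors' implementation.

import torch
import torch.nn as nn

def mas_importance(model, inputs):
    # Accumulate per-parameter importance Omega from the sensitivity of the
    # squared output norm, averaged over earlier-task inputs.
    omega = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for x in inputs:
        model.zero_grad()
        model(x).pow(2).sum().backward()
        for n, p in model.named_parameters():
            omega[n] += p.grad.abs() / len(inputs)
    return omega

def mas_penalty(model, omega, old_params, lam=1.0):
    # Quadratic penalty on deviations from the parameters stored after the old task.
    loss = 0.0
    for n, p in model.named_parameters():
        loss = loss + (omega[n] * (p - old_params[n]) ** 2).sum()
    return lam * loss

model = nn.Sequential(nn.Linear(4, 16), nn.Tanh(), nn.Linear(16, 2))
inputs = [torch.randn(4) for _ in range(8)]       # stand-in states from an earlier task
omega = mas_importance(model, inputs)
old = {n: p.detach().clone() for n, p in model.named_parameters()}
print(mas_penalty(model, omega, old).item())      # zero until the new task moves parameters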
{"title":"Mitigating Catastrophic Forgetting in Robot Continual Learning: A Guided Policy Search Approach Enhanced With Memory-Aware Synapses","authors":"Qingwei Dong;Peng Zeng;Yunpeng He;Guangxi Wan;Xiaoting Dong","doi":"10.1109/LRA.2024.3487484","DOIUrl":"https://doi.org/10.1109/LRA.2024.3487484","url":null,"abstract":"Complex operational scenarios increasingly demand that industrial robots sequentially resolve multiple interrelated problems to accomplish complex operational tasks, necessitating robots to have the capacity for not only learning through interaction with the environment but also for continual learning. Current deep reinforcement learning methods have demonstrated substantial prowess in enabling robots to learn individual simple operational skills. However, catastrophic forgetting regarding the continual learning of various distinct tasks under a unified control policy remains a challenge. The lengthy sequential decision-making trajectory in reinforcement learning scenarios results in a massive state-action search space for the agent. Moreover, low-value state-action samples exacerbate the difficulty of continuous learning in reinforcement learning problems. In this letter, we propose a Continual Reinforcement Learning (CRL) method that accommodates the incremental multiskill learning demands of robots. We transform the tightly coupled structure in Guided Policy Search (GPS) algorithms, which closely intertwine local and global policies, into a loosely coupled structure. This revised structure updates the global policy only after the local policy for a specific task has converged, enabling online learning. In incrementally learning new tasks, the global policy is updated using hard parameter sharing and Memory Aware Synapses (MAS), creating task-specific layers while penalizing significant parameter changes in shared layers linked to prior tasks. This method reduces overfitting and mitigates catastrophic forgetting in robotic CRL. We validate our method on PR2, UR5 and Sawyer robots in simulators as well as on a real UR5 robot.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"9 12","pages":"11242-11249"},"PeriodicalIF":4.6,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142598646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Depth-Visual-Inertial (DVI) Mapping System for Robust Indoor 3D Reconstruction
IF 4.6 CAS Zone 2 (Computer Science) Q2 ROBOTICS Pub Date: 2024-10-29 DOI: 10.1109/LRA.2024.3487496
Charles Hamesse;Michiel Vlaminck;Hiep Luong;Rob Haelterman
We propose the Depth-Visual-Inertial (DVI) mapping system: a robust multi-sensor fusion framework for dense 3D mapping using time-of-flight cameras equipped with RGB and IMU sensors. Inspired by recent developments in real-time LiDAR-based odometry and mapping, our system uses an error-state iterative Kalman filter for state estimation: it processes the inertial sensor's data for state propagation, followed by a state update first using visual-inertial odometry, then depth-based odometry. This sensor fusion scheme makes our system robust to degenerate scenarios (e.g. lack of visual or geometrical features, fast rotations) and to noisy sensor data, like those that can be obtained with off-the-shelf time-of-flight DVI sensors. For evaluation, we propose the new Bunker DVI Dataset, featuring data from multiple DVI sensors recorded in challenging conditions reflecting search-and-rescue operations. We show the superior robustness and precision of our method against previous work. Following the open science principle, we make both our source code and dataset publicly available.
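A heavily simplified view of the filtering loop, IMU-style propagation followed by two sequential measurement updates (first from visual-inertial odometry, then from depth-based odometry), is sketched below. It is a 1-D constant-velocity toy example with invented noise values; the actual system uses an error-state iterative Kalman filter on the full pose.

import numpy as np

dt = 0.01
F = np.array([[1.0, dt], [0.0, 1.0]])       # constant-velocity model (1-D for brevity)
Q = np.diag([1e-5, 1e-3])                    # assumed process noise
H = np.array([[1.0, 0.0]])                   # both odometry sources observe position

def predict(x, P, accel):
    x = F @ x + np.array([0.5 * accel * dt**2, accel * dt])
    return x, F @ P @ F.T + Q

def update(x, P, z, r):
    S = H @ P @ H.T + r
    K = P @ H.T @ np.linalg.inv(S)
    x = x + (K @ (np.atleast_1d(z) - H @ x)).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P

x, P = np.zeros(2), np.eye(2)
x, P = predict(x, P, accel=0.2)              # inertial propagation step
x, P = update(x, P, z=0.001, r=1e-2)         # visual-inertial odometry update
x, P = update(x, P, z=0.002, r=1e-3)         # depth-based odometry update
print(x)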
{"title":"Depth-Visual-Inertial (DVI) Mapping System for Robust Indoor 3D Reconstruction","authors":"Charles Hamesse;Michiel Vlaminck;Hiep Luong;Rob Haelterman","doi":"10.1109/LRA.2024.3487496","DOIUrl":"https://doi.org/10.1109/LRA.2024.3487496","url":null,"abstract":"We propose the \u0000<underline>D</u>\u0000epth-\u0000<underline>V</u>\u0000isual-\u0000<underline>I</u>\u0000nertial (DVI) mapping system: a robust multi-sensor fusion framework for dense 3D mapping using time-of-flight cameras equipped with RGB and IMU sensors. Inspired by recent developments in real-time LiDAR-based odometry and mapping, our system uses an error-state iterative Kalman filter for state estimation: it processes the inertial sensor's data for state propagation, followed by a state update first using visual-inertial odometry, then depth-based odometry. This sensor fusion scheme makes our system robust to degenerate scenarios (e.g. lack of visual or geometrical features, fast rotations) and to noisy sensor data, like those that can be obtained with off-the-shelf time-of-flight DVI sensors. For evaluation, we propose the new Bunker DVI Dataset, featuring data from multiple DVI sensors recorded in challenging conditions reflecting search-and-rescue operations. We show the superior robustness and precision of our method against previous work. Following the open science principle, we make both our source code and dataset publicly available.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"9 12","pages":"11313-11320"},"PeriodicalIF":4.6,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0