Pub Date: 2026-02-09. DOI: 10.1109/LRA.2026.3662977
Mikihisa Yuasa;Ramavarapu S. Sreenivas;Huy T. Tran
Learning-based policies have demonstrated success in many robotic applications, but often lack explainability. We propose a neuro-symbolic explanation framework that generates a weighted signal temporal logic (wSTL) specification which describes a robot policy in a human-interpretable form. Existing methods typically produce explanations that are verbose and inconsistent, which hinders explainability, and are loose, which limits meaningful insights. We address these issues by introducing a simplification process consisting of predicate filtering, regularization, and iterative pruning. We also introduce three explainability metrics—conciseness, consistency, and strictness—to assess explanation quality beyond conventional classification accuracy. Our method—TLNet—is validated in three simulated robotic environments, where it outperforms baselines in generating concise, consistent, and strict wSTL explanations without sacrificing accuracy. This work bridges policy learning and explainability through formal methods, contributing to more transparent decision-making in robotics.
{"title":"Neuro-Symbolic Generation of Explanations for Robot Policies With Weighted Signal Temporal Logic","authors":"Mikihisa Yuasa;Ramavarapu S. Sreenivas;Huy T. Tran","doi":"10.1109/LRA.2026.3662977","DOIUrl":"https://doi.org/10.1109/LRA.2026.3662977","url":null,"abstract":"Learning-based policies have demonstrated success in many robotic applications, but often lack explainability. We propose a neuro-symbolic explanation framework that generates a weighted signal temporal logic (wSTL) specification which describes a robot policy in a human-interpretable form. Existing methods typically produce explanations that are verbose and inconsistent, which hinders explainability, and are loose, which limits meaningful insights. We address these issues by introducing a simplification process consisting of predicate filtering, regularization, and iterative pruning. We also introduce three explainability metrics—conciseness, consistency, and strictness—to assess explanation quality beyond conventional classification accuracy. Our method—<sc>TLNet</small>—is validated in three simulated robotic environments, where it outperforms baselines in generating concise, consistent, and strict wSTL explanations without sacrificing accuracy. This work bridges policy learning and explainability through formal methods, contributing to more transparent decision-making in robotics.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 4","pages":"3963-3970"},"PeriodicalIF":5.3,"publicationDate":"2026-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11386893","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146216611","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Our aim is to learn to solve long-horizon decision-making problems in complex robotics domains given low-level skills and a handful of demonstrations containing sequences of images. To this end, we focus on learning abstract symbolic world models that facilitate zero-shot generalization to novel goals via planning. A critical component of such models is the set of symbolic predicates that define properties of and relationships between objects. In this work, we leverage pretrained vision-language models (VLMs) to propose a large set of visual predicates potentially relevant for decision-making, and to evaluate those predicates directly from camera images. At training time, we pass the proposed predicates and demonstrations into an optimization-based model-learning algorithm to obtain an abstract symbolic world model that is defined in terms of a compact subset of the proposed predicates. At test time, given a novel goal in a novel setting, we use the VLM to construct a symbolic description of the current world state, and then use a search-based planning algorithm to find a sequence of low-level skills that achieves the goal. We demonstrate empirically across experiments in both simulation and the real world that our method can generalize aggressively, applying its learned world model to solve problems with varying visual backgrounds, types, numbers, and arrangements of objects, as well as novel goals and much longer horizons than those seen at training time.
{"title":"From Pixels to Predicates: Learning Symbolic World Models via Pretrained VLMs","authors":"Ashay Athalye;Nishanth Kumar;Tom Silver;Yichao Liang;Jiuguang Wang;Tomás Lozano-Pérez;Leslie Pack Kaelbling","doi":"10.1109/LRA.2026.3662533","DOIUrl":"https://doi.org/10.1109/LRA.2026.3662533","url":null,"abstract":"Our aim is to learn to solve long-horizon decision-making problems in complex robotics domains given low-level skills and a handful of demonstrations containing sequences of images. To this end, we focus on learning abstract symbolic world models that facilitate zero-shot generalization to novel goals via planning. A critical component of such models is the set of symbolic <italic>predicates</i> that define properties of and relationships between objects. In this work, we leverage pretrained vision-language models (VLMs) to propose a large set of visual predicates potentially relevant for decision-making, and to evaluate those predicates directly from camera images. At training time, we pass the proposed predicates and demonstrations into an optimization-based model-learning algorithm to obtain an abstract symbolic world model that is defined in terms of a compact subset of the proposed predicates. At test time, given a novel goal in a novel setting, we use the VLM to construct a symbolic description of the current world state, and then use a search-based planning algorithm to find a sequence of low-level skills that achieves the goal. We demonstrate empirically across experiments in both simulation and the real world that our method can generalize aggressively, applying its learned world model to solve problems with varying visual backgrounds, types, numbers, and arrangements of objects, as well as novel goals and much longer horizons than those seen at training time.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 4","pages":"4002-4009"},"PeriodicalIF":5.3,"publicationDate":"2026-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146216528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-09. DOI: 10.1109/LRA.2026.3662646
Yao Huang;Li Liu;Jian Sun;Bo Song
Disrupted hand motor functions may be restored through exoskeleton-assisted rehabilitation training. However, the variability of soft tissue within human joints and across individuals, together with the difficulty of developing an exoskeleton that combines human–machine motion compatibility with dynamic compliance, poses persistent challenges. We introduce a hybrid single-motor-driven rigid–soft exoskeleton for the index finger to assist in rehabilitation training. A rigid parallel mechanism directly drives the soft component of the metacarpophalangeal (MCP) joint. In addition, we adopt an interlocking mechanism to induce deformation in leaf springs, enabling coordinated flexion and extension of multiple joints. A motion analysis based on the modified Denavit–Hartenberg convention confirms that the proposed parallel mechanism can compensate for misalignment displacement of the MCP joint. Based on the displacement and force applied to the soft component by the rigid parallel mechanism, kinematic and static analyses, along with dimensional optimization, are performed on a dual-segment parallel leaf spring. In tests, a prototype exoskeleton demonstrated Pearson correlation coefficients of 0.998, 0.991, and 0.986 for the MCP, proximal interphalangeal (PIP), and distal interphalangeal (DIP) joints, respectively. The corresponding joint flexion angles were 68.19°, 81.91°, and 41.64°. The exoskeleton self-aligns with the index finger joints, properly assisting the natural bending motion of the finger to meet the rehabilitation training needs of patients. The proposed exoskeleton can assist with a fingertip force of 6.2 N, satisfying grip requirements, while the reduced force on the dorsal surface of the index finger enhances comfort during use. The proposed solution is promising for the development of hand exoskeletons.
{"title":"Design and Analysis of Hybrid Rigid-Soft Self-Aligning Index Finger Exoskeleton","authors":"Yao Huang;Li Liu;Jian Sun;Bo Song","doi":"10.1109/LRA.2026.3662646","DOIUrl":"https://doi.org/10.1109/LRA.2026.3662646","url":null,"abstract":"Disrupted hand motor functions may be restored through exoskeleton-assisted rehabilitation training. However, the variability of soft tissue in human joints or across individuals and development of an exoskeleton that combines human-machine motion compatibility and dynamic compliance pose persistent challenges. We introduce a hybrid single-motor-driven rigid–soft exoskeleton for the index finger to assist in rehabilitation training. A rigid parallel mechanism directly drives the soft component of the metacarpophalangeal (MCP) joint. In addition, we adopt an interlocking mechanism to induce deformation in leaf springs, enabling the coordinated flexion and extension of multiple joints. A motion analysis based on the modified Denavit–Hartenberg convention confirms that the proposed parallel mechanism can compensate for the misalignment displacement of the MCP joint. Based on the displacement and force applied to the soft component by the designed rigid parallel mechanism, kinematic and static analyses along with dimensional optimization are performed on a dual-segment parallel leaf spring. A prototype exoskeleton undergoing tests demonstrated Pearson correlation coefficients of 0.998, 0.991, 0.986, for the MCP, proximal and distal interphalangeal (PIP/DIP) joints, respectively. The corresponding joint flexion angles were 68.19°, 81.91°, and 41.64°. The exoskeleton self-aligns with the index finger joints, properly assisting the natural bending motion of the finger to meet rehabilitation training needs of patients. The proposed exoskeleton can assist with a fingertip force of 6.2 N, thereby satisfying grip requirements, while the reduced force on the dorsal surface of the index finger enhances comfort during use. The proposed solution is promising for developing hand exoskeletons.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 4","pages":"4010-4017"},"PeriodicalIF":5.3,"publicationDate":"2026-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146216616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-09. DOI: 10.1109/LRA.2026.3662576
Sitan Li;Chao Liu;Koji Matsuno;Chien Chern Cheah
Recent advances in machine learning and deep learning have significantly enhanced robot control by improving object detection and visual feature extraction. However, ensuring theoretical guarantees of stability and convergence in learning-enabled control systems remains a major challenge. In this paper, we propose a vision-based control framework that integrates a deep learning-based oriented object detector with a Lyapunov-stable servo control law. The proposed method ensures provably stable convergence of the robot end-effector or its grasped object’s pose to a desired camera image region for both eye-in-hand and eye-to-hand configurations. Unlike existing deep learning-based visual servoing methods, which either lack formal stability guarantees or ignore object orientation control, our approach incorporates object orientation into the control loop through a region-based method using a quaternion representation and formally guarantees stability. We validated our framework on a 6-DoF UR5e manipulator performing cup insertion and centering tasks, demonstrating accurate and stable control in both camera setups.
{"title":"Stable Vision-Based Robot Kinematic Control With Deep Learning-Based Oriented Object Detector","authors":"Sitan Li;Chao Liu;Koji Matsuno;Chien Chern Cheah","doi":"10.1109/LRA.2026.3662576","DOIUrl":"https://doi.org/10.1109/LRA.2026.3662576","url":null,"abstract":"Recent advances in machine learning and deep learning have significantly enhanced robot control by improving object detection and visual feature extraction. However, ensuring theoretical guarantees of stability and convergence in learning-enabled control systems remains a major challenge. In this paper, we propose a vision-based control framework that integrates a deep learning oriented-object detector with a Lyapunov-stable servo control law. The proposed method ensures provably stable convergence of the robot end-effector or its grasped object’s pose to a desired camera image region for both eye-in-hand and eye-to-hand configurations. Unlike existing deep learning based visual servoing methods, which either lack formal stability guarantees or ignore object orientation control, our approach incorporates object orientation into the control loop through a region-based method using quaternion representation and formally guarantees stability. We validated our framework on a 6-DoF UR5e manipulator performing cup insertion and centering tasks, demonstrating accurate and stable control in both camera setups.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 3","pages":"3915-3922"},"PeriodicalIF":5.3,"publicationDate":"2026-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146223645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-09. DOI: 10.1109/LRA.2026.3662559
Patrick M. Amy;Brandon C. Fallin;Jhyv N. Philor;Warren E. Dixon
This work explores the indirect herding control problem for a single pursuer agent regulating a single target agent to a goal location. To accommodate the constraints of sensing hardware, an event-triggered inter-agent influence model between the pursuer agent and target agent is considered. Motivated by fielded sensing systems, we present an event-triggered controller and trigger mechanism that satisfies a user-selected minimum inter-event time. The combined pursuer-target system is presented as a switched system that alternates between stable and unstable modes. A dwell-time analysis is completed to develop a closed-form solution for the maximum time the pursuer agent can allow the target agent to evolve in the unstable mode before requiring a control input update. The presented trigger function is designed to produce inter-event times that are upper-bounded by the maximum dwell time. The effectiveness of the proposed approach is demonstrated through both simulated and experimental studies, where a pursuer agent successfully regulates a target agent to a desired goal location.
{"title":"Event-Triggered Indirect Herding Control of a Cooperative Agent","authors":"Patrick M. Amy;Brandon C. Fallin;Jhyv N. Philor;Warren E. Dixon","doi":"10.1109/LRA.2026.3662559","DOIUrl":"https://doi.org/10.1109/LRA.2026.3662559","url":null,"abstract":"This work explores the indirect herding control problem for a single pursuer agent regulating a single target agent to a goal location. To accommodate the constraints of sensing hardware, an event-triggered inter-agent influence model between the pursuer agent and target agent is considered. Motivated by fielded sensing systems, we present an event-triggered controller and trigger mechanism that satisfies a user-selected minimum inter-event time. The combined pursuer-target system is presented as a switched system that alternates between stable and unstable modes. A dwell-time analysis is completed to develop a closed-form solution for the maximum time the pursuer agent can allow the target agent to evolve in the unstable mode before requiring a control input update. The presented trigger function is designed to produce inter-event times that are upper-bounded by the maximum dwell time. The effectiveness of the proposed approach is demonstrated through both simulated and experimental studies, where a pursuer agent successfully regulates a target agent to a desired goal location.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 3","pages":"3828-3835"},"PeriodicalIF":5.3,"publicationDate":"2026-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146223659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-09. DOI: 10.1109/LRA.2026.3662586
Zilin Fang;Anxing Xiao;David Hsu;Gim Hee Lee
Navigating socially in human environments requires more than satisfying geometric constraints, as collision-free paths may still interfere with ongoing activities or conflict with social norms. Addressing this challenge calls for analyzing interactions between agents and incorporating common-sense reasoning into planning. This paper presents a social robot navigation framework that integrates geometric planning with contextual social reasoning. The system first extracts obstacles and human dynamics to generate geometrically feasible candidate paths, then leverages a fine-tuned vision-language model (VLM) to evaluate these paths, informed by contextually grounded social expectations, selecting a socially optimized path for the controller. This task-specific VLM distills social reasoning from large foundation models into a smaller and efficient model, allowing the framework to perform real-time adaptation in diverse human–robot interaction contexts. Experiments in four social navigation contexts demonstrate that our method achieves the best overall performance with the lowest personal space violation duration, the minimal pedestrian-facing time, and no social zone intrusions.
{"title":"From Obstacles to Etiquette: Robot Social Navigation With VLM-Informed Path Selection","authors":"Zilin Fang;Anxing Xiao;David Hsu;Gim Hee Lee","doi":"10.1109/LRA.2026.3662586","DOIUrl":"https://doi.org/10.1109/LRA.2026.3662586","url":null,"abstract":"Navigating socially in human environments requires more than satisfying geometric constraints, as collision-free paths may still interfere with ongoing activities or conflict with social norms. Addressing this challenge calls for analyzing interactions between agents and incorporating common-sense reasoning into planning. This paper presents a social robot navigation framework that integrates geometric planning with contextual social reasoning. The system first extracts obstacles and human dynamics to generate geometrically feasible candidate paths, then leverages a fine-tuned vision-language model (VLM) to evaluate these paths, informed by contextually grounded social expectations, selecting a socially optimized path for the controller. This task-specific VLM distills social reasoning from large foundation models into a smaller and efficient model, allowing the framework to perform real-time adaptation in diverse human–robot interaction contexts. Experiments in four social navigation contexts demonstrate that our method achieves the best overall performance with the lowest personal space violation duration, the minimal pedestrian-facing time, and no social zone intrusions.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 4","pages":"3947-3954"},"PeriodicalIF":5.3,"publicationDate":"2026-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146216636","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents a soft gripper system in which actuation, sensing, and signal processing are all realized using soft dielectric elastomer (DE) devices. The platform integrates three functional units: a multilayer dielectric elastomer actuator (DEA) module that drives a compliant two-finger gripper, two tactile sensors mounted on the fingertips for contact detection and coarse shape recognition, and a dielectric elastomer switch (DES) that enables control via soft electronics. The DEA module consists of eight parallel multilayer actuators, each comprising four active layers and a nonlinear biasing spring to amplify stroke, and is powered by an embedded high-voltage supply operated at 3 kV. The DES operates by mechanically modulating the resistance of a stretchable piezoresistive electrode, providing reliable switching behavior under high voltage. Grasping tests demonstrate that the system achieves a maximum opening angle of 38° and can safely manipulate delicate objects, including cherries and eggs, without damage. These results demonstrate the feasibility of advancing toward fully soft robotic systems by integrating DE-based components for actuation, sensing, and signal processing.
{"title":"A Gripper System With Soft Dielectric Elastomer Functional Units for Actuation, Sensing, and Signal Processing","authors":"Junhao Ni;Moritz Scharff;Katherine Wilson;Hui Zhi Beh;Andreas Tairych;Andreas Richter;Iain Anderson;Gerald Gerlach;E.-F. Markus Vorrath","doi":"10.1109/LRA.2026.3662528","DOIUrl":"https://doi.org/10.1109/LRA.2026.3662528","url":null,"abstract":"This paper presents a soft gripper system in which actuation, sensing, and signal processing are all realized using soft dielectric elastomer (DE) devices. The platform integrates three functional units: a multilayer dielectric elastomer actuator (DEA) module that drives a compliant two-finger gripper, two tactile sensors mounted on the fingertips for contact detection and coarse shape recognition, and a dielectric elastomer switch (DES) that enables soft electronics control. The DEA module consists of eight parallel multilayer actuators, each comprising four active layers and a nonlinear biasing spring to amplify stroke, and is powered by an embedded high-voltage supply operated at 3 kV. The DES operates by mechanically modulating the resistance of a stretchable piezoresistive electrode, providing reliable switching behavior under high voltage. Grasping tests demonstrate that the system achieves a maximum opening angle of 38<inline-formula><tex-math>$^{circ }$</tex-math></inline-formula> and can safely manipulate delicate objects, including cherries and eggs, without damage. These results demonstrate the feasibility of advancing toward fully soft robotic systems by integrating DE-based components for actuation, sensing, and signal processing.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 3","pages":"3598-3605"},"PeriodicalIF":5.3,"publicationDate":"2026-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11373904","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146175861","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Accurate 3D scene flow estimation is critical for autonomous systems to navigate dynamic environments safely, but creating the necessary large-scale, manually annotated datasets remains a significant bottleneck for developing robust perception models. Current self-supervised methods struggle to match the performance of fully supervised approaches, especially in challenging long-range and adverse weather scenarios, while supervised methods are not scalable due to their reliance on expensive human labeling. We introduce DoGFlow, a novel self-supervised framework that recovers full 3D object motions for LiDAR scene flow estimation without requiring any manual ground truth annotations. This paper presents our cross-modal label transfer approach, where DoGFlow computes motion labels directly from 4D radar Doppler measurements and transfers them to the LiDAR domain using dynamic-aware association and ambiguity-resolved propagation. On the challenging MAN TruckScenes dataset, DoGFlow substantially outperforms existing self-supervised methods and improves label efficiency by enabling LiDAR backbones to achieve over 90% of fully supervised performance with only 10% of the ground truth data.
{"title":"DoGFlow: Self-Supervised LiDAR Scene Flow via Cross-Modal Doppler Guidance","authors":"Ajinkya Khoche;Qingwen Zhang;Yixi Cai;Sina Sharif Mansouri;Patric Jensfelt","doi":"10.1109/LRA.2026.3662592","DOIUrl":"https://doi.org/10.1109/LRA.2026.3662592","url":null,"abstract":"Accurate 3D scene flow estimation is critical for autonomous systems to navigate dynamic environments safely, but creating the necessary large-scale, manually annotated datasets remains a significant bottleneck for developing robust perception models. Current self-supervised methods struggle to match the performance of fully supervised approaches, especially in challenging long-range and adverse weather scenarios, while supervised methods are not scalable due to their reliance on expensive human labeling. We introduce DoGFlow, a novel self-supervised framework that recovers full 3D object motions for LiDAR scene flow estimation without requiring any manual ground truth annotations. This paper presents our cross-modal label transfer approach, where DoGFlow computes motion labels directly from 4D radar Doppler measurements and transfers them to the LiDAR domain using dynamic-aware association and ambiguity-resolved propagation. On the challenging MAN TruckScenes dataset, DoGFlow substantially outperforms existing self-supervised methods and improves label efficiency by enabling LiDAR backbones to achieve over 90% of fully supervised performance with only 10% of the ground truth data.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 3","pages":"3836-3843"},"PeriodicalIF":5.3,"publicationDate":"2026-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11373844","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146223613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-09. DOI: 10.1109/LRA.2026.3662617
Yifan Huang;Haoyuan Gu;Ziteng Liu;Chunfeng Yue;Changsheng Dai
Cell penetration and intracellular injection are indispensable procedures in many cell surgery tasks. Because of the complex layered structure of oocytes, conventional piezo-assisted penetration often causes unavoidable cellular damage. Moreover, the small volume of single cells and the nonlinear dynamics involved in the injection process make it challenging to achieve precise and rapid delivery of the injected material within the cytoplasm. This letter presents an integrated robotic system for automated oocyte penetration and intracellular sperm injection that enhances precision and minimizes cellular damage. A deep neural network enables robust segmentation of oocyte structural layers under low-contrast and partially occluded conditions, providing reliable feedback for penetration optimization. A dynamic model was developed to describe the interaction between the pump motion and the fluidic response. A model-predictive controller is then designed to compensate for delays and pressure deviations, ensuring smooth and accurate sperm delivery. Experimental validation demonstrates that the proposed system achieves penetration with a deformation of 4.76 ± 2.34 µm and precise sperm positioning within ±5 pixels (overshoot below 8 pixels), while ensuring a post-injection survival rate of 82.8%.
{"title":"Robotic Piezo-Assisted Oocyte Penetration and Intracytoplasmic Sperm Injection","authors":"Yifan Huang;Haoyuan Gu;Ziteng Liu;Chunfeng Yue;Changsheng Dai","doi":"10.1109/LRA.2026.3662617","DOIUrl":"https://doi.org/10.1109/LRA.2026.3662617","url":null,"abstract":"Cell penetration and intracellular injection are indispensable procedures in many cell surgery tasks. Because of the complex layered structure of oocytes, conventional piezo-assisted penetration often causesunavoidable cellular damage. Moreover, the small volume of single cells and the nonlinear dynamics involved in the injection process make it challenging to achieve precise and rapid delivery of the injected material within the cytoplasm. This letter presents an integrated robotic system for automated oocyte penetration and intracellular sperm injection that enhances precision and minimizes cellular damage. A deep neural network enables robust segmentation of oocyte structural layers under low-contrast and partially occluded conditions, providing reliable feedback for penetration optimization. A dynamic model was developed to describe the interaction between the pump motion and the fluidic response. A model-predictive controller is then designed to compensate for delays and pressure deviations, ensuring smooth and accurate sperm delivery. Experimental validation demonstrates that the proposed system achieves penetration with a deformation of 4.76 <inline-formula><tex-math>$pm$</tex-math></inline-formula> 2.34 <inline-formula><tex-math>$mu$</tex-math></inline-formula>m and precise sperm positioning within <inline-formula><tex-math>$pm$</tex-math></inline-formula> 5 pixels (overshoot below 8 pixels), while ensuring a post-injection survival rate of 82.8%.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 3","pages":"3764-3771"},"PeriodicalIF":5.3,"publicationDate":"2026-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146223643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-09. DOI: 10.1109/LRA.2026.3662616
Ziyuan Tang;Yitian Guo;Chenxi Xiao
Recent advancements in virtual reality and robotic teleoperation have greatly increased the variety of haptic information that must be conveyed to users. While existing haptic devices typically provide unimodal feedback to enhance situational awareness, a gap remains in their ability to deliver rich, multimodal sensory feedback encompassing force, pressure, and thermal sensations. To address this limitation, we present the Multimodal Feedback Exoskeleton (MFE), a hand exoskeleton designed to deliver hybrid haptic feedback. The MFE features 20 degrees of freedom for capturing hand pose. For force feedback, it employs an active mechanism capable of generating 3.5–8.1 N of pushing and pulling forces, enabling realistic interaction with deformable objects. The fingertips are equipped with flat actuators based on the electro-osmotic principle, providing pressure and vibration stimuli and achieving up to 2.47 kPa of contact pressure to render tactile sensations. For thermal feedback, the MFE integrates thermoelectric heat pumps capable of rendering temperatures from 10 °C to 55 °C. We validated the MFE by integrating it into a robotic teleoperation system using the X-Arm 6 and Inspire Hand manipulator. In user studies, participants successfully recognized and manipulated deformable objects and differentiated remote objects with varying temperatures. These results demonstrate that the MFE enhances situational awareness, as well as the usability and transparency of robotic teleoperation systems.
{"title":"MFE: A Multimodal Hand Exoskeleton With Interactive Force, Pressure and Thermo-Haptic Feedback","authors":"Ziyuan Tang;Yitian Guo;Chenxi Xiao","doi":"10.1109/LRA.2026.3662616","DOIUrl":"https://doi.org/10.1109/LRA.2026.3662616","url":null,"abstract":"Recent advancements in virtual reality and robotic teleoperation have greatly increased the variety of haptic information that must be conveyed to users. While existing haptic devices typically provide unimodal feedback to enhance situational awareness, a gap remains in their ability to deliver rich, multimodal sensory feedback encompassing force, pressure, and thermal sensations. To address this limitation, we present the <bold>Multimodal Feedback Exoskeleton (MFE)</b>, a hand exoskeleton designed to deliver hybrid haptic feedback. The MFE features 20 degrees of freedom for capturing hand pose. For force feedback, it employs an active mechanism capable of generating 3.5-8.1 N of pushing and pulling forces, enabling realistic interaction with deformable objects. The fingertips are equipped with flat actuators based on the electro-osmotic principle, providing pressure and vibration stimuli and achieving up to 2.47 kPa of contact pressure to render tactile sensations. For thermal feedback, the MFE integrates thermoelectric heat pumps capable of rendering temperatures from 10 <inline-formula><tex-math>$^circ$</tex-math></inline-formula>C to 55 <inline-formula><tex-math>$^circ$</tex-math></inline-formula>C. We validated the MFE by integrating it into a robotic teleoperation system using the X-Arm 6 and Inspire Hand manipulator. In user studies, participants successfully recognized and manipulated deformable objects and differentiated remote objects with varying temperatures. These results demonstrate that the MFE enhances situational awareness, as well as the usability and transparency of robotic teleoperation systems.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"11 3","pages":"3756-3763"},"PeriodicalIF":5.3,"publicationDate":"2026-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146223672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}