While tactile sensing is widely accepted as an important and useful sensing modality, its use pales in comparison to other sensory modalities like vision and proprioception. AnySkin addresses the critical challenges that impede the use of tactile sensing -- versatility, replaceability, and data reusability. Building on the simple design of ReSkin and decoupling the sensing electronics from the sensing interface, AnySkin makes integration as straightforward as putting on a phone case and connecting a charger. Furthermore, AnySkin is the first uncalibrated tactile sensor with cross-instance generalizability of learned manipulation policies. To summarize, this work makes three key contributions: first, we introduce a streamlined fabrication process and a design tool for creating an adhesive-free, durable, and easily replaceable magnetic tactile sensor; second, we characterize slip detection and policy learning with the AnySkin sensor; and third, we demonstrate zero-shot generalization of models trained on one instance of AnySkin to new instances, and compare it with popular existing tactile solutions like DIGIT and ReSkin. Project page: https://any-skin.github.io/
{"title":"AnySkin: Plug-and-play Skin Sensing for Robotic Touch","authors":"Raunaq Bhirangi, Venkatesh Pattabiraman, Enes Erciyes, Yifeng Cao, Tess Hellebrekers, Lerrel Pinto","doi":"arxiv-2409.08276","DOIUrl":"https://doi.org/arxiv-2409.08276","url":null,"abstract":"While tactile sensing is widely accepted as an important and useful sensing\u0000modality, its use pales in comparison to other sensory modalities like vision\u0000and proprioception. AnySkin addresses the critical challenges that impede the\u0000use of tactile sensing -- versatility, replaceability, and data reusability.\u0000Building on the simplistic design of ReSkin, and decoupling the sensing\u0000electronics from the sensing interface, AnySkin simplifies integration making\u0000it as straightforward as putting on a phone case and connecting a charger.\u0000Furthermore, AnySkin is the first uncalibrated tactile-sensor with\u0000cross-instance generalizability of learned manipulation policies. To summarize,\u0000this work makes three key contributions: first, we introduce a streamlined\u0000fabrication process and a design tool for creating an adhesive-free, durable\u0000and easily replaceable magnetic tactile sensor; second, we characterize slip\u0000detection and policy learning with the AnySkin sensor; and third, we\u0000demonstrate zero-shot generalization of models trained on one instance of\u0000AnySkin to new instances, and compare it with popular existing tactile\u0000solutions like DIGIT and ReSkin.https://any-skin.github.io/","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we present a novel approach to the development and deployment of an autonomous mosquito-breeding-site detector rover with object and obstacle detection capabilities for mosquito control. Mosquito-borne diseases continue to pose significant health threats globally, and conventional control methods are slow and inefficient. Amidst rising concerns over the rapid spread of these diseases, there is an urgent need for innovative, efficient strategies to manage mosquito populations and prevent disease transmission. To overcome the limitations of manual labor and traditional methods, our rover employs autonomous control strategies. Leveraging our own custom dataset, the rover autonomously navigates along a pre-defined path, identifying potential breeding grounds with precision and then eliminating them by spraying a chemical agent, effectively eradicating mosquito habitats. Our project demonstrates an effectiveness that traditional approaches to mosquito control and public health protection lack. The code for this project is available on GitHub at https://github.com/faiyazabdullah/MosquitoMiner
{"title":"MosquitoMiner: A Light Weight Rover for Detecting and Eliminating Mosquito Breeding Sites","authors":"Md. Adnanul Islam, Md. Faiyaz Abdullah Sayeedi, Jannatul Ferdous Deepti, Shahanur Rahman Bappy, Safrin Sanzida Islam, Fahim Hafiz","doi":"arxiv-2409.08078","DOIUrl":"https://doi.org/arxiv-2409.08078","url":null,"abstract":"In this paper, we present a novel approach to the development and deployment\u0000of an autonomous mosquito breeding place detector rover with the object and\u0000obstacle detection capabilities to control mosquitoes. Mosquito-borne diseases\u0000continue to pose significant health threats globally, with conventional control\u0000methods proving slow and inefficient. Amidst rising concerns over the rapid\u0000spread of these diseases, there is an urgent need for innovative and efficient\u0000strategies to manage mosquito populations and prevent disease transmission. To\u0000mitigate the limitations of manual labor and traditional methods, our rover\u0000employs autonomous control strategies. Leveraging our own custom dataset, the\u0000rover can autonomously navigate along a pre-defined path, identifying and\u0000mitigating potential breeding grounds with precision. It then proceeds to\u0000eliminate these breeding grounds by spraying a chemical agent, effectively\u0000eradicating mosquito habitats. Our project demonstrates the effectiveness that\u0000is absent in traditional ways of controlling and safeguarding public health.\u0000The code for this project is available on GitHub at -\u0000https://github.com/faiyazabdullah/MosquitoMiner","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sunny Katyara, Suchita Sharma, Praveen Damacharla, Carlos Garcia Santiago, Francis O'Farrell, Philip Long
Designing an efficient and resilient human-robot collaboration strategy that not only upholds the safety and ergonomics of a shared workspace but also enhances the performance and agility of the collaborative setup presents significant challenges in environment perception and robot control. In this research, we introduce a novel approach to collaborative environment monitoring and robot motion regulation that addresses this multifaceted problem. Our study proposes a novel computation and division of safety monitoring zones, adhering to the ISO 13855 and ISO/TS 15066 standards and utilizing 2D laser information. These zones are not only configured in the standard three-layer arrangement but are also expanded into two adjacent quadrants, thereby enhancing system uptime and preventing unnecessary deadlocks. Moreover, we leverage 3D visual information to track dynamic human articulations and extended intrusions. Drawing upon the fused sensory data from the 2D and 3D perceptual spaces, our proposed hierarchical controller stably regulates robot velocity, with stability validated using the LaSalle invariance principle. Empirical evaluations demonstrate that our approach significantly reduces task execution time and system response delay, resulting in improved efficiency and resilience in collaborative settings.
{"title":"Collaborating for Success: Optimizing System Efficiency and Resilience Under Agile Industrial Settings","authors":"Sunny Katyara, Suchita Sharma, Praveen Damacharla, Carlos Garcia Santiago, Francis O'Farrell, Philip Long","doi":"arxiv-2409.08166","DOIUrl":"https://doi.org/arxiv-2409.08166","url":null,"abstract":"Designing an efficient and resilient human-robot collaboration strategy that\u0000not only upholds the safety and ergonomics of shared workspace but also\u0000enhances the performance and agility of collaborative setup presents\u0000significant challenges concerning environment perception and robot control. In\u0000this research, we introduce a novel approach for collaborative environment\u0000monitoring and robot motion regulation to address this multifaceted problem.\u0000Our study proposes novel computation and division of safety monitoring zones,\u0000adhering to ISO 13855 and TS 15066 standards, utilizing 2D lasers information.\u0000These zones are not only configured in the standard three-layer arrangement but\u0000are also expanded into two adjacent quadrants, thereby enhancing system uptime\u0000and preventing unnecessary deadlocks. Moreover, we also leverage 3D visual\u0000information to track dynamic human articulations and extended intrusions.\u0000Drawing upon the fused sensory data from 2D and 3D perceptual spaces, our\u0000proposed hierarchical controller stably regulates robot velocity, validated\u0000using Lasalle in-variance principle. Empirical evaluations demonstrate that our\u0000approach significantly reduces task execution time and system response delay,\u0000resulting in improved efficiency and resilience within collaborative settings.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222093","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ashwini Gundappa, Emilia Ellsiepen, Lukas Schmitz, Frederik Wiehr, Vera Demberg
The question of how cyber-physical systems should interact with human partners who can take over control or exert oversight is becoming more pressing as these systems are deployed for an ever larger range of tasks. Drawing on the literature on control handover in semi-autonomous driving and human-robot interaction, we propose a take-over request (TOR) design that combines an abstract pre-alert with an informative TOR: relevant sensor information is highlighted on the controller's display, while a spoken message verbalizes the reason for the TOR. We use a semi-autonomous drone control scenario as our testbed. The goal of our online study is to assess in more detail what form a language-based TOR should take. Specifically, we compare a full-sentence condition to shorter fragments, and test whether the visual highlighting should occur synchronously or asynchronously with the speech. Participants were more accurate in choosing the correct solution with our bi-modal TOR and felt better able to recognize the critical situation. Using only fragments in the spoken message rather than full sentences did not improve accuracy or speed up reactions. Likewise, synchronizing the visual highlighting with the spoken message did not improve accuracy, and response times even increased in this condition.
{"title":"The Design of Informative Take-Over Requests for Semi-Autonomous Cyber-Physical Systems: Combining Spoken Language and Visual Icons in a Drone-Controller Setting","authors":"Ashwini Gundappa, Emilia Ellsiepen, Lukas Schmitz, Frederik Wiehr, Vera Demberg","doi":"arxiv-2409.08253","DOIUrl":"https://doi.org/arxiv-2409.08253","url":null,"abstract":"The question of how cyber-physical systems should interact with human\u0000partners that can take over control or exert oversight is becoming more\u0000pressing, as these systems are deployed for an ever larger range of tasks.\u0000Drawing on the literatures on handing over control during semi-autonomous\u0000driving and human-robot interaction, we propose a design of a take-over request\u0000that combines an abstract pre-alert with an informative TOR: Relevant sensor\u0000information is highlighted on the controller's display, while a spoken message\u0000verbalizes the reason for the TOR. We conduct our study in the context of a\u0000semi-autonomous drone control scenario as our testbed. The goal of our online\u0000study is to assess in more detail what form a language-based TOR should take.\u0000Specifically, we compare a full sentence condition to shorter fragments, and\u0000test whether the visual highlighting should be done synchronously or\u0000asynchronously with the speech. Participants showed a higher accuracy in\u0000choosing the correct solution with our bi-modal TOR and felt that they were\u0000better able to recognize the critical situation. Using only fragments in the\u0000spoken message rather than full sentences did not lead to improved accuracy or\u0000faster reactions. Also, synchronizing the visual highlighting with the spoken\u0000message did not result in better accuracy and response times were even\u0000increased in this condition.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Teng Yan, Zhendong Ruan, Yaobang Cai, Yu Han, Wenxian Li, Yang Zhang
As a data-driven paradigm, offline reinforcement learning (offline RL) has been formulated as sequence modeling, where the Decision Transformer (DT) has demonstrated exceptional capabilities. Unlike previous reinforcement learning methods that fit value functions or compute policy gradients, DT conditions an autoregressive model on expected returns, past states, and actions, using a causally masked Transformer to output the optimal action. However, due to the inconsistency between the sampled returns within a single trajectory and the optimal returns across multiple trajectories, it is challenging to set an expected return that yields the optimal action and stitches together suboptimal trajectories. Compared to DT, the Decision ConvFormer (DC) is easier to interpret as modeling RL trajectories within a Markov decision process. We propose the Q-value Regularized Decision ConvFormer (QDC), which combines DC's modeling of RL trajectories with a training term that maximizes action values using dynamic programming. This ensures that the expected returns of the sampled actions are consistent with the optimal returns. QDC achieves excellent performance on the D4RL benchmark, outperforming or approaching the optimal level in all tested environments, and is particularly competitive in trajectory stitching.
{"title":"Q-value Regularized Decision ConvFormer for Offline Reinforcement Learning","authors":"Teng Yan, Zhendong Ruan, Yaobang Cai, Yu Han, Wenxian Li, Yang Zhang","doi":"arxiv-2409.08062","DOIUrl":"https://doi.org/arxiv-2409.08062","url":null,"abstract":"As a data-driven paradigm, offline reinforcement learning (Offline RL) has\u0000been formulated as sequence modeling, where the Decision Transformer (DT) has\u0000demonstrated exceptional capabilities. Unlike previous reinforcement learning\u0000methods that fit value functions or compute policy gradients, DT adjusts the\u0000autoregressive model based on the expected returns, past states, and actions,\u0000using a causally masked Transformer to output the optimal action. However, due\u0000to the inconsistency between the sampled returns within a single trajectory and\u0000the optimal returns across multiple trajectories, it is challenging to set an\u0000expected return to output the optimal action and stitch together suboptimal\u0000trajectories. Decision ConvFormer (DC) is easier to understand in the context\u0000of modeling RL trajectories within a Markov Decision Process compared to DT. We\u0000propose the Q-value Regularized Decision ConvFormer (QDC), which combines the\u0000understanding of RL trajectories by DC and incorporates a term that maximizes\u0000action values using dynamic programming methods during training. This ensures\u0000that the expected returns of the sampled actions are consistent with the\u0000optimal returns. QDC achieves excellent performance on the D4RL benchmark,\u0000outperforming or approaching the optimal level in all tested environments. It\u0000particularly demonstrates outstanding competitiveness in trajectory stitching\u0000capability.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
William Thibault, Vidyasagar Rajendran, William Melek, Katja Mombaur
Learning-based methods have proven useful for generating complex motions for robots, including humanoids. Reinforcement learning (RL) has been used to learn locomotion policies, some of which leverage a periodic reward formulation. This work extends the periodic reward formulation of locomotion to skateboarding for the REEM-C robot. Brax/MJX is used to implement the RL problem and achieve fast training. Initial simulation results are presented, with hardware experiments in progress.
{"title":"Learning Skateboarding for Humanoid Robots through Massively Parallel Reinforcement Learning","authors":"William Thibault, Vidyasagar Rajendran, William Melek, Katja Mombaur","doi":"arxiv-2409.07846","DOIUrl":"https://doi.org/arxiv-2409.07846","url":null,"abstract":"Learning-based methods have proven useful at generating complex motions for\u0000robots, including humanoids. Reinforcement learning (RL) has been used to learn\u0000locomotion policies, some of which leverage a periodic reward formulation. This\u0000work extends the periodic reward formulation of locomotion to skateboarding for\u0000the REEM-C robot. Brax/MJX is used to implement the RL problem to achieve fast\u0000training. Initial results in simulation are presented with hardware experiments\u0000in progress.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222098","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Samanta Rodriguez, Yiming Dou, Miquel Oller, Andrew Owens, Nima Fazeli
Today's touch sensors come in many shapes and sizes. This has made it challenging to develop general-purpose touch processing methods since models are generally tied to one specific sensor design. We address this problem by performing cross-modal prediction between touch sensors: given the tactile signal from one sensor, we use a generative model to estimate how the same physical contact would be perceived by another sensor. This allows us to apply sensor-specific methods to the generated signal. We implement this idea by training a diffusion model to translate between the popular GelSlim and Soft Bubble sensors. As a downstream task, we perform in-hand object pose estimation using GelSlim sensors while using an algorithm that operates only on Soft Bubble signals. The dataset, the code, and additional details can be found at https://www.mmintlab.com/research/touch2touch/.
{"title":"Touch2Touch: Cross-Modal Tactile Generation for Object Manipulation","authors":"Samanta Rodriguez, Yiming Dou, Miquel Oller, Andrew Owens, Nima Fazeli","doi":"arxiv-2409.08269","DOIUrl":"https://doi.org/arxiv-2409.08269","url":null,"abstract":"Today's touch sensors come in many shapes and sizes. This has made it\u0000challenging to develop general-purpose touch processing methods since models\u0000are generally tied to one specific sensor design. We address this problem by\u0000performing cross-modal prediction between touch sensors: given the tactile\u0000signal from one sensor, we use a generative model to estimate how the same\u0000physical contact would be perceived by another sensor. This allows us to apply\u0000sensor-specific methods to the generated signal. We implement this idea by\u0000training a diffusion model to translate between the popular GelSlim and Soft\u0000Bubble sensors. As a downstream task, we perform in-hand object pose estimation\u0000using GelSlim sensors while using an algorithm that operates only on Soft\u0000Bubble signals. The dataset, the code, and additional details can be found at\u0000https://www.mmintlab.com/research/touch2touch/.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhanyue Zhao, Yang Wang, Charles Bales, Daniel Ruiz-Cadalso, Howard Zheng, Cosme Furlong-Vazquez, Gregory Fischer
Piezoelectric ultrasonic motors offer the advantages of compact design, faster response, and simpler setup compared to other actuation units such as pneumatic and hydraulic motors; in particular, their non-ferromagnetic construction makes them a far better fit for MRI-compatible robotic systems than traditional DC motors. Hollow-shaft motors add further advantages: they are lightweight yet comparable to solid shafts of the same diameter, have low rotational inertia, tolerate rotational imbalance well owing to their low weight, and tolerate high temperatures owing to their low specific mass. This article presents a prototype of a hollow cylindrical ultrasonic motor (HCM) designed to provide direct drive, eliminate mechanical non-linearity, and reduce the size and complexity of the actuator or end-effector assembly. Two equivalent HCMs are presented in this work; under 50 g of prepressure on the rotor, they achieved a rotation speed of 383.3333 rpm and a torque output of 57.3504 mNm with a 282 $V_{pp}$ driving voltage.
{"title":"Characterization and Design of A Hollow Cylindrical Ultrasonic Motor","authors":"Zhanyue Zhao, Yang Wang, Charles Bales, Daniel Ruiz-Cadalso, Howard Zheng, Cosme Furlong-Vazquez, Gregory Fischer","doi":"arxiv-2409.07690","DOIUrl":"https://doi.org/arxiv-2409.07690","url":null,"abstract":"Piezoelectric ultrasonic motors perform the advantages of compact design,\u0000faster reaction time, and simpler setup compared to other motion units such as\u0000pneumatic and hydraulic motors, especially its non-ferromagnetic property makes\u0000it a perfect match in MRI-compatible robotics systems compared to traditional\u0000DC motors. Hollow shaft motors address the advantages of being lightweight and\u0000comparable to solid shafts of the same diameter, low rotational inertia, high\u0000tolerance to rotational imbalance due to low weight, and tolerance to high\u0000temperature due to low specific mass. This article presents a prototype of a\u0000hollow cylindrical ultrasonic motor (HCM) to perform direct drive, eliminate\u0000mechanical non-linearity, and reduce the size and complexity of the actuator or\u0000end effector assembly. Two equivalent HCMs are presented in this work, and\u0000under 50g prepressure on the rotor, it performed 383.3333rpm rotation speed and\u000057.3504mNm torque output when applying 282$V_{pp}$ driving voltage.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiaohan Zhu, Ran Bu, Zhen Li, Fan Xu, Hesheng Wang
Soft manipulators are well suited to safety-critical interaction tasks, e.g., robot-assisted surgery and elderly care. Yet challenges in real-time contact feedback have hindered their further application to precise manipulation. This paper proposes an end-to-end network to estimate the 3D contact force of a soft robot, with the aim of enhancing its capabilities in interactive tasks. The presented method directly uses monocular images fused with multidimensional actuation information as the network inputs. This approach simplifies the preprocessing of raw data compared to related studies that use 3D shape information as network inputs, consequently reducing configuration reconstruction errors. A unified feature representation module is devised to elevate low-dimensional features from the system's actuation signals to the same level as the image features, facilitating smoother integration of multimodal information. The proposed method has been experimentally validated on a soft robot testbed, achieving satisfactory accuracy in 3D force estimation (a mean relative error of 0.84%, compared with the best reported result of 2.2% in related work).
{"title":"A three-dimensional force estimation method for the cable-driven soft robot based on monocular images","authors":"Xiaohan Zhu, Ran Bu, Zhen Li, Fan Xu, Hesheng Wang","doi":"arxiv-2409.08033","DOIUrl":"https://doi.org/arxiv-2409.08033","url":null,"abstract":"Soft manipulators are known for their superiority in coping with\u0000high-safety-demanding interaction tasks, e.g., robot-assisted surgeries,\u0000elderly caring, etc. Yet the challenges residing in real-time contact feedback\u0000have hindered further applications in precise manipulation. This paper proposes\u0000an end-to-end network to estimate the 3D contact force of the soft robot, with\u0000the aim of enhancing its capabilities in interactive tasks. The presented\u0000method features directly utilizing monocular images fused with multidimensional\u0000actuation information as the network inputs. This approach simplifies the\u0000preprocessing of raw data compared to related studies that utilize 3D shape\u0000information for network inputs, consequently reducing configuration\u0000reconstruction errors. The unified feature representation module is devised to\u0000elevate low-dimensional features from the system's actuation signals to the\u0000same level as image features, facilitating smoother integration of multimodal\u0000information. The proposed method has been experimentally validated in the soft\u0000robot testbed, achieving satisfying accuracy in 3D force estimation (with a\u0000mean relative error of 0.84% compared to the best-reported result of 2.2% in\u0000the related works).","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Simon de Moreau, Yasser Almehio, Andrei Bursuc, Hafid El-Idrissi, Bogdan Stanciulescu, Fabien Moutarde
Nighttime camera-based depth estimation is a highly challenging task, especially for autonomous driving applications, where accurate depth perception is essential for safe navigation. We aim to improve the reliability of perception systems at night, where models trained on daytime data often fail in the absence of precise but costly LiDAR sensors. In this work, we introduce Light Enhanced Depth (LED), a novel, cost-effective approach that significantly improves depth estimation in low-light environments by harnessing the pattern projected by the high-definition headlights available in modern vehicles. LED yields significant performance gains across multiple depth-estimation architectures (encoder-decoder, AdaBins, DepthFormer) on both synthetic and real datasets. Furthermore, improved performance beyond the illuminated areas reveals a holistic enhancement in scene understanding. Finally, we release the Nighttime Synthetic Drive Dataset, a new synthetic, photo-realistic nighttime dataset comprising 49,990 comprehensively annotated images.
{"title":"LED: Light Enhanced Depth Estimation at Night","authors":"Simon de Moreau, Yasser Almehio, Andrei Bursuc, Hafid El-Idrissi, Bogdan Stanciulescu, Fabien Moutarde","doi":"arxiv-2409.08031","DOIUrl":"https://doi.org/arxiv-2409.08031","url":null,"abstract":"Nighttime camera-based depth estimation is a highly challenging task,\u0000especially for autonomous driving applications, where accurate depth perception\u0000is essential for ensuring safe navigation. We aim to improve the reliability of\u0000perception systems at night time, where models trained on daytime data often\u0000fail in the absence of precise but costly LiDAR sensors. In this work, we\u0000introduce Light Enhanced Depth (LED), a novel cost-effective approach that\u0000significantly improves depth estimation in low-light environments by harnessing\u0000a pattern projected by high definition headlights available in modern vehicles.\u0000LED leads to significant performance boosts across multiple depth-estimation\u0000architectures (encoder-decoder, Adabins, DepthFormer) both on synthetic and\u0000real datasets. Furthermore, increased performances beyond illuminated areas\u0000reveal a holistic enhancement in scene understanding. Finally, we release the\u0000Nighttime Synthetic Drive Dataset, a new synthetic and photo-realistic\u0000nighttime dataset, which comprises 49,990 comprehensively annotated images.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222126","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}