Pub Date: 2024-10-30 | DOI: 10.1109/LRA.2024.3488400
L. Nowakowski;R. V. Patel
Accurately estimating tool-tissue interaction forces during robotics-assisted minimally invasive surgery is an important aspect of enabling haptics-based teleoperation. By collecting data on the state of a robot in a variety of configurations, neural networks can be trained to predict this interaction force. This paper extends existing work in this domain by collecting one of the largest known ground-truth force datasets for stationary as well as moving phantoms that replicate tissue motions found in clinical procedures. Existing methods and a new transformer-based architecture are evaluated to demonstrate the domain gap between stationary and moving phantom tissue data and the impact that data scaling has on each architecture's ability to generalize in the force estimation task. When trained on stationary tissue data, temporal networks were found to be more sensitive to the moving domain than single-sample Feed-Forward Networks (FFNs). However, the transformer approach achieves the lowest Root Mean Square Error (RMSE) among networks trained on examples of both stationary and moving phantom tissue. The results demonstrate the domain gap between stationary and moving surgical environments and the effectiveness of scaling datasets for increased accuracy of interaction force prediction.
"Learning Based Estimation of Tool-Tissue Interaction Forces for Stationary and Moving Environments," IEEE Robotics and Automation Letters, vol. 9, no. 12, pp. 11266-11273.
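The paper reports force-prediction accuracy as RMSE. As a quick illustration (not the authors' evaluation code), the metric over a set of 3-axis force predictions can be computed as:

```python
import numpy as np

def rmse(pred, target):
    """Root Mean Square Error between predicted and ground-truth forces."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.sqrt(np.mean((pred - target) ** 2)))

# Toy 3-axis force samples in newtons (illustrative values only).
pred = np.array([[0.1, 0.0, 1.2], [0.2, -0.1, 1.0]])
truth = np.array([[0.0, 0.0, 1.0], [0.2, 0.0, 1.1]])
error = rmse(pred, truth)  # scalar RMSE pooled over all axes and samples
```

Pooling the error over all axes, as here, is one common convention; per-axis RMSE is another, and the paper's exact aggregation is not specified in the abstract.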
Pub Date: 2024-10-29 | DOI: 10.1109/LRA.2024.3487490
Grzegorz Bartyzel
Transferability, along with sample efficiency, is a critical factor for a reinforcement learning (RL) agent's successful application to real-world contact-rich manipulation tasks such as product assembly. For instance, in the case of the industrial insertion task on high-mix, low-volume (HMLV) production lines, transferability could eliminate the need for machine retooling, thus reducing production line downtimes. In our work, we introduce a method called Multimodal Variational DeepMDP (MVDeepMDP) that demonstrates the ability to generalize to various environmental variations not encountered during training. The key feature of our approach is learning a multimodal latent dynamic representation. We demonstrate the effectiveness of our method on an electronic parts insertion task, which is challenging for RL agents due to the diverse physical properties of the non-standardized components, as well as on the insertion of simple 3D-printed blocks. Furthermore, we evaluate the transferability of MVDeepMDP and analyze the impact of the balancing mechanism of the generalized Product-of-Experts
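The generalized Product-of-Experts (gPoE) mentioned above fuses per-modality Gaussian beliefs by weighting each expert's precision. A minimal sketch of the standard gPoE fusion rule is below; the weights `alphas` play the role of the balancing mechanism, and how MVDeepMDP parameterizes the experts is an assumption here, not taken from the abstract.

```python
import numpy as np

def gpoe_fuse(mus, vars_, alphas):
    """Fuse Gaussian experts with a generalized Product-of-Experts.

    Each expert i contributes N(mu_i, var_i), weighted by alpha_i; the
    fused precision is the alpha-weighted sum of the expert precisions,
    and the fused mean is the precision-weighted average of the means.
    """
    mus, vars_, alphas = map(np.asarray, (mus, vars_, alphas))
    prec = alphas / vars_              # alpha-weighted precisions
    fused_prec = prec.sum(axis=0)
    fused_mu = (prec * mus).sum(axis=0) / fused_prec
    return fused_mu, 1.0 / fused_prec

# Two modalities with equal weights: the fused mean lies between the
# expert means, pulled toward the more confident (lower-variance) one.
mu, var = gpoe_fuse(mus=[0.0, 1.0], vars_=[0.5, 1.0], alphas=[0.5, 0.5])
```

With equal weights the rule reduces to a precision-weighted average, so down-weighting a modality's alpha lets the fusion lean on the other modality, which is the balancing behavior the abstract refers to.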