{"title":"Self-Supervised Cyclic Diffeomorphic Mapping for Soft Tissue Deformation Recovery in Robotic Surgery Scenes.","authors":"Shizhan Gong, Yonghao Long, Kai Chen, Jiaqi Liu, Yuliang Xiao, Alexis Cheng, Zerui Wang, Qi Dou","doi":"10.1109/TMI.2024.3439701","DOIUrl":null,"url":null,"abstract":"<p><p>The ability to recover tissue deformation from visual features is fundamental for many robotic surgery applications. This has been a long-standing research topic in computer vision, however, is still unsolved due to complex dynamics of soft tissues when being manipulated by surgical instruments. The ambiguous pixel correspondence caused by homogeneous texture makes achieving dense and accurate tissue tracking even more challenging. In this paper, we propose a novel self-supervised framework to recover tissue deformations from stereo surgical videos. Our approach integrates semantics, cross-frame motion flow, and long-range temporal dependencies to enable the recovered deformations to represent actual tissue dynamics. Moreover, we incorporate diffeomorphic mapping to regularize the warping field to be physically realistic. To comprehensively evaluate our method, we collected stereo surgical video clips containing three types of tissue manipulation (i.e., pushing, dissection and retraction) from two different types of surgeries (i.e., hemicolectomy and mesorectal excision). Our method has achieved impressive results in capturing deformation in 3D mesh, and generalized well across manipulations and surgeries. It also outperforms current state-of-the-art methods on non-rigid registration and optical flow estimation. To the best of our knowledge, this is the first work on self-supervised learning for dense tissue deformation modeling from stereo surgical videos. Our code will be released.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on medical imaging","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TMI.2024.3439701","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
The ability to recover tissue deformation from visual features is fundamental for many robotic surgery applications. This has been a long-standing research topic in computer vision; however, it remains unsolved due to the complex dynamics of soft tissues when they are manipulated by surgical instruments. The ambiguous pixel correspondence caused by homogeneous texture makes dense and accurate tissue tracking even more challenging. In this paper, we propose a novel self-supervised framework to recover tissue deformations from stereo surgical videos. Our approach integrates semantics, cross-frame motion flow, and long-range temporal dependencies so that the recovered deformations represent actual tissue dynamics. Moreover, we incorporate diffeomorphic mapping to regularize the warping field to be physically realistic. To comprehensively evaluate our method, we collected stereo surgical video clips containing three types of tissue manipulation (i.e., pushing, dissection, and retraction) from two different types of surgeries (i.e., hemicolectomy and mesorectal excision). Our method achieves impressive results in capturing deformation as a 3D mesh and generalizes well across manipulation types and surgeries. It also outperforms current state-of-the-art methods on non-rigid registration and optical flow estimation. To the best of our knowledge, this is the first work on self-supervised learning for dense tissue deformation modeling from stereo surgical videos. Our code will be released.
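The abstract does not spell out how the diffeomorphic mapping or the cyclic self-supervision are implemented. A common construction in learning-based deformable registration (e.g., VoxelMorph-diff) obtains a diffeomorphic warp by integrating a stationary velocity field via scaling and squaring; its inverse then comes for free by negating the velocity, which pairs naturally with a cyclic photometric loss. The PyTorch sketch below illustrates only that general machinery under those assumptions; all names (`integrate_velocity`, `cyclic_photometric_loss`, `num_steps`, etc.) are illustrative and are not the authors' released code.

```python
# Minimal sketch (not the paper's implementation): diffeomorphic warping by
# scaling-and-squaring of a stationary velocity field, plus a cyclic
# photometric loss. Displacements are in grid_sample's normalized [-1, 1] units.
import torch
import torch.nn.functional as F

def make_base_grid(h: int, w: int, device) -> torch.Tensor:
    """Identity sampling grid in [-1, 1], shape (1, H, W, 2), x before y."""
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h, device=device),
        torch.linspace(-1, 1, w, device=device),
        indexing="ij",
    )
    return torch.stack((xs, ys), dim=-1).unsqueeze(0)

def warp(image: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Backward-warp a (B, C, H, W) image by a (B, 2, H, W) displacement:
    output(x) = image(x + flow(x))."""
    b, _, h, w = flow.shape
    grid = make_base_grid(h, w, flow.device) + flow.permute(0, 2, 3, 1)
    return F.grid_sample(image, grid, align_corners=True, padding_mode="border")

def integrate_velocity(velocity: torch.Tensor, num_steps: int = 7) -> torch.Tensor:
    """Scaling and squaring: start from v / 2^K and compose the field with
    itself K times. The result approximates exp(v), a smooth, invertible
    (diffeomorphic) warp, which keeps the deformation physically plausible."""
    flow = velocity / (2 ** num_steps)
    for _ in range(num_steps):
        flow = flow + warp(flow, flow)  # phi <- phi o phi
    return flow

def cyclic_photometric_loss(frame_a, frame_b, velocity):
    """Cyclic self-supervision sketch: with a stationary velocity field the
    inverse warp is exp(-v), so we can penalize photometric error in both
    directions plus a round-trip (cycle) error, with no ground-truth flow."""
    flow_ab = integrate_velocity(velocity)    # A -> B
    flow_ba = integrate_velocity(-velocity)   # B -> A (free inverse)
    recon_a = warp(frame_b, flow_ab)          # sample B along A->B to rebuild A
    recon_b = warp(frame_a, flow_ba)
    cycle_a = warp(recon_b, flow_ab)          # round trip back onto frame A
    return (F.l1_loss(recon_a, frame_a)
            + F.l1_loss(recon_b, frame_b)
            + F.l1_loss(cycle_a, frame_a))
```

In a full pipeline along the lines the abstract describes, the velocity field would be predicted by a network from the stereo frames, and these photometric terms would be complemented by the semantic, motion-flow, and long-range temporal cues; the sketch isolates only the diffeomorphic and cyclic components.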