Bin Li;Bo Lu;Hongbin Lin;Yaxiang Wang;Fangxun Zhong;Qi Dou;Yun-Hui Liu
IEEE Transactions on Medical Robotics and Bionics
DOI: 10.1109/TMRB.2024.3377357
Published: 2024-03-14 (Journal Article)
https://ieeexplore.ieee.org/document/10472639/
On the Monocular 3-D Pose Estimation for Arbitrary Shaped Needle in Dynamic Scenes: An Efficient Visual Learning and Geometry Modeling Approach
Image-guided needle pose estimation is crucial for robotic autonomous suturing, but it poses significant challenges due to the needle's slender visual projection and dynamic surgical environments. Current state-of-the-art methods rely on additional prior information (e.g., in-hand grasp, accurate kinematics, etc.) to achieve sub-millimeter accuracy, hindering their applicability across varying surgical scenes. This paper presents a new generic framework for monocular needle pose estimation: a visual learning network for efficient geometric-primitive extraction and a novel geometry model for accurate pose recovery. To capture the needle's primitives precisely, we introduce a morphology-based mask–contour fusion mechanism applied in a multi-scale manner. We then establish a novel state representation for the needle pose and develop a physical projection model to derive its relationship with the primitives. An anti-occlusion objective is formulated to jointly optimize the pose and the bias of the inferred primitives, achieving sub-millimeter accuracy under occlusion. Our approach requires neither a CAD model nor a circular-shape assumption and can be extended to estimate poses of other small planar axisymmetric objects. Experiments in ex-vivo and in-vivo scenarios validate the accuracy of the estimated intermediate primitives and the final needle poses. We further deploy our framework on the dVRK platform for automatic and precise needle manipulation, demonstrating its feasibility for use in robotic surgery.
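To make the core geometric idea concrete, the sketch below shows a generic pinhole reprojection objective for recovering a planar object's pose from 2D primitives. This is not the paper's actual state representation or anti-occlusion objective; the camera intrinsics, pose parameterization (a single in-plane rotation plus translation), and sampled needle points are all illustrative assumptions.

```python
# Illustrative sketch only: a generic pinhole reprojection objective of the
# kind used to relate a 3D needle pose to its 2D image primitives. The
# parameterization and numbers here are assumptions, not the paper's model.
import math

def project(point_3d, fx=800.0, fy=800.0, cx=320.0, cy=240.0):
    """Project a camera-frame 3D point to pixel coordinates (pinhole model)."""
    x, y, z = point_3d
    return (fx * x / z + cx, fy * y / z + cy)

def transform(pose, point):
    """Apply a hypothetical pose (rotation about the camera z-axis
    plus translation) to a 3D model point."""
    theta, tx, ty, tz = pose
    x, y, z = point
    xr = math.cos(theta) * x - math.sin(theta) * y + tx
    yr = math.sin(theta) * x + math.cos(theta) * y + ty
    return (xr, yr, z + tz)

def reprojection_error(pose, model_points, observed_2d):
    """Sum of squared pixel residuals between projected model points and
    the 2D primitives detected in the image."""
    err = 0.0
    for p, obs in zip(model_points, observed_2d):
        u, v = project(transform(pose, p))
        err += (u - obs[0]) ** 2 + (v - obs[1]) ** 2
    return err

# Needle body sampled as a few points along an arbitrary planar curve
# (no circular-shape assumption needed for this objective).
model = [(0.00, 0.000, 0.0), (0.01, 0.004, 0.0), (0.02, 0.006, 0.0)]
true_pose = (0.3, 0.01, -0.005, 0.25)
observations = [project(transform(true_pose, p)) for p in model]

# The true pose yields zero residual; a perturbed pose does not.
print(reprojection_error(true_pose, model, observations))        # 0.0
print(reprojection_error((0.0, 0.0, 0.0, 0.25), model, observations) > 1.0)
```

In practice such an objective is minimized numerically (e.g., by nonlinear least squares); the paper's contribution lies in the specific state representation, projection model, and occlusion-robust formulation, none of which this toy example reproduces.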