Bin Li;Bo Lu;Hongbin Lin;Yaxiang Wang;Fangxun Zhong;Qi Dou;Yun-Hui Liu
IEEE Transactions on Medical Robotics and Bionics
DOI: 10.1109/TMRB.2024.3377357
Published: 2024-03-14 (Journal Article)
https://ieeexplore.ieee.org/document/10472639/
On the Monocular 3-D Pose Estimation for Arbitrary Shaped Needle in Dynamic Scenes: An Efficient Visual Learning and Geometry Modeling Approach
Image-guided needle pose estimation is crucial for robotic autonomous suturing, but it poses significant challenges due to the needle's slender visual projection and dynamic surgical environments. Current state-of-the-art methods rely on additional prior information (e.g., in-hand grasp, accurate kinematics, etc.) to achieve sub-millimeter accuracy, hindering their applicability across varying surgical scenes. This paper presents a new generic framework for monocular needle pose estimation: a visual learning network for efficient geometric-primitive extraction and a novel geometry model for accurate pose recovery. To capture the needle's primitives precisely, we introduce a morphology-based mask–contour fusion mechanism applied in a multi-scale manner. We then establish a novel state representation for the needle pose and develop a physical projection model to derive its relationship with the primitives. An anti-occlusion objective is formulated to jointly optimize the pose and the bias of the inferred primitives, achieving sub-millimeter accuracy under occlusion. Our approach requires neither a CAD model nor a circular-shape assumption and can be extended to estimate poses of other small planar axisymmetric objects. Experiments in ex-vivo and in-vivo scenarios validate the accuracy of the estimated intermediate primitives and the final needle poses. We further deploy our framework on the dVRK platform for automatic and precise needle manipulation, demonstrating its feasibility for use in robotic surgery.
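To make the core geometric idea concrete, the sketch below shows a generic pinhole reprojection objective for recovering a planar object's pose from 2D primitives. This is not the paper's actual state representation or anti-occlusion objective; the camera intrinsics, pose parameterization (a single in-plane rotation plus translation), and sampled needle points are all illustrative assumptions.

```python
# Illustrative sketch only: a generic pinhole reprojection objective of the
# kind used to relate a 3D needle pose to its 2D image primitives. The
# parameterization and numbers here are assumptions, not the paper's model.
import math

def project(point_3d, fx=800.0, fy=800.0, cx=320.0, cy=240.0):
    """Project a camera-frame 3D point to pixel coordinates (pinhole model)."""
    x, y, z = point_3d
    return (fx * x / z + cx, fy * y / z + cy)

def transform(pose, point):
    """Apply a hypothetical pose (rotation about the camera z-axis
    plus translation) to a 3D model point."""
    theta, tx, ty, tz = pose
    x, y, z = point
    xr = math.cos(theta) * x - math.sin(theta) * y + tx
    yr = math.sin(theta) * x + math.cos(theta) * y + ty
    return (xr, yr, z + tz)

def reprojection_error(pose, model_points, observed_2d):
    """Sum of squared pixel residuals between projected model points and
    the 2D primitives detected in the image."""
    err = 0.0
    for p, obs in zip(model_points, observed_2d):
        u, v = project(transform(pose, p))
        err += (u - obs[0]) ** 2 + (v - obs[1]) ** 2
    return err

# Needle body sampled as a few points along an arbitrary planar curve
# (no circular-shape assumption needed for this objective).
model = [(0.00, 0.000, 0.0), (0.01, 0.004, 0.0), (0.02, 0.006, 0.0)]
true_pose = (0.3, 0.01, -0.005, 0.25)
observations = [project(transform(true_pose, p)) for p in model]

# The true pose yields zero residual; a perturbed pose does not.
print(reprojection_error(true_pose, model, observations))        # 0.0
print(reprojection_error((0.0, 0.0, 0.0, 0.25), model, observations) > 1.0)
```

In practice such an objective is minimized numerically (e.g., by nonlinear least squares); the paper's contribution lies in the specific state representation, projection model, and occlusion-robust formulation, none of which this toy example reproduces.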