SurgiTrack: Fine-grained multi-class multi-tool tracking in surgical videos.

Medical Image Analysis · IF 10.7 · JCR Q1 (Computer Science, Artificial Intelligence) · CAS Zone 1, Medicine · Published: 2024-12-17 · DOI: 10.1016/j.media.2024.103438
Chinedu Innocent Nwoye, Nicolas Padoy
{"title":"SurgiTrack: Fine-grained multi-class multi-tool tracking in surgical videos.","authors":"Chinedu Innocent Nwoye, Nicolas Padoy","doi":"10.1016/j.media.2024.103438","DOIUrl":null,"url":null,"abstract":"<p><p>Accurate tool tracking is essential for the success of computer-assisted intervention. Previous efforts often modeled tool trajectories rigidly, overlooking the dynamic nature of surgical procedures, especially tracking scenarios like out-of-body and out-of-camera views. Addressing this limitation, the new CholecTrack20 dataset provides detailed labels that account for multiple tool trajectories in three perspectives: (1) intraoperative, (2) intracorporeal, and (3) visibility, representing the different types of temporal duration of tool tracks. These fine-grained labels enhance tracking flexibility but also increase the task complexity. Re-identifying tools after occlusion or re-insertion into the body remains challenging due to high visual similarity, especially among tools of the same category. This work recognizes the critical role of the tool operators in distinguishing tool track instances, especially those belonging to the same tool category. The operators' information are however not explicitly captured in surgical videos. We therefore propose SurgiTrack, a novel deep learning method that leverages YOLOv7 for precise tool detection and employs an attention mechanism to model the originating direction of the tools, as a proxy to their operators, for tool re-identification. To handle diverse tool trajectory perspectives, SurgiTrack employs a harmonizing bipartite matching graph, minimizing conflicts and ensuring accurate tool identity association. Experimental results on CholecTrack20 demonstrate SurgiTrack's effectiveness, outperforming baselines and state-of-the-art methods with real-time inference capability. This work sets a new standard in surgical tool tracking, providing dynamic trajectories for more adaptable and precise assistance in minimally invasive surgeries.</p>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"101 ","pages":"103438"},"PeriodicalIF":10.7000,"publicationDate":"2024-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Medical image analysis","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.1016/j.media.2024.103438","RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Accurate tool tracking is essential for the success of computer-assisted intervention. Previous efforts often modeled tool trajectories rigidly, overlooking the dynamic nature of surgical procedures, especially tracking scenarios like out-of-body and out-of-camera views. Addressing this limitation, the new CholecTrack20 dataset provides detailed labels that account for multiple tool trajectories in three perspectives: (1) intraoperative, (2) intracorporeal, and (3) visibility, representing the different types of temporal duration of tool tracks. These fine-grained labels enhance tracking flexibility but also increase the task complexity. Re-identifying tools after occlusion or re-insertion into the body remains challenging due to high visual similarity, especially among tools of the same category. This work recognizes the critical role of the tool operators in distinguishing tool track instances, especially those belonging to the same tool category. The operators' information, however, is not explicitly captured in surgical videos. We therefore propose SurgiTrack, a novel deep learning method that leverages YOLOv7 for precise tool detection and employs an attention mechanism to model the originating direction of the tools, as a proxy for their operators, for tool re-identification. To handle diverse tool trajectory perspectives, SurgiTrack employs a harmonizing bipartite matching graph, minimizing conflicts and ensuring accurate tool identity association. Experimental results on CholecTrack20 demonstrate SurgiTrack's effectiveness, outperforming baselines and state-of-the-art methods with real-time inference capability. This work sets a new standard in surgical tool tracking, providing dynamic trajectories for more adaptable and precise assistance in minimally invasive surgeries.
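For readers who want a concrete picture of the identity-association step the abstract describes, the sketch below is a minimal, hypothetical illustration of bipartite matching between per-frame tool detections and existing tracks, using a cost that mixes box overlap with the similarity of re-identification embeddings (standing in for the direction-based features mentioned above). The function names (associate, iou), the weight w_app, and the gating threshold max_cost are illustrative assumptions; this is not the authors' SurgiTrack implementation, which additionally harmonizes matches across the intraoperative, intracorporeal, and visibility perspectives.

    # Minimal sketch (not the authors' code): associate detections with tracks
    # via Hungarian bipartite matching, in the spirit of the identity-association
    # step described in the abstract. Embeddings, weights, and thresholds are
    # illustrative assumptions only.
    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def iou(box_a, box_b):
        """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
        x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
        x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
        inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        return inter / (area_a + area_b - inter + 1e-9)

    def associate(track_boxes, track_feats, det_boxes, det_feats,
                  w_app=0.5, max_cost=0.8):
        """Match detections to tracks; returns (matches, unmatched_detections)."""
        n_t, n_d = len(track_boxes), len(det_boxes)
        if n_t == 0 or n_d == 0:
            return [], list(range(n_d))
        cost = np.zeros((n_t, n_d))
        for i in range(n_t):
            for j in range(n_d):
                motion_cost = 1.0 - iou(track_boxes[i], det_boxes[j])
                # Cosine distance between (hypothetical) re-ID embeddings.
                sim = np.dot(track_feats[i], det_feats[j]) / (
                    np.linalg.norm(track_feats[i]) * np.linalg.norm(det_feats[j]) + 1e-9)
                app_cost = 1.0 - sim
                cost[i, j] = (1 - w_app) * motion_cost + w_app * app_cost
        rows, cols = linear_sum_assignment(cost)  # Hungarian matching
        matches = [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_cost]
        matched_dets = {c for _, c in matches}
        unmatched = [j for j in range(n_d) if j not in matched_dets]
        return matches, unmatched

Unmatched detections would typically start new tracks, and unmatched tracks would be kept alive or retired depending on which trajectory perspective (visibility, intracorporeal, or intraoperative) is being maintained.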

Source journal: Medical Image Analysis
Category: Engineering & Technology - Engineering, Biomedical
CiteScore: 22.10
Self-citation rate: 6.40%
Articles published: 309
Review time: 6.6 months
About the journal: Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.
Latest articles in this journal

Corrigendum to "Detection and analysis of cerebral aneurysms based on X-ray rotational angiography - the CADA 2020 challenge" [Medical Image Analysis, April 2022, Volume 77, 102333].
Editorial for Special Issue on Foundation Models for Medical Image Analysis.
Few-shot medical image segmentation with high-fidelity prototypes.
The Developing Human Connectome Project: A fast deep learning-based pipeline for neonatal cortical surface reconstruction.
DSAM: A deep learning framework for analyzing temporal and spatial dynamics in brain networks.