Neural fields for 3D tracking of anatomy and surgical instruments in monocular laparoscopic video clips

Healthcare Technology Letters · IF 2.8 · Q3 (Engineering, Biomedical) · Published: 2024-12-12 · DOI: 10.1049/htl2.12113
Beerend G. A. Gerats, Jelmer M. Wolterink, Seb P. Mol, Ivo A. M. J. Broeders
Healthcare Technology Letters, vol. 11, no. 6, pp. 411-417. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11665779/pdf/ · Publisher page: https://onlinelibrary.wiley.com/doi/10.1049/htl2.12113 · Citations: 0

Abstract

Laparoscopic video tracking primarily focuses on two target types: surgical instruments and anatomy. The former can be used for skill assessment, while the latter is necessary for the projection of virtual overlays. Whereas instrument and anatomy tracking have often been treated as two separate problems, this article proposes a method for tracking all structures jointly. From a single 2D monocular video clip, a neural field is trained to represent a continuous spatiotemporal scene, which is then used to create 3D tracks of all surfaces visible in at least one frame. Because instruments are small, they generally cover only a small part of the image, which reduces tracking accuracy. Enhanced class weighting is therefore proposed to improve the instrument tracks. The authors evaluate tracking on video clips from laparoscopic cholecystectomies, finding mean tracking accuracies of 92.4% for anatomical structures and 87.4% for instruments. Additionally, the quality of depth maps obtained from the method's scene reconstructions is assessed; these pseudo-depths are shown to be of comparable quality to a state-of-the-art pre-trained depth estimator. On laparoscopic videos in the SCARED dataset, the method predicts depth with an MAE of 2.9 mm and a relative error of 9.2%. These results demonstrate the feasibility of using neural fields for monocular 3D reconstruction of laparoscopic scenes. Code is available on GitHub: https://github.com/Beerend/Surgical-OmniMotion.
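The abstract describes the enhanced class weighting only at a high level. A hypothetical sketch of per-class pixel weighting in a reconstruction loss is given below; the weight values, function name, and loss form are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

# Hypothetical per-class weights: up-weight instrument pixels, which
# cover only a small fraction of each frame (weights are assumed values).
CLASS_WEIGHTS = {0: 1.0,   # background / anatomy
                 1: 5.0}   # surgical instrument

def weighted_photometric_loss(pred: np.ndarray, target: np.ndarray,
                              seg: np.ndarray) -> float:
    """Per-pixel L1 loss, re-weighted by semantic class so that the
    rarely-seen instrument pixels contribute more to the objective.

    pred, target: (H, W, 3) RGB images; seg: (H, W) integer class map.
    """
    weights = np.vectorize(CLASS_WEIGHTS.get)(seg).astype(float)
    per_pixel = np.abs(pred - target).mean(axis=-1)  # average over RGB
    return float((weights * per_pixel).sum() / weights.sum())

# Toy example: error only on the single instrument pixel.
seg = np.array([[0, 0], [0, 1]])
pred = np.zeros((2, 2, 3)); pred[1, 1] = 1.0
target = np.zeros((2, 2, 3))
loss = weighted_photometric_loss(pred, target, seg)
# weighted mean: (5 * 1) / (1 + 1 + 1 + 5) = 0.625
```

With uniform weights the same error would contribute only 1/4 of the mean; the higher instrument weight keeps small instruments from being dominated by the much larger anatomical regions.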

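The depth metrics quoted in the abstract (MAE in mm and relative error) are standard per-pixel measures; a minimal sketch of how they are typically computed (function and variable names are illustrative, not from the paper's code):

```python
import numpy as np

def depth_errors(pred_mm: np.ndarray, gt_mm: np.ndarray,
                 valid: np.ndarray) -> tuple[float, float]:
    """Mean absolute error (mm) and mean relative error between predicted
    and ground-truth depth maps, computed over valid pixels only.
    The paper reports 2.9 mm MAE and 9.2% relative error on SCARED."""
    diff = np.abs(pred_mm[valid] - gt_mm[valid])
    mae = diff.mean()
    rel = (diff / gt_mm[valid]).mean()
    return float(mae), float(rel)

# Toy example with synthetic depth maps.
gt = np.full((4, 4), 50.0)          # 50 mm everywhere
pred = gt + 5.0                     # constant 5 mm over-estimate
mask = np.ones_like(gt, dtype=bool)
mae, rel = depth_errors(pred, gt, mask)
# mae = 5.0, rel = 0.1
```

The validity mask matters in practice: stereo-derived ground truth such as SCARED's has pixels without a reliable depth value, which are excluded from both metrics.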

Source journal: Healthcare Technology Letters (Health Professions - Health Information Management)
CiteScore: 6.10
Self-citation rate: 4.80%
Annual article count: 12
Review turnaround: 22 weeks
Journal description: Healthcare Technology Letters aims to bring together an audience of biomedical and electrical engineers, physical and computer scientists, and mathematicians to enable the exchange of the latest ideas and advances through rapid online publication of original healthcare technology research. Major themes of the journal include (but are not limited to):

Major technological/methodological areas:
- Biomedical signal processing
- Biomedical imaging and image processing
- Bioinstrumentation (sensors, wearable technologies, etc.)
- Biomedical informatics

Major application areas:
- Cardiovascular and respiratory systems engineering
- Neural engineering, neuromuscular systems
- Rehabilitation engineering
- Bio-robotics, surgical planning and biomechanics
- Therapeutic and diagnostic systems, devices and technologies
- Clinical engineering
- Healthcare information systems, telemedicine, mHealth
Latest articles from this journal:
- Differential analysis of brain functional network parameters in MHE patients
- The Feasibility of Ambulatory Heart Rate Variability Monitoring in Non-Suicidal Self-Injury
- Signal-quality-aware multisensor fusion for atrial fibrillation detection
- Deep regression 2D-3D ultrasound registration for liver motion correction in focal tumour thermal ablation
- Writing the Signs: An Explainable Machine Learning Approach for Alzheimer's Disease Classification from Handwriting