用于机器人手术中转诊视频器械分割的视频器械协同网络

Hongqiu Wang, Guang Yang, Shichen Zhang, Jing Qin, Yike Guo, Bo Xu, Yueming Jin, Lei Zhu
{"title":"用于机器人手术中转诊视频器械分割的视频器械协同网络","authors":"Hongqiu Wang, Guang Yang, Shichen Zhang, Jing Qin, Yike Guo, Bo Xu, Yueming Jin, Lei Zhu","doi":"10.1109/TMI.2024.3426953","DOIUrl":null,"url":null,"abstract":"<p><p>Surgical instrument segmentation is fundamentally important for facilitating cognitive intelligence in robot-assisted surgery. Although existing methods have achieved accurate instrument segmentation results, they simultaneously generate segmentation masks of all instruments, which lack the capability to specify a target object and allow an interactive experience. This paper focuses on a novel and essential task in robotic surgery, i.e., Referring Surgical Video Instrument Segmentation (RSVIS), which aims to automatically identify and segment the target surgical instruments from each video frame, referred by a given language expression. This interactive feature offers enhanced user engagement and customized experiences, greatly benefiting the development of the next generation of surgical education systems. To achieve this, this paper constructs two surgery video datasets to promote the RSVIS research. Then, we devise a novel Video-Instrument Synergistic Network (VIS-Net) to learn both video-level and instrument-level knowledge to boost performance, while previous work only utilized video-level information. Meanwhile, we design a Graph-based Relation-aware Module (GRM) to model the correlation between multi-modal information (i.e., textual description and video frame) to facilitate the extraction of instrument-level information. Extensive experimental results on two RSVIS datasets exhibit that the VIS-Net can significantly outperform existing state-of-the-art referring segmentation methods. We will release our code and dataset for future research (Git).</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Video-Instrument Synergistic Network for Referring Video Instrument Segmentation in Robotic Surgery.\",\"authors\":\"Hongqiu Wang, Guang Yang, Shichen Zhang, Jing Qin, Yike Guo, Bo Xu, Yueming Jin, Lei Zhu\",\"doi\":\"10.1109/TMI.2024.3426953\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Surgical instrument segmentation is fundamentally important for facilitating cognitive intelligence in robot-assisted surgery. Although existing methods have achieved accurate instrument segmentation results, they simultaneously generate segmentation masks of all instruments, which lack the capability to specify a target object and allow an interactive experience. This paper focuses on a novel and essential task in robotic surgery, i.e., Referring Surgical Video Instrument Segmentation (RSVIS), which aims to automatically identify and segment the target surgical instruments from each video frame, referred by a given language expression. This interactive feature offers enhanced user engagement and customized experiences, greatly benefiting the development of the next generation of surgical education systems. To achieve this, this paper constructs two surgery video datasets to promote the RSVIS research. Then, we devise a novel Video-Instrument Synergistic Network (VIS-Net) to learn both video-level and instrument-level knowledge to boost performance, while previous work only utilized video-level information. Meanwhile, we design a Graph-based Relation-aware Module (GRM) to model the correlation between multi-modal information (i.e., textual description and video frame) to facilitate the extraction of instrument-level information. Extensive experimental results on two RSVIS datasets exhibit that the VIS-Net can significantly outperform existing state-of-the-art referring segmentation methods. We will release our code and dataset for future research (Git).</p>\",\"PeriodicalId\":94033,\"journal\":{\"name\":\"IEEE transactions on medical imaging\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-07-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on medical imaging\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/TMI.2024.3426953\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on medical imaging","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TMI.2024.3426953","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

手术器械分割对于促进机器人辅助手术中的认知智能至关重要。虽然现有的方法已经获得了精确的器械分割结果,但它们同时生成了所有器械的分割掩模,缺乏指定目标对象和实现交互体验的能力。本文的重点是机器人手术中一项新颖而重要的任务,即参考手术视频器械分割(RSVIS),其目的是从每个视频帧中自动识别和分割目标手术器械,并通过给定的语言表达进行参考。这种交互式功能提高了用户参与度和定制化体验,对下一代外科手术教育系统的开发大有裨益。为此,本文构建了两个手术视频数据集,以促进 RSVIS 的研究。然后,我们设计了一个新颖的视频-器械协同网络(VIS-Net)来学习视频级和器械级知识,以提高性能,而之前的工作只利用了视频级信息。同时,我们设计了一个基于图的关系感知模块(GRM)来模拟多模态信息(即文本描述和视频帧)之间的相关性,从而促进乐器级信息的提取。在两个 RSVIS 数据集上的大量实验结果表明,VIS-Net 的性能明显优于现有的最先进的指代分割方法。我们将发布我们的代码和数据集,供未来研究使用(Git)。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Video-Instrument Synergistic Network for Referring Video Instrument Segmentation in Robotic Surgery.

Surgical instrument segmentation is fundamentally important for facilitating cognitive intelligence in robot-assisted surgery. Although existing methods have achieved accurate instrument segmentation results, they simultaneously generate segmentation masks of all instruments, which lack the capability to specify a target object and allow an interactive experience. This paper focuses on a novel and essential task in robotic surgery, i.e., Referring Surgical Video Instrument Segmentation (RSVIS), which aims to automatically identify and segment the target surgical instruments from each video frame, referred by a given language expression. This interactive feature offers enhanced user engagement and customized experiences, greatly benefiting the development of the next generation of surgical education systems. To achieve this, this paper constructs two surgery video datasets to promote the RSVIS research. Then, we devise a novel Video-Instrument Synergistic Network (VIS-Net) to learn both video-level and instrument-level knowledge to boost performance, while previous work only utilized video-level information. Meanwhile, we design a Graph-based Relation-aware Module (GRM) to model the correlation between multi-modal information (i.e., textual description and video frame) to facilitate the extraction of instrument-level information. Extensive experimental results on two RSVIS datasets exhibit that the VIS-Net can significantly outperform existing state-of-the-art referring segmentation methods. We will release our code and dataset for future research (Git).

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Cohort-Individual Cooperative Learning for Multimodal Cancer Survival Analysis. Self-navigated 3D diffusion MRI using an optimized CAIPI sampling and structured low-rank reconstruction estimated navigator. Low-dose CT image super-resolution with noise suppression based on prior degradation estimator and self-guidance mechanism. Table of Contents LOQUAT: Low-Rank Quaternion Reconstruction for Photon-Counting CT.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1