
Latest publications: International Journal of Computer Assisted Radiology and Surgery

Benchmarking variability in semantic segmentation in minimally invasive abdominal surgery.
IF 2.3 · CAS Tier 3 (Medicine) · Q3 (Engineering, Biomedical) · Pub Date: 2026-01-06 · DOI: 10.1007/s11548-025-03562-3
L T Castro, C Barata, P Martins, F Afonso, M Pascoal, C Santiago, L Mennillo, P Mira, D Stoyanov, M Chand, S Bano, A S Soares

Purpose: Anatomical identification during abdominal surgery is subjective, given the unclear boundaries of anatomical structures. Semantic segmentation of these structures relies on accurate identification of those boundaries, which carries unknown uncertainty. Given this inherent subjectivity, it is important to assess annotation adequacy. This study aims to evaluate variability among surgical residents in identifying and segmenting anatomical structures using MedSAM.

Methods: Images from the Dresden Surgical Anatomy Dataset and the Endoscapes2023 Dataset were semantically annotated by a group of surgical residents using MedSAM, covering the following classes: abdominal wall, colon, liver, small bowel, spleen, stomach, and gallbladder. Each class had 3 to 4 sets of annotations. Inter-annotator variability was assessed with the Dice similarity coefficient (DSC), intraclass correlation coefficient (ICC), and boundary IoU (BIoU); the Simultaneous Truth and Performance Level Estimation (STAPLE) algorithm was used to obtain a consensus mask, and Fleiss' kappa was calculated between all annotations and the reference.
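The agreement metrics used here are standard. As a minimal illustrative sketch (assuming binary per-pixel masks and two rating categories; this is not the study's evaluation code), DSC and a pixel-wise Fleiss' kappa can be computed as follows:

```python
import numpy as np

def dice(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    total = mask_a.sum() + mask_b.sum()
    return 2.0 * inter / total if total > 0 else 1.0

def fleiss_kappa(masks: list) -> float:
    """Fleiss' kappa over pixels, treating each annotator's binary
    label for each pixel as one rating (2 categories)."""
    n = len(masks)                                            # raters per pixel
    votes = np.stack([m.astype(int).ravel() for m in masks])  # (n, pixels)
    pos = votes.sum(axis=0)                                   # positive votes per pixel
    neg = n - pos
    # per-pixel observed agreement P_i
    p_i = (pos * (pos - 1) + neg * (neg - 1)) / (n * (n - 1))
    p_bar = p_i.mean()
    # chance agreement from marginal category proportions
    p_pos = pos.sum() / (n * votes.shape[1])
    p_e = p_pos ** 2 + (1 - p_pos) ** 2
    return float((p_bar - p_e) / (1 - p_e))
```

Identical masks give DSC = 1.0 and kappa = 1.0; the kappa denominator assumes the pooled annotations contain both classes.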

Results: The study showed strong inter-annotator agreement among surgical residents, with DSC values of 0.84-0.95 and Fleiss' kappa between 0.85 and 0.91. Surface area reliability was good to excellent (ICC = 0.62-0.91), while boundary delineation showed lower reproducibility (BIoU = 0.092-0.157). STAPLE consensus masks confirmed consistent overall shape annotations despite variability in boundary precision.

Conclusion: The study demonstrated low variability in the semantic segmentation of intraperitoneal organs in minimally invasive abdominal surgery, performed by surgical residents using MedSAM. While DSC and Fleiss' kappa values confirm strong inter-annotator agreement, the relatively low BIoU values point to challenges in boundary precision, especially for anatomically complex or variable structures. These results establish a benchmark for expanding annotation efforts to larger datasets and more detailed anatomical features.

Citations: 0
Statistical shape model-based estimation of registration error in computer-assisted total knee arthroplasty.
IF 2.3 · CAS Tier 3 (Medicine) · Q3 (Engineering, Biomedical) · Pub Date: 2026-01-06 · DOI: 10.1007/s11548-025-03566-z
Behnaz Gheflati, Morteza Mirzaei, Joel Zuhars, Sunil Rottoo, Hassan Rivaz

Purpose: Computer-assisted surgical navigation systems have been developed to improve the precision of total knee arthroplasty (TKA) by providing real-time guidance on implant alignment relative to patient anatomy. However, surface registration remains a key source of error that can propagate through the surgical workflow. This study investigates how patient-specific femoral bone geometry influences registration accuracy, aiming to enhance the reliability and consistency of computer-assisted orthopedic procedures.

Methods: Eighteen high-fidelity 3D-printed femur models were used to simulate intraoperative digitization. Surface points collected from the distal femur were registered to preoperative CT-derived models using a rigid iterative closest point (ICP) algorithm. Registration accuracy was quantified across six degrees of freedom. An in-house statistical shape model (SSM), built from 114 CT femurs, was employed to extract shape coefficients and correlate them with the measured registration errors. To verify robustness, additional analyses were conducted using synthetic and in silico CT-based femur datasets.
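The registration step can be illustrated with the closed-form rigid alignment that ICP iterates internally (the Kabsch/SVD solution for paired points), plus a Euler-angle decomposition of the recovered rotation for per-axis error reporting. The z-y-x convention and the mapping of axes to anatomical angles below are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def kabsch(src: np.ndarray, dst: np.ndarray):
    """Best-fit rigid transform (R, t) mapping paired Nx3 points
    src -> dst: the inner alignment step of ICP."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cd - R @ cs
    return R, t

def euler_zyx_deg(R: np.ndarray) -> np.ndarray:
    """Split a rotation into three per-axis angles (z-y-x convention);
    which axis maps to flexion-extension vs. varus-valgus depends on
    the anatomical coordinate frame and is assumed here."""
    ry = -np.arcsin(np.clip(R[2, 0], -1.0, 1.0))
    rx = np.arctan2(R[2, 1], R[2, 2])
    rz = np.arctan2(R[1, 0], R[0, 0])
    return np.degrees([rx, ry, rz])
```

A full ICP loop would alternate nearest-neighbor correspondence search with this closed-form step until convergence.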

Results: Significant correlations (p-values < 0.05) were observed between specific shape coefficients and registration errors. The third and fourth principal shape modes showed the strongest associations with rotational misalignments, particularly flexion-extension and varus-valgus components. These findings demonstrate that geometric variability in the distal femur, especially condylar morphology, plays a major role in determining the stability and accuracy of surface-based registration.

Conclusions: Registration errors in TKA are strongly influenced by patient-specific bone geometry. Shape features derived from statistical shape models can serve as reliable predictors of registration performance, providing quantitative insight into how anatomical variability impacts surgical precision and alignment accuracy in computer-assisted total knee arthroplasty.

Citations: 0
Influence of high-performance image-to-image translation networks on clinical visual assessment and outcome prediction: utilizing ultrasound to MRI translation in prostate cancer.
IF 2.3 · CAS Tier 3 (Medicine) · Q3 (Engineering, Biomedical) · Pub Date: 2026-01-01 · Epub Date: 2025-07-19 · DOI: 10.1007/s11548-025-03481-3
Mohammad R Salmanpour, Amin Mousavi, Yixi Xu, William B Weeks, Ilker Hacihaliloglu

Purpose: Image-to-image (I2I) translation networks have emerged as promising tools for generating synthetic medical images; however, their clinical reliability and ability to preserve diagnostically relevant features remain underexplored. This study evaluates the performance of state-of-the-art 2D/3D I2I networks for converting ultrasound (US) images to synthetic MRI in prostate cancer (PCa) imaging. The novelty lies in combining radiomics, expert clinical evaluation, and classification performance to comprehensively benchmark these models for potential integration into real-world diagnostic workflows.

Methods: A dataset of 794 PCa patients was analyzed using ten leading I2I networks to synthesize MRI from US input. Radiomics feature (RF) analysis was performed using Spearman correlation to assess whether high-performing networks (SSIM > 0.85) preserved quantitative imaging biomarkers. A qualitative evaluation by seven experienced physicians assessed the anatomical realism, presence of artifacts, and diagnostic interpretability of synthetic images. Additionally, classification tasks using synthetic images were conducted using two machine learning and one deep learning model to assess the practical diagnostic benefit.
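A minimal sketch of the radiomics-preservation check, assuming features are arranged as a patients × features matrix and using an illustrative correlation cutoff (the paper's exact criterion and significance test are not reproduced here):

```python
import numpy as np

def spearman_rho(x: np.ndarray, y: np.ndarray) -> float:
    """Spearman rank correlation (no tie correction) in plain NumPy:
    Pearson correlation of the rank vectors."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean()
    ry -= ry.mean()
    return float((rx @ ry) / np.sqrt((rx @ rx) * (ry @ ry)))

def preserved_features(real: np.ndarray, synth: np.ndarray,
                       rho_thresh: float = 0.8) -> list:
    """Indices of features (columns) whose values on synthetic images
    track the real-image values across patients. The 0.8 cutoff is
    illustrative only."""
    return [j for j in range(real.shape[1])
            if spearman_rho(real[:, j], synth[:, j]) >= rho_thresh]
```

Because Spearman correlation depends only on ranks, any strictly monotone distortion of a feature (e.g. intensity rescaling) still counts as preserved.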

Results: Among all networks, 2D-Pix2Pix achieved the highest SSIM (0.855 ± 0.032). RF analysis showed that 76 out of 186 features were preserved post-translation, while the remainder were degraded or lost. Qualitative feedback revealed consistent issues with low-level feature preservation and artifact generation, particularly in lesion-rich regions. These evaluations were conducted to assess whether synthetic MRI retained clinically relevant patterns, supported expert interpretation, and improved diagnostic accuracy. Importantly, classification performance using synthetic MRI significantly exceeded that of US-based input, achieving average accuracy and AUC of ~ 0.93 ± 0.05.

Conclusion: Although 2D-Pix2Pix showed the best overall performance in similarity and partial RF preservation, improvements are still required in lesion-level fidelity and artifact suppression. The combination of radiomics, qualitative, and classification analyses offered a holistic view of the current strengths and limitations of I2I models, supporting their potential in clinical applications pending further refinement and validation.

Citations: 0
Temporal consistency-aware network for renal artery segmentation in X-ray angiography.
IF 2.3 · CAS Tier 3 (Medicine) · Q3 (Engineering, Biomedical) · Pub Date: 2026-01-01 · Epub Date: 2025-08-02 · DOI: 10.1007/s11548-025-03486-y
Botao Yang, Chunming Li, Simone Fezzi, Zehao Fan, Runguo Wei, Yankai Chen, Domenico Tavella, Flavio L Ribichini, Su Zhang, Faisal Sharif, Shengxian Tu

Purpose: Accurate segmentation of renal arteries from X-ray angiography videos is crucial for evaluating renal sympathetic denervation (RDN) procedures but remains challenging due to dynamic changes in contrast concentration and vessel morphology across frames. The purpose of this study is to propose TCA-Net, a deep learning model that improves segmentation consistency by leveraging local and global contextual information in angiography videos.

Methods: Our approach utilizes a novel deep learning framework that incorporates two key modules: a local temporal window vessel enhancement module and a global vessel refinement module (GVR). The local module fuses multi-scale temporal-spatial features to improve the semantic representation of vessels in the current frame, while the GVR module integrates decoupled attention strategies (video-level and object-level attention) and gating mechanisms to refine global vessel information and eliminate redundancy. To further improve segmentation consistency, a temporal perception consistency loss function is introduced during training.
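The paper's loss formulation is not reproduced here, but one common form of temporal-consistency penalty — discouraging frame-to-frame change in the predicted vessel probabilities — can be sketched as follows (illustrative NumPy only, not TCA-Net's actual loss):

```python
import numpy as np

def temporal_consistency_loss(probs: np.ndarray) -> float:
    """Mean squared change of per-pixel predicted probability between
    consecutive frames of a (T, H, W) sequence. Zero when predictions
    are stable over time; large when they flicker."""
    diffs = probs[1:] - probs[:-1]          # (T-1, H, W)
    return float(np.mean(diffs ** 2))
```

In training, such a term would typically be added to a per-frame segmentation loss with a weighting coefficient.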

Results: We evaluated our model using 195 renal artery angiography sequences for development and tested it on an external dataset from 44 patients. The results demonstrate that TCA-Net achieves an F1-score of 0.8678 for segmenting renal arteries, outperforming existing state-of-the-art segmentation methods.

Conclusion: We present TCA-Net, a deep learning-based model that significantly improves segmentation consistency for renal artery angiography videos. By effectively leveraging both local and global temporal contextual information, TCA-Net outperforms current methods and provides a reliable tool for assessing RDN procedures.

Citations: 0
Watch and learn: leveraging expert knowledge and language for surgical video understanding.
IF 2.3 · CAS Tier 3 (Medicine) · Q3 (Engineering, Biomedical) · Pub Date: 2026-01-01 · Epub Date: 2025-07-02 · DOI: 10.1007/s11548-025-03472-4
David Gastager, Ghazal Ghazaei, Constantin Patsch

Purpose: Automated surgical workflow analysis is a common yet challenging task with diverse applications in surgical education, research, and clinical decision-making. Although videos are commonly collected during surgical interventions, the lack of annotated datasets hinders the development of accurate and comprehensive workflow analysis solutions. We introduce a novel approach for addressing the sparsity and heterogeneity of annotated training data, inspired by how humans learn by watching experts and understanding their explanations.

Methods: Our method leverages a video-language model trained on alignment, denoising, and generative tasks to learn short-term spatio-temporal and multimodal representations. A task-specific temporal model is then used to capture relationships across entire videos. To achieve comprehensive video-language understanding in the surgical domain, we introduce a data collection and filtering strategy to construct a large-scale pretraining dataset from educational YouTube videos. We then utilize parameter-efficient fine-tuning by projecting downstream task annotations from publicly available surgical datasets into the language domain.
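A minimal sketch of projecting categorical annotations into the language domain, using hypothetical phase names and a caption template (the paper's actual label set and prompts are not given here):

```python
# Hypothetical phase vocabulary; real datasets define their own phases.
PHASE_TEXT = {
    0: "preparation",
    1: "dissection",
    2: "clipping and cutting",
}

def phase_to_caption(phase_id: int) -> str:
    """Turn a categorical phase label into a text target that a
    video-language model can be fine-tuned against."""
    return f"The surgeon is performing {PHASE_TEXT[phase_id]}."
```

Pairing each video clip with such a caption lets the same text head serve supervised, few-shot, and zero-shot phase recognition.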

Results: Extensive experiments in two surgical domains demonstrate the effectiveness of our approach, with performance improvements of up to 7% in phase segmentation tasks, 5% in zero-shot phase segmentation, and comparable capabilities to fully supervised models in few-shot settings. Harnessing our model's capabilities for long-range temporal localization and text generation, we present the first comprehensive solution for dense video captioning (DVC) of surgical videos, addressing this task despite the absence of existing DVC datasets in the surgical domain.

Conclusion: We introduce a novel approach to surgical workflow understanding that leverages video-language pretraining, large-scale video pretraining, and optimized fine-tuning. Our method improves performance over state-of-the-art techniques and enables new downstream tasks for surgical video understanding.

Citations: 0
Real-time corneal image segmentation for cataract surgery based on detection framework.
IF 2.3 · CAS Tier 3 (Medicine) · Q3 (Engineering, Biomedical) · Pub Date: 2026-01-01 · Epub Date: 2025-09-05 · DOI: 10.1007/s11548-025-03506-x
Xueyi Shi, Dexun Zhang, Shenwen Liang, Wenjing Meng, Huoling Luo, Tianqiao Zhang

Objective: Cataract surgery is among the most frequently performed procedures worldwide. Accurate, real-time segmentation of the cornea and surgical instruments is vital for intraoperative guidance and surgical education. However, most existing deep learning-based segmentation methods depend on pixel-level annotations, which are time-consuming and limit practical deployment.

Methods: We present EllipseNet, an anchor-free framework utilizing ellipse-based modeling for real-time corneal segmentation in cataract surgery. Built upon the Hourglass network for feature extraction, EllipseNet requires only simple rectangular bounding box annotations from users. It then autonomously infers the major and minor axes of the corneal ellipse, generating elliptical bounding boxes that more precisely match corneal shapes.
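As a geometric sketch of the elliptical-box idea: rasterizing the ellipse inscribed in an axis-aligned rectangle. EllipseNet itself regresses the major/minor axes from image features; taking the rectangle's half-widths as the semi-axes is an assumption for illustration:

```python
import numpy as np

def ellipse_mask(h: int, w: int, box: tuple) -> np.ndarray:
    """Binary (h, w) mask of the ellipse inscribed in an axis-aligned
    box (x0, y0, x1, y1): pixels where the normalized quadratic form
    is at most 1."""
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0     # ellipse center
    a, b = (x1 - x0) / 2.0, (y1 - y0) / 2.0       # semi-axes
    ys, xs = np.mgrid[0:h, 0:w]
    return ((xs - cx) / a) ** 2 + ((ys - cy) / b) ** 2 <= 1.0
```

Compared with the rectangle itself, the inscribed ellipse removes the corner regions, which is why an elliptical box matches the roughly elliptical corneal contour more tightly.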

Results: EllipseNet achieves efficient real-time performance by segmenting each image within 42 ms and attaining a Dice accuracy of 95.81%. It delivers segmentation speed nearly three times faster than state-of-the-art models, while maintaining similar accuracy levels.

Conclusion: EllipseNet provides rapid and accurate corneal segmentation in real time, significantly reducing annotation workload for practitioners. Its design streamlines the segmentation pipeline, lowering the barrier for clinical application. The source code is publicly available at: https://github.com/shixueyi/corneal-segmentation .

Citations: 0
The interpretable surgical temporal informer: explainable surgical time completion prediction.
IF 2.3 Tier 3 (Medicine) Q3 ENGINEERING, BIOMEDICAL Pub Date : 2026-01-01 Epub Date: 2025-08-23 DOI: 10.1007/s11548-025-03448-4
Roger D Soberanis-Mukul, Rohit Shankar, Lalithkumar Seenivasan, Jose L Porras, Masaru Ishii, Mathias Unberath

Purpose: Predicting surgical time completion helps streamline surgical workflow and OR utilization, enhancing hospital efficiency. When time prediction is based on interventional video of the surgical site, predictions may correlate with the surgeon's technical proficiency, because skill is a useful proxy for completion time. To understand which features in surgical site video are predictive of surgical time, we develop prototype-like visual explanations applicable to video sequences.

Methods: We introduce an interpretable method for predicting surgical duration by identifying prototype-like patterns within egocentric video of the surgical site. Unlike conventional image-based prototype models that generate patch-based prototypes, our method extracts video-based explanations tied to segments of surgical videos with similar time deviation patterns. We achieve this by comparing the principal components of feature representation differences at various time points in the predictions. To effectively capture long-range dependencies in the prediction task, we employ an informer as the primary predictive model.

Results: This model is applied to a dataset of 42 point-of-view craniotomy videos, collected under an approved IRB protocol. On average, our interpretable model performs better than the baseline models in surgical time completion.

Conclusion: Our approach not only contributes to the interpretability of surgical time predictions but also takes full advantage of the detailed information provided by surgical video data.
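The method compares principal components of feature-representation differences across time points. That idea can be sketched as follows; everything here is a toy stand-in (random embeddings, assumed shapes), not the authors' model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-frame embeddings from a video encoder: T frames x D features.
T, D = 120, 16
feats = np.cumsum(rng.normal(size=(T, D)) * 0.1, axis=0)

# Frame-to-frame feature differences capture how the representation drifts over time.
diffs = np.diff(feats, axis=0)

# Principal components of the centered differences, via SVD.
centered = diffs - diffs.mean(axis=0, keepdims=True)
_, s, vt = np.linalg.svd(centered, full_matrices=False)

# Project each difference onto the top-2 components; video segments whose
# projections cluster together play the role of prototype-like temporal patterns.
proj = centered @ vt[:2].T
print(proj.shape)  # → (119, 2)
```

Segments with similar time-deviation behavior land near each other in this projected space, which is what makes them usable as video-level explanations.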

Citations: 0
Liver mask-guided SAM-enhanced dual-decoder network for landmark segmentation in AR-guided surgery.
IF 2.3 Tier 3 (Medicine) Q3 ENGINEERING, BIOMEDICAL Pub Date : 2026-01-01 Epub Date: 2025-09-23 DOI: 10.1007/s11548-025-03516-9
Xukun Zhang, Sharib Ali, Yanlan Kang, Jingyi Zhu, Minghao Han, Le Wang, Xiaoying Wang, Lihua Zhang

Purpose: In augmented reality (AR)-guided laparoscopic liver surgery, accurate segmentation of liver landmarks is crucial for precise 3D-2D registration. However, existing methods struggle with complex structures, limited data, and class imbalance. In this study, we propose a novel approach to improve landmark segmentation performance by leveraging liver mask prediction.

Methods: We propose a dual-decoder model enhanced by a pre-trained segment anything model (SAM) encoder, where one decoder segments the liver and the other focuses on liver landmarks. The SAM encoder provides robust features for liver mask prediction, improving generalizability. A liver-guided consistency constraint establishes fine-grained spatial consistency between liver regions and landmarks, enhancing segmentation accuracy through detailed spatial modeling.

Results: The proposed method achieved state-of-the-art performance in liver landmark segmentation on two public laparoscopic datasets. By addressing feature entanglement, the dual-decoder framework with SAM and consistency constraints significantly improved segmentation in complex surgical scenarios.

Conclusion: The SAM-enhanced dual-decoder network, incorporating liver-guided consistency constraints, offers a promising solution for 2D landmark segmentation in AR-guided laparoscopic surgery. By mutually reinforcing liver mask and landmark segmentation, the method achieves improved accuracy and robustness for intraoperative applications.
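One plausible form of a liver-guided consistency constraint is a penalty on landmark probability mass falling outside the predicted liver region. The function below is a hedged sketch of that idea, not the paper's actual loss:

```python
import numpy as np

def liver_guided_consistency(landmark_prob, liver_mask):
    """Fraction of landmark probability mass outside the predicted liver region.

    landmark_prob : (H, W) array of landmark probabilities in [0, 1]
    liver_mask    : (H, W) boolean liver segmentation
    Returns 0.0 when all landmark mass lies inside the liver, ~1.0 when none does.
    """
    outside = landmark_prob * (~liver_mask)
    return outside.sum() / (landmark_prob.sum() + 1e-8)

H = W = 64
liver = np.zeros((H, W), dtype=bool)
liver[16:48, 16:48] = True

good = np.zeros((H, W)); good[20:30, 20:30] = 0.9  # landmark mass inside the liver
bad = np.zeros((H, W)); bad[0:10, 0:10] = 0.9      # landmark mass outside the liver

print(liver_guided_consistency(good, liver))  # → 0.0 (fully consistent)
print(liver_guided_consistency(bad, liver))   # ≈ 1.0 (fully inconsistent)
```

Adding such a term to the landmark decoder's loss couples the two decoders: the landmark head is rewarded for staying spatially inside the liver head's prediction.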

Citations: 0
MR-safe robotic needle driver for real-time MRI-guided minimally invasive procedures: a feasibility study.
IF 2.3 Tier 3 (Medicine) Q3 ENGINEERING, BIOMEDICAL Pub Date : 2026-01-01 Epub Date: 2025-11-08 DOI: 10.1007/s11548-025-03545-4
Atharva Paralikar, Gang Li, Chima Oluigbo, Pavel Yarmolenko, Kevin Cleary, Reza Monfaredi

Purpose: This article reports on the development and feasibility testing of an MR-safe robotic needle driver. The needle driver is pneumatically actuated and designed for automatic insertion and extraction of needles along a straight trajectory within the MRI scanner.

Method: All parts use plastic resins and composite materials to ensure MR-safe operation. A needle can be clamped in the needle carriage using a pneumatically operated clamp, which is designed to be easily attached to and detached from the needle driver. Clamps with different opening sizes accommodate needles from 18 to 22 gauge. To mimic the manual procedure of needle insertion, a pneumatically operated rack-and-pinion mechanism simultaneously translates and rotates the needle carriage along a helical slot. Signal-to-noise ratio (SNR) and 2-D geometric distortion were measured to evaluate MRI compatibility. Targeting accuracy was measured with an electromagnetic tracker. We also evaluated the maximum force generated at the needle tip under different clamping pressures using a force sensor.

Results: We recorded the maximum percentage change in SNR across needle driver configurations as 6.6% and the maximum geometric distortion as 0.24%. The needle driver's mean positioning accuracy for 105 targets at 50 mm depth was 2.38 ± 1.00 mm in a composite tissue phantom. The angulation error for the straight trajectory was 0.51°, and the mean linear trajectory deviation was statistically negligible. The measured forces at the needle tip were 1.17 N, 1.6 N, and 2.12 N at 30, 40, and 50 psi, respectively.

Conclusion: This preliminary study showed that the prototype of our robotic needle driver works as intended for the insertion and extraction of the needle. The driver is MR-safe and serves as a suitable platform for MRI-guided interventions.
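The SNR-change and targeting metrics reported above are simple to compute. The sketch below uses made-up measurements (only the 6.6% SNR figure is reproduced from the abstract; the phantom coordinates are synthetic) to show the arithmetic:

```python
import numpy as np

def snr_change_percent(snr_baseline, snr_with_robot):
    """Percentage change in SNR relative to the baseline scan (no robot in bore)."""
    return 100.0 * (snr_baseline - snr_with_robot) / snr_baseline

def targeting_error(targets, tips):
    """Euclidean tip-to-target distances and their mean and std (mm)."""
    d = np.linalg.norm(targets - tips, axis=1)
    return d.mean(), d.std()

# Illustrative numbers only, not the study's raw measurements.
base, robot = 100.0, 93.4
print(f"SNR change: {snr_change_percent(base, robot):.1f}%")  # → 6.6%

rng = np.random.default_rng(1)
targets = rng.uniform(0, 50, size=(105, 3))          # 105 phantom targets (mm)
tips = targets + rng.normal(0, 1.5, size=(105, 3))   # simulated needle-tip positions
mean_err, std_err = targeting_error(targets, tips)
print(f"targeting error: {mean_err:.2f} ± {std_err:.2f} mm")
```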

Citations: 0
Interactive AI annotation of medical images in a virtual reality environment.
IF 2.3 Tier 3 (Medicine) Q3 ENGINEERING, BIOMEDICAL Pub Date : 2026-01-01 Epub Date: 2025-08-18 DOI: 10.1007/s11548-025-03497-9
Lotta Orsmaa, Mikko Saukkoriipi, Jari Kangas, Nastaran Rasouli, Jorma Järnstedt, Helena Mehtonen, Jaakko Sahlsten, Joel Jaskari, Kimmo Kaski, Roope Raisamo

Purpose: Artificial intelligence (AI) achieves high-quality annotations of radiological images, yet often lacks the robustness required in clinical practice. Interactive annotation starts with an AI-generated delineation, allowing radiologists to refine it with feedback, potentially improving precision and reliability. These techniques have been explored in two-dimensional desktop environments, but are not validated by radiologists or integrated with immersive visualization technologies. We used a Virtual Reality (VR) system to determine (1) whether annotation quality improves when radiologists can edit the AI annotation and (2) whether the extra editing work is worthwhile.

Methods: We evaluated the clinical feasibility of an interactive VR approach to annotate mandibular and mental foramina on segmented 3D mandibular models. Three experienced dentomaxillofacial radiologists reviewed AI-generated annotations and, when needed, refined them at the voxel level in 3D space through click-based interactions until clinical standards were met.

Results: Our results indicate that integrating expert feedback within an immersive VR environment enhances annotation accuracy, improves clinical usability, and offers valuable insights for developing medical image analysis systems incorporating radiologist input.

Conclusion: This study is the first to compare the quality of original and interactive AI annotation and to use radiologists' opinions as the measure. More research is needed for generalization.
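The voxel-level, click-based refinement described in the methods can be approximated by toggling a small spherical neighborhood around each click. This is a minimal sketch under assumed interaction semantics (spherical brush, binary mask), not the study's VR tool:

```python
import numpy as np

def refine_click(mask, click_zyx, radius, add=True):
    """Apply one click-based edit: set voxels within `radius` of the click
    to foreground (add=True) or background (add=False)."""
    zz, yy, xx = np.ogrid[:mask.shape[0], :mask.shape[1], :mask.shape[2]]
    cz, cy, cx = click_zyx
    sphere = (zz - cz) ** 2 + (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
    out = mask.copy()
    out[sphere] = add
    return out

# Start from an (empty) AI mask, then add a missed region and carve an over-segmentation.
mask = np.zeros((32, 32, 32), dtype=bool)
mask = refine_click(mask, (16, 16, 16), 4)             # add a region the AI missed
mask = refine_click(mask, (16, 16, 19), 2, add=False)  # remove an over-segmented spot
print(mask.sum())
```

In practice each edit would be rendered immediately in the headset, and the loop repeats until the annotation meets clinical standards.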

Citations: 0