
Latest publications in the International Journal of Computer Assisted Radiology and Surgery

Large language models with retrieval-augmented generation enhance expert modelling of Bayesian network for clinical decision support.
IF 2.3 · CAS Tier 3, Medicine · Q3 ENGINEERING, BIOMEDICAL · Pub Date: 2025-11-03 · DOI: 10.1007/s11548-025-03524-9
Mario A Cypko, Muhammad Agus Salim, Aditya Kumar, Leonard Berliner, Andreas Dietz, Matthaeus Stoehr, Oliver Amft

Purpose: Bayesian networks (BNs) are valuable for clinical decision support due to their transparency and interpretability. However, BN modelling requires considerable manual effort. This study explores how integrating large language models (LLMs) with retrieval-augmented generation (RAG) can improve BN modelling by increasing efficiency, reducing cognitive workload, and ensuring accuracy.

Methods: We developed a web-based BN modelling service that integrates an LLM-RAG pipeline. A fine-tuned GTE-Large embedding model was employed for knowledge retrieval, optimised through recursive chunking and query expansion. To ensure accurate BN suggestions, we defined a causal structure for medical idioms by unifying existing BN frameworks. GPT-4 and Mixtral 8x7B were used to handle complex data interpretation and to generate modelling suggestions, respectively. A user study with four clinicians assessed usability and retrieval accuracy, and measured cognitive workload with NASA-TLX. The study demonstrated the system's potential for efficient and clinically relevant BN modelling.
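The recursive chunking step described above can be sketched as follows; the separator hierarchy, chunk size, and function name `recursive_chunk` are illustrative assumptions, not the authors' implementation.

```python
def recursive_chunk(text, separators=("\n\n", "\n", ". ", " "), max_len=512):
    """Split text at the coarsest separator first; recurse on oversized pieces."""
    if len(text) <= max_len or not separators:
        # Base case: short enough, or no separators left to split on.
        return [text]
    sep, rest = separators[0], separators[1:]
    chunks = []
    for piece in text.split(sep):
        if len(piece) <= max_len:
            chunks.append(piece)
        else:
            chunks.extend(recursive_chunk(piece, rest, max_len))
    return [c for c in chunks if c.strip()]
```

Each resulting chunk would then be embedded (e.g. with the fine-tuned GTE-Large model) and indexed for retrieval.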

Results: The RAG pipeline improved retrieval accuracy and answer relevance. Recursive chunking with the fine-tuned GTE-Large embedding model achieved the highest retrieval accuracy score (0.9). Query expansion and HyDE optimisation improved retrieval accuracy for semantic chunking from 0.75 to 0.85. Responses maintained high faithfulness (≥ 0.9). However, the LLM occasionally failed to adhere to predefined causal structures and medical idioms. All clinicians, regardless of BN experience, created comprehensive models within one hour. Experienced clinicians produced more complex models but occasionally introduced causality errors, while less experienced users adhered more accurately to predefined structures. The tool reduced cognitive workload (2/7 on NASA-TLX) and was described as intuitive, although workflow interruptions and minor technical issues highlighted areas for improvement.
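A top-k hit rate is one plausible form for the retrieval accuracy scores reported above; the sketch below and its signature are assumptions, since the paper's exact scoring protocol is not given here.

```python
def retrieval_accuracy(retrieved, relevant, k=5):
    """Fraction of queries whose top-k retrieved chunks include a relevant chunk.

    retrieved: per-query ranked lists of chunk ids; relevant: per-query sets
    of gold chunk ids.
    """
    hits = sum(
        1
        for docs, gold in zip(retrieved, relevant)
        if any(d in gold for d in docs[:k])
    )
    return hits / len(retrieved)
```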

Conclusion: Integrating LLM-RAG into BN modelling enhances efficiency and accuracy. Future work may focus on automated preprocessing, refinements of the user interface, and extending the RAG pipeline with validation steps and external biomedical sources. Generative AI holds promise for expert-driven knowledge modelling.

Citations: 0
Correction to: Stereo reconstruction from microscopic images for computer-assisted ophthalmic surgery.
IF 2.3 · CAS Tier 3, Medicine · Q3 ENGINEERING, BIOMEDICAL · Pub Date: 2025-11-01 · DOI: 10.1007/s11548-024-03270-4
Rebekka Peter, Sofia Moreira, Eleonora Tagliabue, Matthias Hillenbrand, Rita G Nunes, Franziska Mathis-Ullrich
Citations: 0
Super-resolution for localizing electrode grids as small, deformable objects during epilepsy surgery using augmented reality headsets.
IF 2.3 · CAS Tier 3, Medicine · Q3 ENGINEERING, BIOMEDICAL · Pub Date: 2025-11-01 · Epub Date: 2025-06-19 · DOI: 10.1007/s11548-025-03401-5
Hizirwan S Salim, Abdullah Thabit, Sem Hoogteijling, Maryse A van 't Klooster, Theo van Walsum, Maeike Zijlmans, Mohamed Benmahdjoub

Purpose: Epilepsy surgery is a potentially curative treatment for people with focal epilepsy. During resection, neurosurgeons are guided by intraoperative electrocorticogram (ioECoG) recordings from the brain. Accurate localization of epileptic activity, and thus of the ioECoG grids, is critical for successful outcomes. We aim to develop and evaluate the feasibility of a novel method for localizing small, deformable objects using augmented reality (AR) head-mounted displays (HMDs) and artificial intelligence (AI). AR HMDs combine cameras and patient overlay visualization in a compact design.

Methods: We developed an image processing method for the HoloLens 2 to localize a 64-electrode ioECoG grid even when individual electrodes are indistinguishable due to low resolution. The method combines object detection, super-resolution, and pose estimation AI models with stereo triangulation. A synthetic dataset of 90,000 images trained the super-resolution and pose estimation models. The system was tested in a controlled environment against an optical tracker as ground truth. Accuracy was evaluated at distances between 40 and 90 cm.
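The stereo triangulation step named above is classically the linear (DLT) method sketched below; the camera projection matrices here are toy values, not the HoloLens 2 calibration, and the function name is an assumption.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linearly triangulate one 3D point from pixel coords x1, x2 in two views.

    P1, P2 are 3x4 camera projection matrices.
    """
    # Each observation contributes two rows of the homogeneous system A X = 0.
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize
```

In the system described above, each detected electrode centroid in the left and right camera images would be triangulated this way.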

Results: The system achieved sub-5 mm accuracy in localizing the ioECoG grid at distances shorter than 60 cm. At 40 cm, the accuracy remained below 2 mm, with an average standard deviation of less than 0.5 mm. At 60 cm, the method processed 24 stereo frames per second on average.

Conclusion: This study demonstrates the feasibility of localizing small, deformable objects like ioECoG grids using AR HMDs. While results indicate clinically acceptable accuracy, further research is needed to validate the method in clinical environments and assess its impact on surgical precision and outcomes.

Citations: 0
PDZSeg: adapting the foundation model for dissection zone segmentation with visual prompts in robot-assisted endoscopic submucosal dissection.
IF 2.3 · CAS Tier 3, Medicine · Q3 ENGINEERING, BIOMEDICAL · Pub Date: 2025-11-01 · Epub Date: 2025-06-20 · DOI: 10.1007/s11548-025-03437-7
Mengya Xu, Wenjin Mo, Guankun Wang, Huxin Gao, An Wang, Ning Zhong, Zhen Li, Xiaoxiao Yang, Hongliang Ren

Purpose: The intricate nature of endoscopic surgical environments poses significant challenges for the task of dissection zone segmentation. Specifically, the boundaries between different tissue types lack clarity, which can result in significant segmentation errors, as models may misidentify or overlook object edges altogether. Thus, the goal of this work is to provide precise dissection zone suggestions under these challenging conditions during endoscopic submucosal dissection (ESD) procedures and to enhance the overall safety of ESD.

Methods: We introduce a prompt-based dissection zone segmentation (PDZSeg) model, aimed at segmenting dissection zones and specifically designed to incorporate different visual prompts, such as scribbles and bounding boxes. Our approach overlays these visual cues directly onto the images and fine-tunes a foundation model on a specialized dataset created to handle diverse visual prompt instructions. This shift toward more flexible input methods is intended to significantly improve both the performance of dissection zone segmentation and the overall user experience.
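Overlaying a visual prompt directly onto the image, as described above, amounts to burning the cue into the pixel array before it reaches the model. A minimal sketch for a bounding-box prompt follows; the color, line width, and function name are illustrative assumptions, not the PDZSeg pipeline.

```python
import numpy as np

def overlay_bbox(image, x0, y0, x1, y1, color=(0, 255, 0), width=2):
    """Return a copy of an HxWx3 uint8 image with a rectangle prompt drawn on it."""
    img = image.copy()
    img[y0:y0 + width, x0:x1] = color  # top edge
    img[y1 - width:y1, x0:x1] = color  # bottom edge
    img[y0:y1, x0:x0 + width] = color  # left edge
    img[y0:y1, x1 - width:x1] = color  # right edge
    return img
```

A scribble prompt could be drawn the same way by setting the pixels along a user-traced polyline.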

Results: We evaluate our approach using three experimental setups: in-domain evaluation, evaluation under variability in visual prompt availability, and robustness assessment. By validating our approach on the ESD-DZSeg dataset, which focuses on the dissection zone segmentation task in ESD, our experiments show that our solution outperforms state-of-the-art segmentation methods on this task. To the best of our knowledge, this is the first study to incorporate visual prompt design in dissection zone segmentation.

Conclusion: We introduce the prompt-based dissection zone segmentation (PDZSeg) model, which is specifically designed for dissection zone segmentation and can effectively utilize various visual prompts, including scribbles and bounding boxes. This model improves segmentation performance and enhances user experience by integrating a specialized dataset with a novel visual referral method that optimizes the architecture and boosts the effectiveness of dissection zone suggestions. Furthermore, we present the ESD-DZSeg dataset for robot-assisted endoscopic submucosal dissection (ESD), which serves as a benchmark for assessing dissection zone suggestions and visual prompt interpretation, laying the groundwork for future research in this field. Our code is available at https://github.com/FrankMOWJ/PDZSeg.

Citations: 0
Multi-volume rendering using depth buffers for surgical planning in virtual reality.
IF 2.3 · CAS Tier 3, Medicine · Q3 ENGINEERING, BIOMEDICAL · Pub Date: 2025-11-01 · Epub Date: 2025-06-07 · DOI: 10.1007/s11548-025-03432-y
Balázs Faludi, Marek Żelechowski, Maria Licci, Norbert Zentai, Attill Saemann, Daniel Studer, Georg Rauter, Raphael Guzman, Carol Hasler, Gregory F Jost, Philippe C Cattin

Purpose: Planning highly complex surgeries in virtual reality (VR) provides a user-friendly and natural way to navigate volumetric medical data and can improve the sense of depth and scale. Using ray marching-based volume rendering to display the data has several benefits over traditional mesh-based rendering, such as offering a more accurate and detailed visualization without the need for prior segmentation and meshing. However, volume rendering can be difficult to extend to support multiple intersecting volumes in a scene while maintaining a high enough update rate for a comfortable user experience in VR.

Methods: Upon loading a volume, a rough ad hoc segmentation is performed using a motion-tracked controller. The segmentation is not used to extract a surface mesh and does not need to precisely define the exact surfaces to be rendered, as it only serves to separate the volume into individual sub-volumes, which are rendered in multiple, consecutive volume rendering passes. For each pass, the ray lengths are written into the camera depth buffer at early ray termination and read in subsequent passes to ensure correct occlusion between individual volumes.
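The per-pixel logic of the multi-pass depth-buffer scheme above can be modeled on the CPU as a toy sketch: each volume pass writes its ray-termination depth, and a later pass keeps a pixel only if its depth is nearer than what the buffer already holds. This is an illustrative simplification, not the renderer's GPU implementation.

```python
import numpy as np

def composite_passes(passes, h, w):
    """passes: list of (depth_map, color_map) pairs; the nearest depth wins per pixel."""
    depth = np.full((h, w), np.inf)  # shared depth buffer across passes
    color = np.zeros((h, w, 3))
    for d, c in passes:
        nearer = d < depth           # occlusion test against the depth buffer
        depth[nearer] = d[nearer]    # write ray-termination depths
        color[nearer] = c[nearer]    # keep the colour of the nearest volume
    return color, depth
```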

Results: We evaluate the performance of the multi-volume renderer using three different use cases and corresponding datasets. We show that the presented approach avoids dropped frames at the 90 frames-per-second update rate typical of desktop-based VR systems and therefore provides a comfortable user experience even with more than twenty individual volumes.

Conclusion: Our proof-of-concept implementation shows the feasibility of VR-based surgical planning systems, which require dynamic and direct manipulation of the original volumetric data without sacrificing rendering performance and user experience.

Citations: 0
The impact of 3-dimensional models and surgical navigation for open liver surgery.
IF 2.3 · CAS Tier 3, Medicine · Q3 ENGINEERING, BIOMEDICAL · Pub Date: 2025-11-01 · Epub Date: 2025-07-01 · DOI: 10.1007/s11548-025-03455-5
Karin A Olthof, Matteo Fusaglia, Anne G den Hartog, Niels F M Kok, Theo J M Ruers, Koert F D Kuhlmann

Purpose: Understanding patient-specific liver anatomy is crucial for patient safety and achieving complete treatment of all tumors during surgery. This study evaluates the impact of the use of patient-specific 3D liver models and surgical navigation on procedural complexity in open liver surgery.

Methods: Patients with colorectal liver metastases scheduled for open liver surgery were included between June 2022 and October 2023 at the Netherlands Cancer Institute. Patient-specific 3D liver models could be used upon request during the surgical procedure. Subsequently, surgeons could request additional surgical navigation by landmark registration using an electromagnetically tracked ultrasound transducer. Postoperatively, surgeons assessed the impact of the use of the model and navigation on procedural complexity on a scale from 1 to 10.
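Landmark registration of this kind is conventionally solved with the SVD-based rigid alignment below (Arun's method); this is a generic sketch of the standard technique, not the navigation system's actual implementation.

```python
import numpy as np

def register_landmarks(src, dst):
    """Return R, t minimizing sum ||R @ src_i + t - dst_i||^2 over paired landmarks."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)  # cross-covariance of centred point sets
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:             # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = dst_c - R @ src_c
    return R, t
```

Here `src` would hold landmark positions in image space and `dst` the corresponding positions reported by the tracked ultrasound transducer.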

Results: 35 patients were included in this study, with a median of 8 tumors (range 3 to 25). 3D models were utilized in all procedures. Additional navigation was requested in 21/35 patients to improve intraoperative planning and tumor localization. The mean procedural complexity score with navigation was 4.3 (95% CI [3.7, 5.0]), compared to 7.8 (95% CI [6.6, 9.0]) with the 3D model alone. Both visualization methods improved lesion localization and provided better anatomical insight.

Conclusion: 3D models and surgical navigation significantly reduce the complexity of open liver surgery, especially in patients with bilobar disease. These tools enhance intraoperative decision-making and may lead to better surgical outcomes. The stepwise implementation of the visualization techniques in this study underscores the added benefit of surgical navigation beyond 3D modeling alone, supporting its potential for broader clinical implementation.

Citations: 0
Objective skill assessment for cataract surgery from surgical microscope video. 从手术显微镜视频看白内障手术技能的客观评价。
IF 2.3 3区 医学 Q3 ENGINEERING, BIOMEDICAL Pub Date : 2025-11-01 Epub Date: 2025-04-25 DOI: 10.1007/s11548-025-03366-5
Rebecca Hisey, Henry Lee, Adrienne Duimering, John Liu, Vasudha Gupta, Tamas Ungi, Christine Law, Gabor Fichtinger, Matthew Holden

Objective: Video offers an accessible method for automated surgical skill evaluation; however, many platforms still rely on traditional six-degree-of-freedom (6-DOF) tracking systems, which can be costly, cumbersome, and challenging to apply clinically. This study aims to demonstrate that trainee skill in cataract surgery can be assessed effectively using only object detection from monocular surgical microscope video.

Methods: One ophthalmologist and four residents performed cataract surgery on a simulated eye five times each, generating 25 recordings. Recordings included both the surgical microscope video and 6-DOF instrument tracking data. Videos were graded by two expert ophthalmologists using the ICO-OSCAR:SICS rubric. We computed motion-based metrics using both object detection from video and 6-DOF tracking. We first examined correlations between each metric and expert scores for each rubric criterion. Then, using these findings, we trained an ordinal regression model to predict scores from each tracking modality and compared correlation strengths with expert scores.

Results: Metrics from object detection generally showed stronger correlations with expert scores than 6-DOF tracking. For score prediction, 6-DOF tracking showed no significant advantage, while scores predicted from object detection achieved significantly stronger correlations with expert scores for four scoring criteria.

Conclusion: Our results indicate that skill assessment from monocular surgical microscope video can match, and in some cases exceed, the correlation strengths of 6-DOF tracking assessments. This finding supports the feasibility of using object detection for skill assessment without additional hardware.
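A representative motion-based metric computable from either modality is the total path length traced by an instrument. A minimal sketch using 2D object-detection centroids (the centroid track is made up; the real pipeline would first run a detector on every video frame):

```python
import math

def path_length(centroids):
    """Total Euclidean path length of per-frame (x, y) detection centroids."""
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(centroids, centroids[1:])
    )

# Hypothetical instrument-tip centroids over four video frames (pixels).
track = [(0.0, 0.0), (3.0, 4.0), (3.0, 4.0), (6.0, 8.0)]
total = path_length(track)  # segments of 5.0, 0.0, and 5.0
```

Shorter, smoother paths typically correlate with higher expert ratings, which is why such metrics are candidates for the correlation analysis described above.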

{"title":"Objective skill assessment for cataract surgery from surgical microscope video.","authors":"Rebecca Hisey, Henry Lee, Adrienne Duimering, John Liu, Vasudha Gupta, Tamas Ungi, Christine Law, Gabor Fichtinger, Matthew Holden","doi":"10.1007/s11548-025-03366-5","DOIUrl":"10.1007/s11548-025-03366-5","url":null,"abstract":"<p><strong>Objective: </strong>Video offers an accessible method for automated surgical skill evaluation; however, many platforms still rely on traditional six-degree-of-freedom (6-DOF) tracking systems, which can be costly, cumbersome, and challenging to apply clinically. This study aims to demonstrate that trainee skill in cataract surgery can be assessed effectively using only object detection from monocular surgical microscope video.</p><p><strong>Methods: </strong>One ophthalmologist and four residents performed cataract surgery on a simulated eye five times each, generating 25 recordings. Recordings included both the surgical microscope video and 6-DOF instrument tracking data. Videos were graded by two expert ophthalmologists using the ICO-OSCAR:SICS rubric. We computed motion-based metrics using both object detection from video and 6-DOF tracking. We first examined correlations between each metric and expert scores for each rubric criteria. Then, using these findings, we trained an ordinal regression model to predict scores from each tracking modality and compared correlation strengths with expert scores.</p><p><strong>Results: </strong>Metrics from object detection generally showed stronger correlations with expert scores than 6-DOF tracking. 
For score prediction, 6-DOF tracking showed no significant advantage, while scores predicted from object detection achieved significantly stronger correlations with expert scores for four scoring criteria.</p><p><strong>Conclusion: </strong>Our results indicate that skill assessment from monocular surgical microscope video can match, and in some cases exceed, the correlation strengths of 6-DOF tracking assessments. This finding supports the feasibility of using object detection for skill assessment without additional hardware.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":"2219-2230"},"PeriodicalIF":2.3,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144059347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
End-to-end 2D/3D registration from pre-operative MRI to intra-operative fluoroscopy for orthopedic procedures. 从术前MRI到术中透视的端到端2D/3D配准。
IF 2.3 3区 医学 Q3 ENGINEERING, BIOMEDICAL Pub Date : 2025-11-01 Epub Date: 2025-05-30 DOI: 10.1007/s11548-025-03426-w
Ping-Cheng Ku, Mingxu Liu, Robert Grupp, Andrew Harris, Julius K Oni, Simon C Mears, Alejandro Martin-Gomez, Mehran Armand

Purpose: Soft tissue pathologies and bone defects are not easily visible in intra-operative fluoroscopic images; therefore, we develop an end-to-end MRI-to-fluoroscopic image registration framework, aiming to enhance intra-operative visualization for surgeons during orthopedic procedures.

Methods: The proposed framework utilizes deep learning to segment MRI scans and generate synthetic CT (sCT) volumes. These sCT volumes are then used to produce digitally reconstructed radiographs (DRRs), enabling 2D/3D registration with intra-operative fluoroscopic images. The framework's performance was validated through simulation and cadaver studies for core decompression (CD) surgery, focusing on the registration accuracy of femur and pelvic regions.

Results: The framework achieved a mean translational registration accuracy of 2.4 ± 1.0 mm and rotational accuracy of 1.6 ± 0.8° for the femoral region in cadaver studies. The method successfully enabled intra-operative visualization of necrotic lesions that were not visible on conventional fluoroscopic images, marking a significant advancement in image guidance for femur and pelvic surgeries.

Conclusion: The MRI-to-fluoroscopic registration framework offers a novel approach to image guidance in orthopedic surgeries, exclusively using MRI without the need for CT scans. This approach enhances the visualization of soft tissues and bone defects, reduces radiation exposure, and provides a safer, more effective alternative for intra-operative surgical guidance.
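The translational and rotational accuracies reported above correspond to the residual error between an estimated and a ground-truth rigid pose. A hedged sketch of that error computation with 4x4 homogeneous transforms (NumPy; the variable names and example poses are illustrative, not from the study):

```python
import numpy as np

def pose_error(T_est, T_gt):
    """Translational (same units as T) and rotational (degrees) pose error."""
    d = np.linalg.inv(T_gt) @ T_est           # residual relative transform
    trans_err = np.linalg.norm(d[:3, 3])      # leftover translation
    # Rotation angle of the residual rotation matrix, clipped for safety.
    cos_theta = np.clip((np.trace(d[:3, :3]) - 1.0) / 2.0, -1.0, 1.0)
    rot_err = np.degrees(np.arccos(cos_theta))
    return trans_err, rot_err

T_gt = np.eye(4)
T_est = np.eye(4)
T_est[:3, 3] = [1.0, 2.0, 2.0]                # purely translational offset
t_err, r_err = pose_error(T_est, T_gt)
```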

{"title":"End-to-end 2D/3D registration from pre-operative MRI to intra-operative fluoroscopy for orthopedic procedures.","authors":"Ping-Cheng Ku, Mingxu Liu, Robert Grupp, Andrew Harris, Julius K Oni, Simon C Mears, Alejandro Martin-Gomez, Mehran Armand","doi":"10.1007/s11548-025-03426-w","DOIUrl":"10.1007/s11548-025-03426-w","url":null,"abstract":"<p><strong>Purpose: </strong>Soft tissue pathologies and bone defects are not easily visible in intra-operative fluoroscopic images; therefore, we develop an end-to-end MRI-to-fluoroscopic image registration framework, aiming to enhance intra-operative visualization for surgeons during orthopedic procedures.</p><p><strong>Methods: </strong>The proposed framework utilizes deep learning to segment MRI scans and generate synthetic CT (sCT) volumes. These sCT volumes are then used to produce digitally reconstructed radiographs (DRRs), enabling 2D/3D registration with intra-operative fluoroscopic images. The framework's performance was validated through simulation and cadaver studies for core decompression (CD) surgery, focusing on the registration accuracy of femur and pelvic regions.</p><p><strong>Results: </strong>The framework achieved a mean translational registration accuracy of 2.4 ± 1.0 mm and rotational accuracy of 1.6 ± <math><mrow><mn>0</mn> <mo>.</mo> <msup><mn>8</mn> <mo>∘</mo></msup> </mrow> </math> for the femoral region in cadaver studies. The method successfully enabled intra-operative visualization of necrotic lesions that were not visible on conventional fluoroscopic images, marking a significant advancement in image guidance for femur and pelvic surgeries.</p><p><strong>Conclusion: </strong>The MRI-to-fluoroscopic registration framework offers a novel approach to image guidance in orthopedic surgeries, exclusively using MRI without the need for CT scans. 
This approach enhances the visualization of soft tissues and bone defects, reduces radiation exposure, and provides a safer, more effective alternative for intra-operative surgical guidance.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":"2355-2366"},"PeriodicalIF":2.3,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144188521","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Near-infrared beacons: tracking anatomy with biocompatible fluorescent dots for mixed reality surgical navigation. 近红外信标:用于混合现实外科导航的生物相容性荧光点跟踪解剖。
IF 2.3 3区 医学 Q3 ENGINEERING, BIOMEDICAL Pub Date : 2025-11-01 Epub Date: 2025-05-01 DOI: 10.1007/s11548-025-03379-0
Wenhao Gu, Justin D Opfermann, Jonathan Knopf, Axel Krieger, Mathias Unberath

Purpose: Mixed reality for surgical navigation is an emerging tool for precision surgery. Achieving reliable surgical guidance hinges on robust tracking of the mixed reality device relative to patient anatomy. Contemporary approaches either introduce bulky fiducials that need to be invasively attached to the anatomy or make strong assumptions about the patient remaining stationary.

Methods: We present an approach to anatomy tracking that relies on biocompatible near-infrared fluorescent (NIRF) dots. Dots are quickly placed on the anatomy intra-operatively and the pose is tracked reliably via PnP-type methods. We demonstrate the potential of our NIRF dots approach to track patient movements after image registration on a 3D printed model, simulating an image-guided navigation process with a tablet-based mixed reality scenario.

Results: The dot-based pose tracking demonstrated an average accuracy of 1.13 mm in translation and 0.69 degrees in rotation under static conditions, and 1.39 mm/1.10 degrees, respectively, under dynamic conditions.

Conclusion: Our results are promising and encourage further research in the viability of integrating NIRF dots in mixed reality surgical navigation. These biocompatible dots may allow for reliable tracking of patient motion post-registration, providing a convenient alternative to invasive marker arrays. While our initial tests used a tablet, adaptation to head-mounted displays is plausible with suitable sensors.
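PnP-type tracking estimates the camera pose that best explains where the fluorescent dots project in the image. A minimal sketch of the pinhole forward model that such solvers invert (NumPy; the intrinsics and dot position are made up for illustration):

```python
import numpy as np

def project(K, R, t, X):
    """Project a 3D point X into pixel coordinates via a pinhole camera."""
    x_cam = R @ X + t                 # world frame -> camera frame
    u, v, w = K @ x_cam               # apply camera intrinsics
    return np.array([u / w, v / w])   # perspective divide

# Hypothetical intrinsics: 500 px focal length, principal point (320, 240).
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)                          # camera aligned with world axes
t = np.array([0.0, 0.0, 2.0])          # dot 2 units in front of the camera

uv = project(K, R, t, np.zeros(3))     # a dot at the origin hits the center
```

Given at least four dot-to-image correspondences, standard solvers such as OpenCV's `cv2.solvePnP` recover `R` and `t` by inverting this model.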

{"title":"Near-infrared beacons: tracking anatomy with biocompatible fluorescent dots for mixed reality surgical navigation.","authors":"Wenhao Gu, Justin D Opfermann, Jonathan Knopf, Axel Krieger, Mathias Unberath","doi":"10.1007/s11548-025-03379-0","DOIUrl":"10.1007/s11548-025-03379-0","url":null,"abstract":"<p><strong>Purpose: </strong>Mixed reality for surgical navigation is an emerging tool for precision surgery. Achieving reliable surgical guidance hinges on robust tracking of the mixed reality device relative to patient anatomy. Contemporary approaches either introduce bulky fiducials that need to be invasively attached to the anatomy or make strong assumptions about the patient remaining stationary.</p><p><strong>Methods: </strong>We present an approach to anatomy tracking that relies on biocompatible near-infrared fluorescent (NIRF) dots. Dots are quickly placed on the anatomy intra-operatively and the pose is tracked reliably via PnP-type methods. We demonstrate the potential of our NIRF dots approach to track patient movements after image registration on a 3D printed model, simulating an image-guided navigation process with a tablet-based mixed reality scenario.</p><p><strong>Results: </strong>The dot-based pose tracking demonstrated an average accuracy of 1.13 mm in translation and 0.69 degrees in rotation under static conditions, and 1.39 mm/1.10 degrees, respectively, under dynamic conditions.</p><p><strong>Conclusion: </strong>Our results are promising and encourage further research in the viability of integrating NIRF dots in mixed reality surgical navigation. These biocompatible dots may allow for reliable tracking of patient motion post-registration, providing a convenient alternative to invasive marker arrays. 
While our initial tests used a tablet, adaptation to head-mounted displays is plausible with suitable sensors.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":"2309-2318"},"PeriodicalIF":2.3,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144063150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Training a deep learning model to predict the anatomy irradiated in fluoroscopic x-ray images. 训练一个深度学习模型来预测透视x射线图像中照射的解剖结构。
IF 2.3 3区 医学 Q3 ENGINEERING, BIOMEDICAL Pub Date : 2025-11-01 Epub Date: 2025-05-26 DOI: 10.1007/s11548-025-03422-0
Lunchi Guo, Dennis Trujillo, James R Duncan, M Allan Thomas

Purpose: Accurate patient dosimetry estimates from fluoroscopically-guided interventions (FGIs) are hindered by limited knowledge of the specific anatomy that was irradiated. Current methods use data reported by the equipment to estimate the patient anatomy exposed during each irradiation event. We propose a deep learning algorithm to automatically match 2D fluoroscopic images with corresponding anatomical regions in computational phantoms, enabling more precise patient dose estimates.

Methods: Our method involves two main steps: (1) simulating 2D fluoroscopic images, and (2) developing a deep learning algorithm to predict anatomical coordinates from these images. For part (1), we utilized DeepDRR for fast and realistic simulation of 2D x-ray images from 3D computed tomography datasets. We generated a diverse set of simulated fluoroscopic images from various regions with different field sizes. In part (2), we employed a Residual Neural Network (ResNet) architecture combined with metadata processing to effectively integrate patient-specific information (age and gender) to learn the transformation between 2D images and specific anatomical coordinates in each representative phantom. For the Modified ResNet model, we defined an allowable error range of ± 10 mm.

Results: The proposed method achieved over 90% of predictions within ± 10 mm, with strong alignment between predicted and true coordinates as confirmed by Bland-Altman analysis. Most errors were within ± 2%, with outliers beyond ± 5% primarily in Z-coordinates for infant phantoms due to their limited representation in the training data. These findings highlight the model's accuracy and its potential for precise spatial localization, while emphasizing the need for improved performance in specific anatomical regions.

Conclusion: In this work, a comprehensive simulated 2D fluoroscopy image dataset was developed, addressing the scarcity of real clinical datasets and enabling effective training of deep-learning models. The modified ResNet successfully achieved precise prediction of anatomical coordinates from the simulated fluoroscopic images, enabling the goal of more accurate patient-specific dosimetry.
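The headline result (over 90% of predictions within ± 10 mm) amounts to the fraction of samples whose predicted anatomical coordinates stay inside the tolerance on every axis. A sketch of that evaluation in pure Python, with hypothetical coordinates (not the study's data):

```python
def within_tolerance(pred, true, tol=10.0):
    """Fraction of samples with every coordinate within +/- tol of truth."""
    hits = sum(
        all(abs(p - t) <= tol for p, t in zip(p_row, t_row))
        for p_row, t_row in zip(pred, true)
    )
    return hits / len(pred)

# Hypothetical predicted vs. true (x, y, z) anatomical coordinates in mm.
pred = [(10.0, 20.0, 30.0), (15.0, 25.0, 48.0),
        (8.0, 19.0, 31.0), (0.0, 0.0, 0.0)]
true = [(12.0, 18.0, 33.0), (14.0, 24.0, 35.0),
        (9.0, 20.0, 30.0), (1.0, 2.0, 3.0)]
rate = within_tolerance(pred, true)  # second sample misses on z by 13 mm
```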

{"title":"Training a deep learning model to predict the anatomy irradiated in fluoroscopic x-ray images.","authors":"Lunchi Guo, Dennis Trujillo, James R Duncan, M Allan Thomas","doi":"10.1007/s11548-025-03422-0","DOIUrl":"10.1007/s11548-025-03422-0","url":null,"abstract":"<p><strong>Purpose: </strong>Accurate patient dosimetry estimates from fluoroscopically-guided interventions (FGIs) are hindered by limited knowledge of the specific anatomy that was irradiated. Current methods use data reported by the equipment to estimate the patient anatomy exposed during each irradiation event. We propose a deep learning algorithm to automatically match 2D fluoroscopic images with corresponding anatomical regions in computational phantoms, enabling more precise patient dose estimates.</p><p><strong>Methods: </strong>Our method involves two main steps: (1) simulating 2D fluoroscopic images, and (2) developing a deep learning algorithm to predict anatomical coordinates from these images. For part (1), we utilized DeepDRR for fast and realistic simulation of 2D x-ray images from 3D computed tomography datasets. We generated a diverse set of simulated fluoroscopic images from various regions with different field sizes. In part (2), we employed a Residual Neural Network (ResNet) architecture combined with metadata processing to effectively integrate patient-specific information (age and gender) to learn the transformation between 2D images and specific anatomical coordinates in each representative phantom. For the Modified ResNet model, we defined an allowable error range of ± 10 mm.</p><p><strong>Results: </strong>The proposed method achieved over 90% of predictions within ± 10 mm, with strong alignment between predicted and true coordinates as confirmed by Bland-Altman analysis. Most errors were within ± 2%, with outliers beyond ± 5% primarily in Z-coordinates for infant phantoms due to their limited representation in the training data. 
These findings highlight the model's accuracy and its potential for precise spatial localization, while emphasizing the need for improved performance in specific anatomical regions.</p><p><strong>Conclusion: </strong>In this work, a comprehensive simulated 2D fluoroscopy image dataset was developed, addressing the scarcity of real clinical datasets and enabling effective training of deep-learning models. The modified ResNet successfully achieved precise prediction of anatomical coordinates from the simulated fluoroscopic images, enabling the goal of more accurate patient-specific dosimetry.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":"2345-2353"},"PeriodicalIF":2.3,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144144393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0