AURA-CVC: Autonomous Ultrasound-guided Robotic Assistance for Central Venous Catheterization.
Pub Date: 2026-02-25 | DOI: 10.1007/s11548-026-03572-9
Deepak Raina, Lidia Al-Zogbi, Brian Teixeira, Vivek Singh, Ankur Kapoor, Thorsten Fleiter, Muyinatu A Lediju Bell, Vinciya Pandian, Axel Krieger
Purpose: Central venous catheterization (CVC) is a critical medical procedure for vascular access, hemodynamic monitoring, and life-saving interventions. Success remains challenging due to the need for continuous ultrasound-guided visualization of the target vessel and the approaching needle, which is further complicated by anatomical variability and operator dependency. Errors in needle placement can lead to life-threatening complications. Robotic systems offer a potential solution, but full autonomy has yet to be achieved. In this work, we propose an end-to-end robotic ultrasound-guided CVC pipeline, from scan initialization to needle insertion.
Methods: We introduce a deep-learning model to identify clinically relevant anatomical landmarks from a depth image of the patient's neck, obtained using an RGB-D camera, to autonomously define the scanning region and paths. Then, a robot motion planning framework is proposed to scan, segment, reconstruct, and localize vessels (veins and arteries), followed by identification of the optimal insertion zone. Finally, a needle guidance module plans the insertion under ultrasound guidance with the operator's feedback. This pipeline was validated on a high-fidelity commercial phantom across 10 simulated clinical scenarios.
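As an illustration of the scan-and-reconstruct stage described above, the sketch below lifts 2D vessel segmentations from tracked ultrasound frames into a 3D point cloud. This is a minimal sketch under assumptions, not the authors' code: it presumes each frame carries a 4x4 image-to-world probe pose and known pixel spacing, and all names are hypothetical.

```python
# Illustrative sketch (not the authors' code): map vessel pixels from
# tracked 2D ultrasound frames into a single 3D point cloud.
# Assumptions: each frame provides a binary vessel mask, a 4x4
# image-to-world pose T, and pixel spacings sx, sy in mm/pixel.
import numpy as np

def frame_points_to_world(mask: np.ndarray, pose: np.ndarray,
                          sx: float, sy: float) -> np.ndarray:
    """Map vessel pixels of one US frame into world coordinates (mm)."""
    rows, cols = np.nonzero(mask)                  # vessel pixels
    # The US image lies in the probe's x-y plane (z = 0 in image frame).
    pts_img = np.stack([cols * sx, rows * sy,
                        np.zeros_like(rows, dtype=float),
                        np.ones_like(rows, dtype=float)], axis=0)
    return (pose @ pts_img)[:3].T                  # N x 3 world points

def reconstruct_vessel(frames):
    """Stack per-frame points from one sweep into a 3D vessel cloud."""
    clouds = [frame_points_to_world(mask, pose, sx, sy)
              for (mask, pose, sx, sy) in frames]
    return np.vstack(clouds)
```

A centerline fit through such a cloud would then support localizing veins versus arteries and selecting the insertion zone.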
Results: The proposed pipeline achieved 10 out of 10 successful needle placements on the first attempt. Vessels were reconstructed with a mean error of 2.15 mm, and autonomous needle insertion was performed with an error of approximately 1 mm or less.
Conclusion: To our knowledge, this is the first robotic CVC system demonstrated on a high-fidelity phantom with integrated planning, scanning, and insertion. Experimental results show its potential for clinical translation.
{"title":"AURA-CVC: Autonomous Ultrasound-guided Robotic Assistance for Central Venous Catheterization.","authors":"Deepak Raina, Lidia Al-Zogbi, Brian Teixeira, Vivek Singh, Ankur Kapoor, Thorsten Fleiter, Muyinatu A Lediju Bell, Vinciya Pandian, Axel Krieger","doi":"10.1007/s11548-026-03572-9","DOIUrl":"https://doi.org/10.1007/s11548-026-03572-9","url":null,"abstract":"<p><strong>Purpose: </strong>Central venous catheterization (CVC) is a critical medical procedure for vascular access, hemodynamic monitoring, and life-saving interventions. Its success remains challenging due to the need for continuous ultrasound-guided visualization of a target vessel and approaching needle, which is further complicated by anatomical variability and operator dependency. Errors in needle placement can lead to life-threatening complications. While robotic systems offer a potential solution, achieving full autonomy remains challenging. In this work, we propose an end-to-end robotic ultrasound-guided CVC pipeline, from scan initialization to needle insertion.</p><p><strong>Methods: </strong>We introduce a deep-learning model to identify clinically relevant anatomical landmarks from a depth image of the patient's neck, obtained using an RGB-D camera, to autonomously define the scanning region and paths. Then, a robot motion planning framework is proposed to scan, segment, reconstruct, and localize vessels (veins and arteries), followed by the identification of the optimal insertion zone. Finally, a needle guidance module plans the insertion under ultrasound guidance with operator's feedback. This pipeline was validated on a high-fidelity commercial phantom across 10 simulated clinical scenarios.</p><p><strong>Results: </strong>The proposed pipeline achieved 10 out of 10 successful needle placements on the first attempt. Vessels were reconstructed with a mean error of 2.15 mm, and autonomous needle insertion was performed with an error less than or close to 1 mm.</p><p><strong>Conclusion: </strong>To our knowledge, this is the first robotic CVC system demonstrated on a high-fidelity phantom with integrated planning, scanning, and insertion. Experimental results show its potential for clinical translation.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2026-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147285971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Purpose: Anatomical tunnel placement in anterior cruciate ligament (ACL) reconstruction is essential for better functional recovery and fewer complications. However, subjective factors and ambiguous definitions reduce the accuracy and efficiency of quantifying the ACL position on 3D models. This study aims to develop and validate a fully automated framework for standardized 3D quadrant coordinate computation of the femoral and tibial ACL footprints, enabling objective preoperative planning and postoperative evaluation.
Methods: An nnUNet-based foundation network was fine-tuned to reconstruct distal femur and proximal tibia 3D models from CT or MRI data. Automated template registration and morphological analysis were used to determine anatomical planes and generate individualized quadrant coordinate systems. The pipeline was validated on both CT and MRI datasets, comparing location accuracy, calculation repeatability, and time efficiency against manual methods.
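The quadrant readout itself reduces to expressing a footprint centroid as percentages along two anatomically defined axes. The sketch below illustrates that computation under assumed geometry (the axis and landmark names are assumptions, not the paper's pipeline); for the femur, the axes would typically run along and perpendicular to Blumensaat's line.

```python
# Illustrative sketch (assumed geometry, not the authors' pipeline):
# express an ACL footprint centroid in a normalized quadrant grid
# defined by an origin and two in-plane axes from anatomical landmarks.
import numpy as np

def quadrant_coordinates(centroid, origin, axis_u, axis_v,
                         length_u, length_v):
    """Return the centroid position as percentages (0-100%) along the
    two grid axes, the usual quadrant-method readout."""
    u = np.asarray(axis_u, float); u /= np.linalg.norm(u)
    v = np.asarray(axis_v, float); v /= np.linalg.norm(v)
    d = np.asarray(centroid, float) - np.asarray(origin, float)
    return (100.0 * np.dot(d, u) / length_u,
            100.0 * np.dot(d, v) / length_v)
```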
Results: The 3D distances between the actual and automatically predicted centroids were 1.72 ± 0.94 mm and 1.47 ± 1.06 mm for the femur and tibia, respectively, while the errors of the manual method were 1.89 ± 1.42 mm and 2.11 ± 1.27 mm. The method achieved an average repeatability of 0.992 in quadrant calculations with different initializations, while the ICCs of two manual annotators were 0.961 (annotator A), 0.946 (annotator B), and 0.882 (between A and B). The processing time for generating quadrant coordinate systems was significantly reduced to an average of 4.7 ± 1.3 s, compared to 8.5 ± 2.1 min for manual annotation.
Conclusion: This study presented the first fully automated, modality-independent method for 3D quadrant coordinate computation in knee surgery. The proposed framework delivers robust and standardized ACL anatomical locations across both CT and MRI data, enhancing the clinical efficiency of the preoperative planning and postoperative assessment of ACL reconstruction.
{"title":"Standardizing ACL tunnel placement: an automated method for knee quadrant computation.","authors":"Yufan Wang, Zhengliang Li, Yangyang Yang, Yinghui Hua, Tsung-Yuan Tsai","doi":"10.1007/s11548-026-03578-3","DOIUrl":"https://doi.org/10.1007/s11548-026-03578-3","url":null,"abstract":"<p><strong>Purpose: </strong>Anatomical tunnel placement in the anterior cruciate ligament (ACL) reconstruction is essential for better functional recovery and reduced complications. However, subjective factors and ambiguous definitions reduce the accuracy and efficiency of quantifying ACL position on 3D models. This study aims to develop and validate a fully automated framework for standardized 3D quadrant coordinate computation of femoral and tibial ACL footprints, enabling objective preoperative planning and postoperative evaluation.</p><p><strong>Methods: </strong>An nnUNet-based foundation network was fine-tuned to reconstruct distal femur and proximal tibia 3D models from CT or MRI data. Automated template registration and morphological analysis were used to determine anatomical planes and generate individualized quadrant coordinate systems. The pipeline was validated on both CT and MRI datasets, comparing location accuracy, calculation repeatability, and time efficiency against manual methods.</p><p><strong>Results: </strong>The 3D distances between the actual and automatic predicted centroids were 1.72 ± 0.94 mm and 1.47 ± 1.06 mm for femur and tibia, respectively, while the errors of manual method were 1.89 ± 1.42 mm and 2.11 ± 1.27 mm. The method achieved an average repeatability of 0.992 in quadrant calculations with different initializations, while the ICCs of two manual annotation were 0.961 (A), 0.946 (B), and 0.882 (A&B). The processing time for generating quadrant coordinate systems was significantly reduced to an average of 4.7 ± 1.3 s, compared to 8.5 ± 2.1 min for manual annotation.</p><p><strong>Conclusion: </strong>This study presented the first fully automated, modality-independent method for 3D quadrant coordinate computation in knee surgery. The proposed framework delivers robust and standardized ACL anatomical locations across both CT and MRI data, enhancing the clinical efficiency of the preoperative planning and postoperative assessment of ACL reconstruction.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2026-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147285960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Navigated hepatic tumor resection using intraoperative ultrasound imaging.
Pub Date: 2026-02-23 | DOI: 10.1007/s11548-026-03581-8
Karin A Olthof, Theo J M Ruers, Tiziano Natali, Lisanne P J Venix, Jasper N Smit, Anne G den Hartog, Niels F M Kok, Matteo Fusaglia, Koert F D Kuhlmann
Purpose: This proof-of-concept study evaluates the feasibility and accuracy of an ultrasound-based navigation system for open liver surgery. Unlike most conventional systems that rely on registration to preoperative imaging, the proposed system provides navigation-guided resection using 3D models generated from intraoperative ultrasound.
Methods: A pilot study was conducted in 25 patients undergoing resection of liver metastases. The first 5 cases served to optimize the workflow. Intraoperatively, an electromagnetic sensor compensated for organ motion, after which an ultrasound volume was acquired. Vasculature was segmented automatically and tumors semi-automatically using region-growing (n = 15) or a deep learning algorithm (n = 5). The resulting 3D model was visualized alongside tracked surgical instruments. Accuracy was assessed by comparing the distance between surgical clips and tumors in the navigation software with the same distance on a postoperative CT of the resected specimen.
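The accuracy metric described above compares the same clip-to-tumor distance in two spaces. A minimal sketch of that comparison follows; variable names and the 1:1 clip correspondence are assumptions, not the study's software.

```python
# Illustrative sketch of the accuracy assessment: compare each
# clip-to-tumor distance in the navigation scene with the same
# distance measured on the specimen CT. Names are assumptions.
import numpy as np

def clip_tumor_errors(nav_clips, nav_tumor, ct_clips, ct_tumor):
    """Per-clip absolute difference (mm) between navigated and
    CT-derived clip-to-tumor distances; clips correspond 1:1."""
    d_nav = np.linalg.norm(np.asarray(nav_clips) - np.asarray(nav_tumor), axis=1)
    d_ct = np.linalg.norm(np.asarray(ct_clips) - np.asarray(ct_tumor), axis=1)
    return np.abs(d_nav - d_ct)

# The median of these errors corresponds to the navigation accuracy
# reported in the Results (3.2 mm).
```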
Results: Navigation was successfully established in all 20 patients. However, 4 cases were excluded from the accuracy assessment due to intraoperative sensor detachment (n = 3) or incorrect data recording (n = 1). The complete navigation workflow was operational within 5-10 min. In the 16 evaluable patients, 78 clip-to-tumor distances were analyzed. The median navigation accuracy was 3.2 mm [IQR: 2.8-4.8 mm]; an R0 resection was achieved in 15/16 (93.8%) patients, and one patient had an R1 vascular resection.
Conclusion: Navigation based solely on intraoperative ultrasound is feasible and accurate for liver surgery. This approach paves the way for simpler and more accurate image guidance systems.
{"title":"Navigated hepatic tumor resection using intraoperative ultrasound imaging.","authors":"Karin A Olthof, Theo J M Ruers, Tiziano Natali, Lisanne P J Venix, Jasper N Smit, Anne G den Hartog, Niels F M Kok, Matteo Fusaglia, Koert F D Kuhlmann","doi":"10.1007/s11548-026-03581-8","DOIUrl":"https://doi.org/10.1007/s11548-026-03581-8","url":null,"abstract":"<p><strong>Purpose: </strong>This proof-of-concept study evaluates the feasibility and accuracy of an ultrasound-based navigation system for open liver surgery. Unlike most conventional systems that rely on registration to preoperative imaging, the proposed system provides navigation-guided resection using 3D models generated from intraoperative ultrasound.</p><p><strong>Methods: </strong>A pilot study was conducted in 25 patients undergoing resection of liver metastases. The first 5 cases served to optimize the workflow. Intraoperatively, an electromagnetic sensor compensated for organ motion, after which an ultrasound volume was acquired. Vasculature was segmented automatically and tumors semi-automatically using region-growing (n = 15) or a deep learning algorithm (n = 5). The resulting 3D model was visualized alongside tracked surgical instruments. Accuracy was assessed by comparing the distance between surgical clips and tumors in the navigation software with the same distance on a postoperative CT of the resected specimen.</p><p><strong>Results: </strong>Navigation was successfully established in all 20 patients. However, 4 cases were excluded from the accuracy assessment due to intraoperative sensor detachment (n = 3) or incorrect data recording (n = 1). The complete navigation workflow was operational within 5-10 min. In 16 evaluable patients, 78 clip-to-tumor distances were analyzed. The median navigation accuracy was 3.2 mm [IQR: 2.8-4.8 mm], and an R0 resection was achieved in 15/16 (93.8%) patients, and one patient had an R1 vascular resection.</p><p><strong>Conclusion: </strong>Navigation based solely on intraoperative ultrasound is feasible and accurate for liver surgery. This approach paves the way for simpler and more accurate image guidance systems.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2026-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147272767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Purpose: Accurate and automated assessment of the critical view of safety (CVS) is crucial for preventing bile duct injuries during laparoscopic cholecystectomy (LC). Existing methods often rely on costly segmentation labels or sequential inputs, limiting generalization and spatiotemporal understanding. This study proposes an efficient framework that removes the need for segmentation annotations while enhancing model robustness and spatiotemporal comprehension.
Methods: We introduce SMIL, a novel framework for automated CVS assessment that combines distillation-based self-supervised pretraining with multiple instance learning (MIL). A video transformer is first pretrained using label-free self-distillation to capture rich spatiotemporal features; it is then fine-tuned via MIL by fusing global and local representations for multi-label CVS classification. We conducted a benchmark evaluation on the public Endoscapes2023 dataset, comprising 201 LC videos whose CVS-relevant frames are released at 1 fps (58,813 frames in total); training/validation/testing followed the official video-level split of 120/41/40 videos.
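To make the MIL fine-tuning stage concrete, the sketch below shows a standard attention-based MIL head over instance embeddings with a multi-label output, one logit per CVS criterion. This is an assumed design in the spirit of the description, not the released SMIL code; dimensions and names are hypothetical.

```python
# Minimal sketch (assumed design, not SMIL's exact head): attention-
# based MIL pooling fuses instance embeddings (e.g., per-frame features
# from the pretrained video transformer) into one bag-level prediction
# for the three CVS criteria.
import torch
import torch.nn as nn

class MILHead(nn.Module):
    def __init__(self, dim=768, n_labels=3):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(dim, 128), nn.Tanh(),
                                  nn.Linear(128, 1))
        self.classifier = nn.Linear(dim, n_labels)

    def forward(self, instances):                       # (B, N, dim)
        w = torch.softmax(self.attn(instances), dim=1)  # (B, N, 1)
        bag = (w * instances).sum(dim=1)                # (B, dim)
        return self.classifier(bag)                     # multi-label logits
```

Training such a head would use a per-criterion sigmoid with BCEWithLogitsLoss, since the three CVS criteria are assessed independently rather than as exclusive classes.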
Results: Experimental results on the official test partition show that the SMIL framework outperforms state-of-the-art methods without relying on segmentation labels. Compared to the strongest label-free baseline, SMIL achieves gains of 3.21% in mean average precision and 2.74% in balanced accuracy, setting a new benchmark for automated CVS assessment without dense annotations. Notably, SMIL also surpasses segmentation-supervised models in mAP, further highlighting its efficient learning capability.
Conclusion: The SMIL framework enables automated CVS assessment without segmentation annotations or sequential inputs. By combining self-supervised and multiple instance learning, it enhances spatiotemporal understanding and generalization in LC surgeries, offering both theoretical insights and practical value for surgical safety.
{"title":"CVS assessment via distillation-based self-supervised and multiple instance learning in laparoscopic cholecystectomy.","authors":"Hao Wang, Yutao Zhang, Yuxuan Yang, Yuanbo Zhu, Rui Xu","doi":"10.1007/s11548-026-03580-9","DOIUrl":"https://doi.org/10.1007/s11548-026-03580-9","url":null,"abstract":"<p><strong>Purpose: </strong>Accurate and automated assessment of the critical view of safety (CVS) is crucial for preventing bile duct injuries during laparoscopic cholecystectomy (LC). Existing methods often rely on costly segmentation labels or sequential inputs, limiting generalization and spatiotemporal understanding. This study proposes an efficient framework that removes the need for segmentation annotations while enhancing model robustness and temporal-spatial comprehension.</p><p><strong>Methods: </strong>We introduce SMIL framework, a novel framework for automated CVS assessment that combines distillation-based self-supervised pretraining and multiple instance learning. A video transformer is first pretrained using label-free self-distillation to capture rich spatiotemporal features. We conducted a benchmark evaluation on the public Endoscapes2023 dataset, comprising 201 LC videos whose CVS-relevant frames are released at 1 fps (58,813 frames in total). Training/validation/testing followed the official video-level split of 120/41/40 videos. It is then fine-tuned via MIL by fusing global and local representations for multi-label CVS classification.</p><p><strong>Results: </strong>Experimental results on the official test partition show that SMIL framework outperforms state-of-the-art methods without relying on segmentation labels. Compared to the strongest label-free baseline, SMIL achieves gains of 3.21% in mean average precision and 2.74% in balanced accuracy, setting a new benchmark for automated CVS assessment without dense annotations. Notably, SMIL also surpasses segmentation-supervised models in mAP, further highlighting its efficient learning capability.</p><p><strong>Conclusion: </strong>The SMIL framework enables automated CVS assessment without segmentation annotations or sequential inputs. By combining self-supervised and multiple instance learning, it enhances spatiotemporal understanding and generalization in LC surgeries, offering both theoretical insights and practical value for surgical safety.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2026-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146221831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Purpose: In rectal cancer endoscopic surgery, tissue hangs down and obstructs the surgical field when a scalpel is used to make an incision to expose the tumor. A tissue elimination device is therefore needed: lifting the hanging tissue secures the surgical field during this surgery.
Methods: We developed a wire-driven film-type device for tissue elimination that can operate even in confined spaces such as the rectum. The device is composed of a polyethylene terephthalate film and stainless steel, with a belt-loop structure to reduce film swelling. The belt-loop prevents film swelling and allows the device to move without being restricted by obstacles above it. In addition, the device is designed to generate a force exceeding 1 N at a displacement of 0-10 mm. We used a mechanical model to analyze the relationship between the force at the device tip and the tensile force acting at the belt-loop position; this analysis facilitated the optimization of the belt-loop position.
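To give intuition for why the belt-loop position governs the tip force, the sketch below uses a back-of-envelope single-pivot moment balance. This is an assumed illustrative model, not the paper's mechanical analysis; the variable names and the lever geometry are hypothetical.

```python
# Back-of-envelope sketch (assumed single-pivot lever, NOT the authors'
# model): a wire tension f_wire acting at the belt-loop position
# (moment arm d_loop from the base) produces a tip force at lever
# length l_tip, so f_tip ~ f_wire * d_loop / l_tip. Moving the
# belt-loop outward raises the tip force for the same tension.
def tip_force(f_wire_n: float, d_loop_mm: float, l_tip_mm: float) -> float:
    return f_wire_n * d_loop_mm / l_tip_mm

# Under this toy model, at 40 N of tension, increasing the effective
# moment arm by ~2.4x would scale the tip force by the same factor,
# qualitatively matching the gain reported in the Results below.
```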
Results: A maximum force of 4.25 N was achieved at a tensile force of 40 N under the specified displacement conditions. Further, the device with the optimized belt-loop position achieved a maximum force of 10.3 N at a tensile force of 40 N, which was approximately 2.4 times higher than that of the device before optimization.
Conclusion: The belt-loop position of the wire-driven film-type tissue elimination device can be optimized to satisfy the required specifications. Furthermore, the evaluation results indicate that the device possesses sufficient performance for use in rectal cancer endoscopic surgery.
{"title":"Development of a wire-driven film-type device for tissue elimination in rectal cancer endoscopic surgery.","authors":"Masaaki Kuruma, Ryoto Fukunaka, Hiro Hasegawa, Masaaki Ito, Satoshi Konishi","doi":"10.1007/s11548-026-03570-x","DOIUrl":"https://doi.org/10.1007/s11548-026-03570-x","url":null,"abstract":"<p><strong>Purpose: </strong>In rectal cancer endoscopic surgery, the tissue hangs down and obstructs the surgical field when using a scalpel to make an incision to expose the tumor. Thus, there is a need for a tissue elimination device because the surgical field can be secured by lifting the hanging tissue during this surgery.</p><p><strong>Methods: </strong>We developed a wire-driven film-type device for tissue elimination that can be driven even in limited spaces such as the spatial constraints in the rectum. The device is composed of a polyethylene terephthalate film and stainless steel with a belt-loop structure to reduce film swelling. The belt-loop prevents film swelling and facilitates device movement without restriction of the obstacles above it. In addition, the device generates a force exceeding 1 N at a displacement of 0-10 mm. We used a mechanical model to analyze the relationship between the force at the device tip and the tensile force acting at the belt-loop position. This analysis facilitated the optimization of the belt-loop position.</p><p><strong>Results: </strong>A maximum force of 4.25 N was achieved at a tensile force of 40 N under the specified displacement conditions. Further, the device with the optimized belt-loop position achieved a maximum force of 10.3 N at a tensile force of 40 N, which was approximately 2.4 times higher than that of the device before optimization.</p><p><strong>Conclusion: </strong>The wire-driven film-type device for tissue elimination can optimize the belt-loop position to satisfy the required specifications. Furthermore, evaluation results of the devices indicate it possesses sufficient performance for use in rectal cancer endoscopic surgery.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2026-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146203739","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A hybrid self-supervised teacher-student model for predicting neurovascular bundle preservation in prostatectomy videos.
Pub Date: 2026-02-12 | DOI: 10.1007/s11548-025-03544-5
Diego Andrés Del Aguila Moraga, Huitao Wang, Satoshi Ando, Hayato Hoshina, Hiroshi Kawahira, Yukihiro Nomura, Toshiya Nakaguchi
Purpose: Preserving neurovascular bundles (NVB) during robot-assisted radical prostatectomy (RARP) is vital for reducing postoperative complications such as urinary incontinence and erectile dysfunction. Building on our previous work in ensemble-based NVB classification, we propose the hybrid self-supervised teacher-student model (Hybrid T-S model) that leverages multi-task learning to predict NVB preservation in prostatectomy videos.
Methods: Our approach integrates a self-supervised framework (DINO) as an online self-distillation objective on multi-crop views to learn robust embeddings in a limited-data setting, rather than as stand-alone large-scale pretraining. A teacher encoder, which is an exponential moving average (EMA) of the student encoder, and a reconstruction decoder are trained jointly with a classification head in a single end-to-end framework. This model is evaluated on single frames from patients who underwent RARP.
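Two mechanisms named above are compact enough to sketch: the EMA teacher update and the joint multi-task objective combining distillation, reconstruction, and classification. The snippet below is a hedged sketch with assumed hyperparameters and loss weights, not the authors' code.

```python
# Compact sketch (assumed hyperparameters, not the authors' code) of
# the EMA teacher update and the joint multi-task loss described above.
import torch

@torch.no_grad()
def ema_update(teacher, student, momentum=0.996):
    """teacher <- m * teacher + (1 - m) * student, parameter-wise."""
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(momentum).add_(s.detach(), alpha=1.0 - momentum)

def total_loss(l_distill, l_recon, l_cls, w=(1.0, 1.0, 1.0)):
    """Weighted sum of the three task losses; weights are assumptions."""
    return w[0] * l_distill + w[1] * l_recon + w[2] * l_cls
```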
Results: Our experimental evaluation shows that the Hybrid T-S model outperforms previous NVB classification methods. This highlights the benefits of integrating self-supervised learning and multi-task objectives in this surgical context. We achieved an average accuracy of 86.55%, precision of 83.93%, recall of 90.73%, F1-score of 87%, and AUROC of 88.35%, based on fivefold cross-validation.
Conclusion: Incorporating representation learning through self-distillation, classification, and reconstruction provides complementary signals that enhance the prediction of NVB preservation. Our Hybrid T-S model can assist surgeons in real-time decision-making and improve patient recovery.
{"title":"A hybrid self-supervised teacher-student model for predicting neurovascular bundle preservation in prostatectomy videos.","authors":"Diego Andrés Del Aguila Moraga, Huitao Wang, Satoshi Ando, Hayato Hoshina, Hiroshi Kawahira, Yukihiro Nomura, Toshiya Nakaguchi","doi":"10.1007/s11548-025-03544-5","DOIUrl":"https://doi.org/10.1007/s11548-025-03544-5","url":null,"abstract":"<p><strong>Purpose: </strong>Preserving neurovascular bundles (NVB) during robot-assisted radical prostatectomy (RARP) is vital for reducing postoperative complications such as urinary incontinence and erectile dysfunction. Building on our previous work in ensemble-based NVB classification, we propose the hybrid self-supervised teacher-student model (Hybrid T-S model) that leverages multi-task learning to predict NVB preservation in prostatectomy videos.</p><p><strong>Methods: </strong>Our approach integrates a self-supervised framework (DINO) as an online self-distillation objective on multi-crop views to learn robust embeddings in a limited data setting, rather than as a stand-alone large-scale pretraining. A teacher encoder, which is an exponential moving average (EMA) of the student encoder, and a reconstruction decoder are trained jointly with a classification head in a single end-to-end framework. This model is evaluated on single frames from patients who underwent RARP surgery.</p><p><strong>Results: </strong>Our experimental evaluation shows that the Hybrid T-S model outperforms previous NVB classification methods. This highlights the benefits of integrating self-supervised learning and multi-task objectives in this surgical context. We achieved an average accuracy of 86.55%, precision of 83.93%, recall of 90.73%, F1-score of 87%, and AUROC of 88.35%, based on fivefold cross-validation.</p><p><strong>Conclusion: </strong>Incorporating representation learning through self-distillation, classification, and reconstruction provides complementary signals that enhance the prediction of NVB preservation. Our Hybrid T-S model can assist surgeons in real decision-making and improve patient recovery.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2026-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146167980","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ProstNFound+: A Prospective Study using Medical Foundation Models for Prostate Cancer Detection.
Pub Date: 2026-02-12 | DOI: 10.1007/s11548-025-03561-4
Paul F R Wilson, Mohamed Harmanani, Minh Nguyen Nhat To, Amoon Jamzad, Tarek Elghareb, Zhuoxin Guo, Adam Kinnaird, Brian Wodlinger, Purang Abolmaesumi, Parvin Mousavi
Purpose: Medical foundation models (FMs) offer a path to build high-performance diagnostic systems. However, their application to prostate cancer (PCa) detection from micro-ultrasound (μUS) remains untested in clinical settings. We present ProstNFound+, an adaptation of FMs for PCa detection from μUS, along with its first prospective validation.
Methods: ProstNFound+ incorporates a medical FM, adapter tuning, and a custom prompt encoder that embeds PCa-specific clinical biomarkers. The model generates a cancer heatmap and a risk score for clinically significant PCa. Following training on multicenter retrospective data, the model is prospectively evaluated on data acquired five years later from a new clinical site. Model predictions are benchmarked against standard clinical scoring protocols (PRI-MUS and PI-RADS).
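A prompt encoder of the kind described above can be as simple as projecting each scalar biomarker into a token that rides alongside the image features. The sketch below is a hedged illustration; the sizes, biomarker choices, and names are assumptions, not ProstNFound+ internals.

```python
# Hedged sketch (names and sizes are assumptions, not ProstNFound+
# internals): embed scalar clinical biomarkers (e.g., PSA density,
# patient age) as prompt tokens alongside foundation-model image
# features, from which heatmap and risk-score heads are driven.
import torch
import torch.nn as nn

class BiomarkerPromptEncoder(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.proj = nn.Linear(1, dim)     # one token per scalar biomarker

    def forward(self, biomarkers):        # (B, n_biomarkers)
        return self.proj(biomarkers.unsqueeze(-1))  # (B, n_biomarkers, dim)

# The prompt tokens would then be concatenated with the image tokens
# before the decoder that produces the cancer heatmap and risk score.
```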
Results: ProstNFound+ shows strong generalization to the prospective data, with no performance degradation compared to retrospective evaluation. It aligns closely with clinical scores and produces interpretable heatmaps consistent with biopsy-confirmed lesions.
Conclusion: The results highlight its potential for clinical deployment, offering a scalable and interpretable alternative to expert-driven protocols.
{"title":"ProstNFound+: A Prospective Study using Medical Foundation Models for Prostate Cancer Detection.","authors":"Paul F R Wilson, Mohamed Harmanani, Minh Nguyen Nhat To, Amoon Jamzad, Tarek Elghareb, Zhuoxin Guo, Adam Kinnaird, Brian Wodlinger, Purang Abolmaesumi, Parvin Mousavi","doi":"10.1007/s11548-025-03561-4","DOIUrl":"https://doi.org/10.1007/s11548-025-03561-4","url":null,"abstract":"<p><strong>Purpose: </strong>Medical foundation models (FMs) offer a path to build high-performance diagnostic systems. However, their application to prostate cancer (PCa) detection from micro-ultrasound ( <math><mi>μ</mi></math> US) remains untested in clinical settings. We present ProstNFound+, an adaptation of FMs for PCa detection from <math><mi>μ</mi></math> US, along with its first prospective validation.</p><p><strong>Methods: </strong>ProstNFound+ incorporates a medical FM, adapter tuning, and a custom prompt encoder that embeds PCa-specific clinical biomarkers. The model generates a cancer heatmap and a risk score for clinically significant PCa. Following training on multicenter retrospective data, the model is prospectively evaluated on data acquired five years later from a new clinical site. Model predictions are benchmarked against standard clinical scoring protocols (PRI-MUS and PI-RADS).</p><p><strong>Results: </strong>ProstNFound+ shows strong generalization to the prospective data, with no performance degradation compared to retrospective evaluation. It aligns closely with clinical scores and produces interpretable heatmaps consistent with biopsy-confirmed lesions.</p><p><strong>Conclusion: </strong>The results highlight its potential for clinical deployment, offering a scalable and interpretable alternative to expert-driven protocols.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2026-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146168025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Purpose: Low-light endoscopic images often lack contrast and clarity, obscuring anatomical details and reducing diagnostic accuracy. This study develops a method to enhance image brightness and visibility, enabling clearer visualization of critical structures to support precise medical diagnoses and improve patient outcomes.
Methods: To specifically address nonuniform illumination, we propose BrightVAE, a model that uses a dual-receptive-field architecture to decouple global brightness correction from local texture preservation. Integrated attention-based modules (Attencoder and Attenquant) explicitly target and amplify underexposed regions while preventing over-saturation, thereby recovering human-evaluable details in shadowed areas. The model was trained and tested on a public endoscopic dataset, and its performance was evaluated against other techniques using quality metrics.
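The dual-receptive-field idea above can be illustrated with two parallel convolution branches, one dilated for global illumination context and one small-kernel for local texture, fused before any attention modules. This is an assumed design in the spirit of the description, not BrightVAE's exact layers.

```python
# Minimal sketch of a dual-receptive-field block (assumed design, not
# BrightVAE's exact layers): a dilated branch gathers global
# illumination context while a small-kernel branch preserves texture.
import torch
import torch.nn as nn

class DualReceptiveField(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        self.global_branch = nn.Conv2d(ch, ch, 3, padding=4, dilation=4)
        self.local_branch = nn.Conv2d(ch, ch, 3, padding=1)
        self.fuse = nn.Conv2d(2 * ch, ch, 1)

    def forward(self, x):
        g = self.global_branch(x)   # wide context for brightness correction
        l = self.local_branch(x)    # fine detail for texture preservation
        return self.fuse(torch.cat([g, l], dim=1))
```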
Results: The model outperformed alternatives, improving PSNR by 3.252 dB and SSIM by 0.045 and reducing LPIPS by 0.014 compared with the best previous model, achieving a PSNR of 30.576 dB, an SSIM of 0.879, and an LPIPS of 0.133, ensuring superior visibility of shadowed regions.
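For reference, the PSNR figures above are on the standard decibel scale,

\[ \mathrm{PSNR} = 10 \log_{10}\!\left(\frac{\mathrm{MAX}_I^{2}}{\mathrm{MSE}}\right), \]

where \(\mathrm{MAX}_I\) is the maximum pixel value (255 for 8-bit images) and \(\mathrm{MSE}\) is the mean squared error between the enhanced and reference images; higher PSNR and SSIM are better, while lower LPIPS indicates better perceptual quality.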
Conclusion: This approach advances endoscopic imaging by delivering sharper, reliable images, enhancing diagnostic precision in clinical practice. Improved visualization supports better detection of abnormalities, potentially leading to more effective treatment decisions and enhanced patient care.
{"title":"BrightVAE: luminosity enhancement in underexposed endoscopic images.","authors":"Farzaneh Koohestani, Zahra Nabizadeh, Nader Karimi, Shahram Shirani, Shadrokh Samavi","doi":"10.1007/s11548-026-03573-8","DOIUrl":"https://doi.org/10.1007/s11548-026-03573-8","url":null,"abstract":"<p><strong>Purpose: </strong>Low-light endoscopic images often lack contrast and clarity, obscuring anatomical details and reducing diagnostic accuracy. This study develops a method to enhance image brightness and visibility, enabling clearer visualization of critical structures to support precise medical diagnoses and improve patient outcomes.</p><p><strong>Methods: </strong>To specifically address nonuniform illumination, we propose BrightVAE, a model that uses a dual-receptive-field architecture to decouple global brightness correction from local texture preservation. Integrated attention-based modules (Attencoder and Attenquant) explicitly target and amplify underexposed regions while preventing over-saturation, thereby recovering human-evaluable details in shadowed areas. The model was trained and tested on a public endoscopic dataset, and its performance was evaluated against other techniques using quality metrics.</p><p><strong>Results: </strong>The model outperformed alternatives, improving PSNR by 3.252 units, structural detail by 0.045, and perceptual quality by 0.014 compared to the best model before us, achieving a PSNR of 30.576, SSIM of 0.879, and LPIPS of 0.133, ensuring superior visibility of shadowed regions.</p><p><strong>Conclusion: </strong>This approach advances endoscopic imaging by delivering sharper, reliable images, enhancing diagnostic precision in clinical practice. Improved visualization supports better detection of abnormalities, potentially leading to more effective treatment decisions and enhanced patient care.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2026-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146108457","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lexomics, or why to extract relevant information from radiology reports through LLMs.
Pub Date: 2026-02-01 | Epub Date: 2025-09-22 | DOI: 10.1007/s11548-025-03521-y
Teodoro Martín-Noguerol, Pilar López-Úbeda, Carolina Díaz-Angulo, Antonio Luna
Purpose: The application of large language models (LLMs) to radiology reports aims to enhance the extraction of meaningful textual data, improving clinical decision-making and patient management. Similar to radiomics in image analysis, lexomics seeks to reveal hidden patterns in radiology reports to support diagnosis, classification, and structured reporting.
Methods: LLMs and natural language processing (NLP) algorithms analyze radiology reports to extract relevant information, refine differential diagnoses, and integrate clinical data. These models process structured and unstructured text, identifying patterns and correlations that may otherwise go unnoticed. Applications include automated structured reporting, quality control, and enhanced communication of incidental and urgent findings.
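The extraction pattern described above typically amounts to prompting an LLM to return structured findings against a fixed schema. The sketch below is purely illustrative: the schema fields are assumptions, and `call_llm` is a hypothetical stand-in for whatever model endpoint is used.

```python
# Illustrative sketch of LLM-based structured extraction from a
# free-text radiology report. Schema fields are assumptions; call_llm
# is a hypothetical placeholder, not a real library API.
import json

SCHEMA = {"findings": [{"anatomy": "", "observation": "", "urgent": False}]}

def call_llm(prompt: str) -> str:
    """Placeholder: route to whichever LLM endpoint is available."""
    raise NotImplementedError

def extract_findings(report_text: str) -> dict:
    prompt = (
        "Extract all radiological findings from the report below as JSON "
        f"matching this schema: {json.dumps(SCHEMA)}\n\nReport:\n{report_text}"
    )
    return json.loads(call_llm(prompt))
```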
Results: LLMs have demonstrated the ability to assist radiologists in real time, standardizing classifications, improving report clarity, and enhancing the integration of radiology reports into electronic health records (EHRs). They support radiologists by reducing redundancies, structuring free-text reports, and detecting clinically relevant insights. Unlike radiomics, lexomics requires minimal computational power, making it more accessible in clinical settings.
Conclusion: Lexomics represents a significant advancement in AI-driven radiology, optimizing report utilization and communication. Future research should focus on addressing challenges such as data privacy, bias mitigation, and validation in diverse clinical scenarios to ensure ethical and effective implementation in radiological practice.
{"title":"Lexomics, or why to extract relevant information from radiology reports through LLMs.","authors":"Teodoro Martín-Noguerol, Pilar López-Úbeda, Carolina Díaz-Angulo, Antonio Luna","doi":"10.1007/s11548-025-03521-y","DOIUrl":"10.1007/s11548-025-03521-y","url":null,"abstract":"<p><strong>Purpose: </strong>The application of large language models (LLMs) to radiology reports aims to enhance the extraction of meaningful textual data, improving clinical decision-making and patient management. Similar to radiomics in image analysis, lexomics seeks to reveal hidden patterns in radiology reports to support diagnosis, classification, and structured reporting.</p><p><strong>Methods: </strong>LLMs and natural language processing (NLP) algorithms analyze radiology reports to extract relevant information, refine differential diagnoses, and integrate clinical data. These models process structured and unstructured text, identifying patterns and correlations that may otherwise go unnoticed. Applications include automated structured reporting, quality control, and enhanced communication of incidental and urgent findings.</p><p><strong>Results: </strong>LLMs have demonstrated the ability to assist radiologists in real-time, standardizing classifications, improving report clarity, and enhancing the integration of radiology reports into electronic health records (EHRs). They support radiologists by reducing redundancies, structuring free-text reports, and detecting clinically relevant insights. Unlike radiomics, lexomics requires minimal computational power, making it more accessible in clinical settings.</p><p><strong>Conclusion: </strong>Lexomics represents a significant advancement in AI-driven radiology, optimizing report utilization and communication. Future research should focus on addressing challenges such as data privacy, bias mitigation, and validation in diverse clinical scenarios to ensure ethical and effective implementation in radiological practice.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":"223-225"},"PeriodicalIF":2.3,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145114545","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In-depth characterization of a laparoscopic radical prostatectomy procedure based on surgical process modeling.
Pub Date: 2026-02-01 | Epub Date: 2025-12-03 | DOI: 10.1007/s11548-025-03552-5
Nuno S Rodrigues, Pedro Morais, Lukas R Buschle, Estevão Lima, João L Vilaça
Purpose: Minimally invasive surgical approaches are currently the standard of care for men with prostate cancer, presenting higher rates of erectile function preservation. With these laparoscopic techniques, there is an increasing amount of data and information available. Adaptive systems can play an important role, acting as an intelligent information filter and ensuring that the available information supports the procedure rather than overwhelming the surgeon. Standardizing and structuring the surgical workflow are key requirements for such smart assistants to recognize the different surgical steps through contextual information about the environment. This work aims to provide a detailed characterization of the laparoscopic radical prostatectomy procedure, focusing on the formalization of medical expert knowledge via surgical process modeling.
Methods: Data were acquired manually, via online and offline observation and discussion with medical experts. A total of 14 procedures were observed, covering both manual laparoscopic radical prostatectomy and robot-assisted laparoscopic prostatectomy. The derived surgical process model (SPM) focuses only on the intraoperative part of the procedure, with constant feedback from the endoscopic camera. A dedicated Excel template was developed for surgery observation.
Results: The final model is represented in a descriptive and numerical format, combining task descriptions with a workflow diagram for ease of interpretation. Practical applications of the generated surgical process model are exemplified by the creation of activation trees for surgical phase identification, as sketched below. Anatomical structures are reported for each phase, distinguishing between visible and inferable ones; the surgeons involved, the surgical instruments used, and the actions performed in each phase are also identified. A total of 11 phases were identified and characterized. The average surgery duration was 87 min.
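One way to encode such an activation tree is as nodes that declare a phase active when all of its listed cues (visible structures, instruments in use) are observed in the current context. The sketch below is a hypothetical encoding for illustration, not the paper's exact trees; the example phase and cue names are assumptions.

```python
# Illustrative sketch (hypothetical encoding, not the paper's trees):
# an activation node fires its surgical phase when the cues listed for
# it - structures and instruments - are all observed in context.
from dataclasses import dataclass, field

@dataclass
class PhaseNode:
    name: str
    required_structures: set = field(default_factory=set)
    required_instruments: set = field(default_factory=set)

    def active(self, seen_structures: set, seen_instruments: set) -> bool:
        return (self.required_structures <= seen_structures
                and self.required_instruments <= seen_instruments)

# Hypothetical example node:
dvc_ligation = PhaseNode("dorsal vein complex ligation",
                         {"dorsal vein complex"}, {"needle driver"})
```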
Conclusion: The generated surgical process model is a first step toward the development of a context-aware surgical assistant and can potentially be used as a roadmap by other research teams, operating room managers and surgical teams.
{"title":"In-depth characterization of a laparoscopic radical prostatectomy procedure based on surgical process modeling.","authors":"Nuno S Rodrigues, Pedro Morais, Lukas R Buschle, Estevão Lima, João L Vilaça","doi":"10.1007/s11548-025-03552-5","DOIUrl":"10.1007/s11548-025-03552-5","url":null,"abstract":"<p><strong>Purpose: </strong>Minimally invasive surgical approaches are currently the standard of care for men with prostate cancer, presenting higher rates of erectile function preservation. With these laparoscopic techniques, there is an increasing amount of data and information available. Adaptive systems can play an important role, acting as an intelligent information filter, assuring that all the available information can become useful for the procedure and not overwhelming for the surgeon. Standardizing and structuring the surgical workflow are key requirements for such smart assistants to recognize the different surgical steps through context information about the environment. This work aims to do a detailed characterization of a laparoscopic radical prostatectomy procedure, focusing on the formalization of medical expert knowledge, via surgical process modeling.</p><p><strong>Methods: </strong>Data were acquired manually, via online and offline observation, and discussion with medical experts. A total of 14 procedures were observed. Both manual laparoscopic radical prostatectomy and robot-assisted laparoscopic prostatectomy were studied. The derived SPM focuses only on the intraoperatory part of the procedure, with constant feedback from the endoscopic camera. For surgery observation, a dedicated Excel template was developed.</p><p><strong>Results: </strong>The final model is represented in a descriptive and numerical format, combining task description with a workflow diagram arrangement for ease of interpretation. Practical applications of the generated surgical process model are exemplified with the creation of activation trees for surgical phase identification. Anatomical structures are reported for each phase, distinguishing between visible and inferable ones. Additionally, the surgeons involved are identified, surgical instruments, and actions performed in each phase. A total of 11 phases were identified and characterized. Average surgery duration is 87 min.</p><p><strong>Conclusion: </strong>The generated surgical process model is a first step toward the development of a context-aware surgical assistant and can potentially be used as a roadmap by other research teams, operating room managers and surgical teams.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":"279-289"},"PeriodicalIF":2.3,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145670873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}